Commit 3dd7599

ENH: Add ability to use special characters for column names in query function.
Clean up the code that handles spaces in column names in the query function, and extend it to also support special characters that are not allowed in Python identifiers. All code related to this functionality now lives in pandas/core/computation/parsing.py.
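
To illustrate which column names are affected, here is a minimal standard-library sketch of a check for whether a name can be written bare in an expression. The helper `needs_backticks` is hypothetical and not part of pandas; it assumes that names which are not valid Python identifiers (or are keywords) must be quoted, which is how `query` treats backtick-quoted names. The real logic lives in pandas/core/computation/parsing.py and may differ.

```python
import keyword


def needs_backticks(column_name: str) -> bool:
    """Return True if the name cannot be written as a bare Python identifier.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # str.isidentifier() rejects spaces, punctuation, and leading digits.
    # Keywords such as "class" pass isidentifier() but cannot be used as
    # names, so they are treated as needing quotes as well.
    return not column_name.isidentifier() or keyword.iskeyword(column_name)


for name in ["price", "unit price", "price ($)", "2nd_col", "class"]:
    print(f"{name!r}: needs backticks = {needs_backticks(name)}")
```

Under this assumption, a column such as `unit price` or `price ($)` would be wrapped in backticks inside the query string, while `price` could be used as is.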
1 parent 76ddd78 commit 3dd7599

229 files changed (+2369 -7855 lines changed)

.travis.yml

+19 -16
@@ -30,31 +30,34 @@ matrix:
  - python: 3.5

  include:
- - dist: trusty
- env:
+ - env:
  - JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network)"

- - dist: trusty
- env:
+ - env:
  - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network)"

- - dist: trusty
- env:
- - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8"
+ - env:
+ - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
+ services:
+ - mysql
+ - postgresql

- - dist: trusty
- env:
- - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true
+ - env:
+ - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true SQL="1"
+ services:
+ - mysql
+ - postgresql

  # In allow_failures
- - dist: trusty
- env:
- - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow"
+ - env:
+ - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"
+ services:
+ - mysql
+ - postgresql

  allow_failures:
- - dist: trusty
- env:
- - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow"
+ - env:
+ - JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"

  before_install:
  - echo "before_install"

LICENSES/MSGPACK_LICENSE

-13
This file was deleted.

LICENSES/MSGPACK_NUMPY_LICENSE

-33
This file was deleted.

MANIFEST.in

-1
@@ -20,7 +20,6 @@ global-exclude *.gz
  global-exclude *.h5
  global-exclude *.html
  global-exclude *.json
- global-exclude *.msgpack
  global-exclude *.pickle
  global-exclude *.png
  global-exclude *.pyc

README.md

+1 -1
@@ -124,7 +124,7 @@ Here are just a few of the things that pandas does well:
  and saving/loading data from the ultrafast [**HDF5 format**][hdfstore]
  - [**Time series**][timeseries]-specific functionality: date range
  generation and frequency conversion, moving window statistics,
- moving window linear regressions, date shifting and lagging, etc.
+ date shifting and lagging.


  [missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data

asv_bench/benchmarks/io/msgpack.py

-32
This file was deleted.

asv_bench/benchmarks/io/sas.py

+1 -1
@@ -26,5 +26,5 @@ def setup(self, format):
  ]
  self.f = os.path.join(*paths)

- def time_read_msgpack(self, format):
+ def time_read_sas(self, format):
  read_sas(self.f, format=format)

ci/azure/posix.yml

+7 -9
@@ -44,15 +44,13 @@ jobs:
  PATTERN: "not slow and not network"
  LOCALE_OVERRIDE: "zh_CN.UTF-8"

- # Disabled for NumPy object-dtype warning.
- # https://github.com/pandas-dev/pandas/issues/30043
- # py37_np_dev:
- # ENV_FILE: ci/deps/azure-37-numpydev.yaml
- # CONDA_PY: "37"
- # PATTERN: "not slow and not network"
- # TEST_ARGS: "-W error"
- # PANDAS_TESTING_MODE: "deprecate"
- # EXTRA_APT: "xsel"
+ py37_np_dev:
+ ENV_FILE: ci/deps/azure-37-numpydev.yaml
+ CONDA_PY: "37"
+ PATTERN: "not slow and not network"
+ TEST_ARGS: "-W error"
+ PANDAS_TESTING_MODE: "deprecate"
+ EXTRA_APT: "xsel"

  steps:
  - script: |

ci/code_checks.sh

+3 -3
@@ -39,7 +39,7 @@ function invgrep {
  }

  if [[ "$GITHUB_ACTIONS" == "true" ]]; then
- FLAKE8_FORMAT="##[error]%(path)s:%(row)s:%(col)s:%(code):%(text)s"
+ FLAKE8_FORMAT="##[error]%(path)s:%(row)s:%(col)s:%(code)s:%(text)s"
  INVGREP_PREPEND="##[error]"
  else
  FLAKE8_FORMAT="default"
@@ -94,10 +94,10 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then

  # We don't lint all C files because we don't want to lint any that are built
  # from Cython files nor do we want to lint C files that we didn't modify for
- # this particular codebase (e.g. src/headers, src/klib, src/msgpack). However,
+ # this particular codebase (e.g. src/headers, src/klib). However,
  # we can lint all header files since they aren't "generated" like C files are.
  MSG='Linting .c and .h' ; echo $MSG
- cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime pandas/io/msgpack pandas/_libs/*.cpp pandas/util
+ cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime pandas/_libs/*.cpp
  RET=$(($RET + $?)) ; echo $MSG "DONE"

  echo "isort --version-number"
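
The `FLAKE8_FORMAT` change in ci/code_checks.sh swaps `%(code):` for `%(code)s:`. Assuming the format string is applied with Python's `%` operator against a mapping of fields (an assumption about how flake8 handles a custom format), the missing conversion character makes the template invalid. A minimal sketch with a made-up violation record:

```python
# Made-up violation record, for illustration only.
record = {
    "path": "pandas/core/frame.py",
    "row": 10,
    "col": 5,
    "code": "E501",
    "text": "line too long",
}

fixed = "##[error]%(path)s:%(row)s:%(col)s:%(code)s:%(text)s"
broken = "##[error]%(path)s:%(row)s:%(col)s:%(code):%(text)s"

print(fixed % record)

try:
    print(broken % record)
except ValueError as exc:
    # "%(code):" has no conversion type, so ":" is read as one and rejected.
    print("broken format rejected:", exc)
```

The corrected template renders every field, while the old one fails as soon as it is formatted.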

ci/setup_env.sh

+3 -2
@@ -121,7 +121,7 @@ conda list pandas
  # Make sure any error below is reported as such

  echo "[Build extensions]"
- python setup.py build_ext -q -i -j4
+ python setup.py build_ext -q -i

  # XXX: Some of our environments end up with old versions of pip (10.x)
  # Adding a new enough version of pip to the requirements explodes the
@@ -140,7 +140,8 @@ echo "conda list"
  conda list

  # Install DB for Linux
- if [ "${TRAVIS_OS_NAME}" == "linux" ]; then
+
+ if [[ -n ${SQL:0} ]]; then
  echo "installing dbs"
  mysql -e 'create database pandas_nosetest;'
  psql -c 'create database pandas_nosetest;' -U postgres
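
The new guard `[[ -n ${SQL:0} ]]` in ci/setup_env.sh creates the databases only when the `SQL` variable is set and non-empty, which matches the `SQL="1"` entries added to the Travis jobs that declare the `mysql` and `postgresql` services. The same decision can be sketched in Python; `should_install_dbs` is a hypothetical name used only for illustration, and the dictionaries stand in for the job environment.

```python
def should_install_dbs(env: dict) -> bool:
    """Mirror `[[ -n ${SQL:0} ]]`: true only if SQL is set and non-empty."""
    # ${SQL:0} expands to the whole value of SQL (substring from offset 0),
    # and -n tests that the result is a non-empty string.
    return bool(env.get("SQL", ""))


jobs = {
    "3.8": {"JOB": "3.8"},                       # no SQL variable
    "3.6, locale": {"JOB": "3.6, locale", "SQL": "1"},
    "empty SQL": {"SQL": ""},                    # set but empty
}

for name, env in jobs.items():
    print(f"{name}: install dbs = {should_install_dbs(env)}")
```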

doc/redirects.csv

+1 -4
@@ -491,7 +491,6 @@ generated/pandas.DataFrame.to_hdf,../reference/api/pandas.DataFrame.to_hdf
  generated/pandas.DataFrame.to,../reference/api/pandas.DataFrame.to
  generated/pandas.DataFrame.to_json,../reference/api/pandas.DataFrame.to_json
  generated/pandas.DataFrame.to_latex,../reference/api/pandas.DataFrame.to_latex
- generated/pandas.DataFrame.to_msgpack,../reference/api/pandas.DataFrame.to_msgpack
  generated/pandas.DataFrame.to_numpy,../reference/api/pandas.DataFrame.to_numpy
  generated/pandas.DataFrame.to_panel,../reference/api/pandas.DataFrame.to_panel
  generated/pandas.DataFrame.to_parquet,../reference/api/pandas.DataFrame.to_parquet
@@ -778,7 +777,7 @@ generated/pandas.io.formats.style.Styler.to_excel,../reference/api/pandas.io.for
  generated/pandas.io.formats.style.Styler.use,../reference/api/pandas.io.formats.style.Styler.use
  generated/pandas.io.formats.style.Styler.where,../reference/api/pandas.io.formats.style.Styler.where
  generated/pandas.io.json.build_table_schema,../reference/api/pandas.io.json.build_table_schema
- generated/pandas.io.json.json_normalize,../reference/api/pandas.io.json.json_normalize
+ generated/pandas.io.json.json_normalize,../reference/api/pandas.json_normalize
  generated/pandas.io.stata.StataReader.data_label,../reference/api/pandas.io.stata.StataReader.data_label
  generated/pandas.io.stata.StataReader.value_labels,../reference/api/pandas.io.stata.StataReader.value_labels
  generated/pandas.io.stata.StataReader.variable_labels,../reference/api/pandas.io.stata.StataReader.variable_labels
@@ -889,7 +888,6 @@ generated/pandas.read_gbq,../reference/api/pandas.read_gbq
  generated/pandas.read_hdf,../reference/api/pandas.read_hdf
  generated/pandas.read,../reference/api/pandas.read
  generated/pandas.read_json,../reference/api/pandas.read_json
- generated/pandas.read_msgpack,../reference/api/pandas.read_msgpack
  generated/pandas.read_parquet,../reference/api/pandas.read_parquet
  generated/pandas.read_pickle,../reference/api/pandas.read_pickle
  generated/pandas.read_sas,../reference/api/pandas.read_sas
@@ -1230,7 +1228,6 @@ generated/pandas.Series.to_json,../reference/api/pandas.Series.to_json
  generated/pandas.Series.to_latex,../reference/api/pandas.Series.to_latex
  generated/pandas.Series.to_list,../reference/api/pandas.Series.to_list
  generated/pandas.Series.tolist,../reference/api/pandas.Series.tolist
- generated/pandas.Series.to_msgpack,../reference/api/pandas.Series.to_msgpack
  generated/pandas.Series.to_numpy,../reference/api/pandas.Series.to_numpy
  generated/pandas.Series.to_period,../reference/api/pandas.Series.to_period
  generated/pandas.Series.to_pickle,../reference/api/pandas.Series.to_pickle

doc/source/development/developer.rst

-1
@@ -125,7 +125,6 @@ The ``metadata`` field is ``None`` except for:
  in ``BYTE_ARRAY`` Parquet columns. The encoding can be one of:

  * ``'pickle'``
- * ``'msgpack'``
  * ``'bson'``
  * ``'json'``

doc/source/getting_started/install.rst

+2 -2
@@ -249,7 +249,7 @@ PyTables 3.4.2 HDF5-based reading / writing
  SQLAlchemy 1.1.4 SQL support for databases other than sqlite
  SciPy 0.19.0 Miscellaneous statistical functions
  XLsxWriter 0.9.8 Excel writing
- blosc Compression for msgpack
+ blosc Compression for HDF5
  fastparquet 0.3.2 Parquet reading / writing
  gcsfs 0.2.2 Google Cloud Storage access
  html5lib HTML parser for read_html (see :ref:`note <optional_html>`)
@@ -269,7 +269,7 @@ xclip Clipboard I/O on linux
  xlrd 1.1.0 Excel reading
  xlwt 1.2.0 Excel writing
  xsel Clipboard I/O on linux
- zlib Compression for msgpack
+ zlib Compression for HDF5
  ========================= ================== =============================================================

  .. _optional_html:

doc/source/getting_started/overview.rst

+1 -2
@@ -57,8 +57,7 @@ Here are just a few of the things that pandas does well:
  Excel files, databases, and saving / loading data from the ultrafast **HDF5
  format**
  - **Time series**-specific functionality: date range generation and frequency
- conversion, moving window statistics, moving window linear regressions,
- date shifting and lagging, etc.
+ conversion, moving window statistics, date shifting and lagging.

  Many of these principles are here to address the shortcomings frequently
  experienced using other languages / scientific research environments. For data

doc/source/reference/frame.rst

-1
@@ -357,7 +357,6 @@ Serialization / IO / conversion
  DataFrame.to_feather
  DataFrame.to_latex
  DataFrame.to_stata
- DataFrame.to_msgpack
  DataFrame.to_gbq
  DataFrame.to_records
  DataFrame.to_string

doc/source/reference/indexing.rst

+1
@@ -313,6 +313,7 @@ MultiIndex selecting
  :toctree: api/

  MultiIndex.get_loc
+ MultiIndex.get_locs
  MultiIndex.get_loc_level
  MultiIndex.get_indexer
  MultiIndex.get_level_values

doc/source/reference/io.rst

+1 -2
@@ -22,7 +22,6 @@ Flat file
  read_table
  read_csv
  read_fwf
- read_msgpack

  Clipboard
  ~~~~~~~~~
@@ -51,13 +50,13 @@ JSON
  :toctree: api/

  read_json
+ json_normalize

  .. currentmodule:: pandas.io.json

  .. autosummary::
  :toctree: api/

- json_normalize
  build_table_schema

  .. currentmodule:: pandas

doc/source/reference/series.rst

-1
@@ -574,7 +574,6 @@ Serialization / IO / conversion
  Series.to_xarray
  Series.to_hdf
  Series.to_sql
- Series.to_msgpack
  Series.to_json
  Series.to_string
  Series.to_clipboard

doc/source/user_guide/cookbook.rst

+1 -1
@@ -1229,7 +1229,7 @@ in the frame:
  The offsets of the structure elements may be different depending on the
  architecture of the machine on which the file was created. Using a raw
  binary file format like this for general data storage is not recommended, as
- it is not cross platform. We recommended either HDF5 or msgpack, both of
+ it is not cross platform. We recommended either HDF5 or parquet, both of
  which are supported by pandas' IO facilities.

  Computation
