Commit c68f5fb

Merge remote-tracking branch 'upstream/master' into bisect
2 parents da614e9 + c135afe commit c68f5fb

170 files changed, +2709 -1524 lines changed


.github/CODE_OF_CONDUCT.md

-1
Original file line numberDiff line numberDiff line change
@@ -60,4 +60,3 @@ and the [Swift Code of Conduct][swift].
6060
[homepage]: https://www.contributor-covenant.org
6161
[version]: https://www.contributor-covenant.org/version/1/3/0/
6262
[swift]: https://swift.org/community/#code-of-conduct
63-

.pre-commit-config.yaml

+10-2
@@ -21,10 +21,12 @@ repos:
2121
- file
2222
args: [--append-config=flake8/cython-template.cfg]
2323
- repo: https://github.com/PyCQA/isort
24-
rev: 5.2.2
24+
rev: 5.6.0
2525
hooks:
2626
- id: isort
2727
exclude: ^pandas/__init__\.py$|^pandas/core/api\.py$
28+
files: '.pxd$|.py$'
29+
types: [file]
2830
- repo: https://github.com/asottile/pyupgrade
2931
rev: v2.7.2
3032
hooks:
@@ -39,11 +41,17 @@ repos:
3941
- id: pip_to_conda
4042
name: Generate pip dependency from conda
4143
description: This hook checks if the conda environment.yml and requirements-dev.txt are equal
42-
language: system
44+
language: python
4345
entry: python -m scripts.generate_pip_deps_from_conda
4446
files: ^(environment.yml|requirements-dev.txt)$
4547
pass_filenames: false
48+
additional_dependencies: [pyyaml]
4649
- repo: https://github.com/asottile/yesqa
4750
rev: v1.2.2
4851
hooks:
4952
- id: yesqa
53+
- repo: https://github.com/pre-commit/pre-commit-hooks
54+
rev: v3.2.0
55+
hooks:
56+
- id: end-of-file-fixer
57+
exclude: '.html$|^LICENSES/|.csv$|.txt$|.svg$|.py$'
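The `files` and `exclude` values added in this hunk are Python regular expressions, which pre-commit matches against each path with `re.search`. The following is a minimal sketch of how the unescaped dot in a pattern such as `'.pxd$|.py$'` behaves. The sample paths are hypothetical and are not taken from the repository.

```python
import re

# Pattern copied from the isort hook's new `files:` key in this diff.
FILES_PATTERN = re.compile(r".pxd$|.py$")

# Hypothetical paths, for illustration only.
samples = [
    "pandas/core/frame.py",
    "pandas/_libs/lib.pxd",
    "pandas/_libs/lib.pyx",
    "notes/happy",
]


def is_selected(path):
    """Return True if the pattern matches anywhere in the path (re.search)."""
    return FILES_PATTERN.search(path) is not None


# An unescaped "." matches any character, so "notes/happy" is selected too.
results = {path: is_selected(path) for path in samples}
for path, selected in results.items():
    print(path, selected)
```

Escaping the dot (`\.py$`) would restrict the match to the literal extension. Whether the difference matters in practice depends on the file names actually present in the repository.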

.travis.yml

+4-4
@@ -41,10 +41,10 @@ matrix:
4141
- JOB="3.9-dev" PATTERN="(not slow and not network and not clipboard)"
4242

4343
- env:
44-
- JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network and not clipboard)"
45-
46-
- env:
47-
- JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network and not clipboard)"
44+
- JOB="3.8, slow" ENV_FILE="ci/deps/travis-38-slow.yaml" PATTERN="slow" SQL="1"
45+
services:
46+
- mysql
47+
- postgresql
4848

4949
- env:
5050
- JOB="3.7, locale" ENV_FILE="ci/deps/travis-37-locale.yaml" PATTERN="((not slow and not network and not clipboard) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"

AUTHORS.md

-1
@@ -54,4 +54,3 @@ pandas is distributed under a 3-clause ("Simplified" or "New") BSD
5454
license. Parts of NumPy, SciPy, numpydoc, bottleneck, which all have
5555
BSD-compatible licenses, are included. Their licenses follow the pandas
5656
license.
57-

ci/azure/posix.yml

+14-18
@@ -20,39 +20,35 @@ jobs:
2020
CONDA_PY: "37"
2121
PATTERN: "not slow and not network and not clipboard"
2222

23+
py37:
24+
ENV_FILE: ci/deps/azure-37.yaml
25+
CONDA_PY: "37"
26+
PATTERN: "not slow and not network and not clipboard"
27+
2328
py37_locale_slow:
2429
ENV_FILE: ci/deps/azure-37-locale_slow.yaml
2530
CONDA_PY: "37"
2631
PATTERN: "slow"
27-
# pandas does not use the language (zh_CN), but should support different encodings (utf8)
28-
# we should test with encodings different than utf8, but doesn't seem like Ubuntu supports any
29-
LANG: "zh_CN.utf8"
30-
LC_ALL: "zh_CN.utf8"
31-
EXTRA_APT: "language-pack-zh-hans"
32+
LANG: "it_IT.utf8"
33+
LC_ALL: "it_IT.utf8"
34+
EXTRA_APT: "language-pack-it xsel"
3235

3336
py37_slow:
3437
ENV_FILE: ci/deps/azure-37-slow.yaml
3538
CONDA_PY: "37"
3639
PATTERN: "slow"
3740

38-
py37_locale:
39-
ENV_FILE: ci/deps/azure-37-locale.yaml
40-
CONDA_PY: "37"
41-
PATTERN: "not slow and not network"
42-
LANG: "it_IT.utf8"
43-
LC_ALL: "it_IT.utf8"
44-
EXTRA_APT: "language-pack-it xsel"
45-
46-
# py37_32bit:
47-
# ENV_FILE: ci/deps/azure-37-32bit.yaml
48-
# CONDA_PY: "37"
49-
# PATTERN: "not slow and not network and not clipboard"
50-
# BITS32: "yes"
41+
py38:
42+
ENV_FILE: ci/deps/azure-38.yaml
43+
CONDA_PY: "38"
44+
PATTERN: "not slow and not network and not clipboard"
5145

5246
py38_locale:
5347
ENV_FILE: ci/deps/azure-38-locale.yaml
5448
CONDA_PY: "38"
5549
PATTERN: "not slow and not network"
50+
# pandas does not use the language (zh_CN), but should support different encodings (utf8)
51+
# we should test with encodings different than utf8, but doesn't seem like Ubuntu supports any
5652
LANG: "zh_CN.utf8"
5753
LC_ALL: "zh_CN.utf8"
5854
EXTRA_APT: "language-pack-zh-hans xsel"

ci/deps/azure-37-32bit.yaml

-26
This file was deleted.

ci/deps/azure-37-slow.yaml

+1
@@ -10,6 +10,7 @@ dependencies:
1010
- pytest>=5.0.1
1111
- pytest-xdist>=1.21
1212
- hypothesis>=3.58.0
13+
- pytest-azurepipelines
1314

1415
# pandas dependencies
1516
- beautifulsoup4

ci/deps/travis-37.yaml renamed to ci/deps/azure-37.yaml

+1
@@ -10,6 +10,7 @@ dependencies:
1010
- pytest>=5.0.1
1111
- pytest-xdist>=1.21
1212
- hypothesis>=3.58.0
13+
- pytest-azurepipelines
1314

1415
# pandas dependencies
1516
- botocore>=1.11

ci/deps/travis-38.yaml renamed to ci/deps/azure-38.yaml

+1-1
@@ -10,11 +10,11 @@ dependencies:
1010
- pytest>=5.0.1
1111
- pytest-xdist>=1.21
1212
- hypothesis>=3.58.0
13+
- pytest-azurepipelines
1314

1415
# pandas dependencies
1516
- numpy
1617
- python-dateutil
1718
- nomkl
1819
- pytz
19-
- pip
2020
- tabulate==0.8.3

ci/deps/travis-37-locale.yaml

+14-8
@@ -11,7 +11,12 @@ dependencies:
1111
- pytest-xdist>=1.21
1212
- hypothesis>=3.58.0
1313

14-
# pandas dependencies
14+
# required
15+
- numpy
16+
- python-dateutil
17+
- pytz
18+
19+
# optional
1520
- beautifulsoup4
1621
- blosc=1.15.0
1722
- python-blosc
@@ -20,22 +25,23 @@ dependencies:
2025
- ipython
2126
- jinja2
2227
- lxml=4.3.0
23-
- matplotlib=3.0.*
28+
- matplotlib
2429
- nomkl
2530
- numexpr
26-
- numpy
2731
- openpyxl
2832
- pandas-gbq
2933
- google-cloud-bigquery>=1.27.2 # GH 36436
3034
- pyarrow>=0.17
31-
- psycopg2=2.7
32-
- pymysql=0.7.11
3335
- pytables>=3.5.1
34-
- python-dateutil
35-
- pytz
3636
- scipy
37-
- sqlalchemy=1.3.0
3837
- xarray=0.12.0
3938
- xlrd
4039
- xlsxwriter
4140
- xlwt
41+
- moto
42+
- flask
43+
44+
# sql
45+
- psycopg2=2.7
46+
- pymysql=0.7.11
47+
- sqlalchemy=1.3.0

ci/deps/azure-37-locale.yaml renamed to ci/deps/travis-38-slow.yaml

+11-11
@@ -3,35 +3,35 @@ channels:
33
- defaults
44
- conda-forge
55
dependencies:
6-
- python=3.7.*
6+
- python=3.8.*
77

88
# tools
99
- cython>=0.29.21
1010
- pytest>=5.0.1
1111
- pytest-xdist>=1.21
12-
- pytest-asyncio
1312
- hypothesis>=3.58.0
14-
- pytest-azurepipelines
1513

1614
# pandas dependencies
1715
- beautifulsoup4
16+
- fsspec>=0.7.4
1817
- html5lib
19-
- ipython
20-
- jinja2
2118
- lxml
22-
- matplotlib>=3.3.0
23-
- moto
24-
- flask
25-
- nomkl
19+
- matplotlib
2620
- numexpr
27-
- numpy=1.16.*
21+
- numpy
2822
- openpyxl
23+
- patsy
24+
- psycopg2
25+
- pymysql
2926
- pytables
3027
- python-dateutil
3128
- pytz
29+
- s3fs>=0.4.0
30+
- moto>=1.3.14
3231
- scipy
33-
- xarray
32+
- sqlalchemy
3433
- xlrd
3534
- xlsxwriter
3635
- xlwt
3736
- moto
37+
- flask

ci/travis_process_gbq_encryption.sh

-1
@@ -10,4 +10,3 @@ elif [[ -n ${!TRAVIS_IV_ENV} ]]; then
1010
export GBQ_PROJECT_ID='pandas-gbq-tests';
1111
echo 'Successfully decrypted gbq credentials'
1212
fi
13-

doc/data/iris.data

+1-1
@@ -148,4 +148,4 @@ SepalLength,SepalWidth,PetalLength,PetalWidth,Name
148148
6.3,2.5,5.0,1.9,Iris-virginica
149149
6.5,3.0,5.2,2.0,Iris-virginica
150150
6.2,3.4,5.4,2.3,Iris-virginica
151-
5.9,3.0,5.1,1.8,Iris-virginica
151+
5.9,3.0,5.1,1.8,Iris-virginica

doc/source/development/contributing.rst

+10-7
@@ -837,6 +837,9 @@ to run its checks by running::
837837

838838
without having to have done ``pre-commit install`` beforehand.
839839

840+
Note that if you have conflicting installations of ``virtualenv``, then you may get an
841+
error - see `here <https://github.com/pypa/virtualenv/issues/1875>`_.
842+
840843
Backwards compatibility
841844
~~~~~~~~~~~~~~~~~~~~~~~
842845

@@ -1362,16 +1365,16 @@ environments. If you want to use virtualenv instead, write::
13621365
The ``-E virtualenv`` option should be added to all ``asv`` commands
13631366
that run benchmarks. The default value is defined in ``asv.conf.json``.
13641367

1365-
Running the full test suite can take up to one hour and use up to 3GB of RAM.
1366-
Usually it is sufficient to paste only a subset of the results into the pull
1367-
request to show that the committed changes do not cause unexpected performance
1368-
regressions. You can run specific benchmarks using the ``-b`` flag, which
1369-
takes a regular expression. For example, this will only run tests from a
1370-
``pandas/asv_bench/benchmarks/groupby.py`` file::
1368+
Running the full benchmark suite can be an all-day process, depending on your
1369+
hardware and its resource utilization. However, usually it is sufficient to paste
1370+
only a subset of the results into the pull request to show that the committed changes
1371+
do not cause unexpected performance regressions. You can run specific benchmarks
1372+
using the ``-b`` flag, which takes a regular expression. For example, this will
1373+
only run benchmarks from a ``pandas/asv_bench/benchmarks/groupby.py`` file::
13711374

13721375
asv continuous -f 1.1 upstream/master HEAD -b ^groupby
13731376

1374-
If you want to only run a specific group of tests from a file, you can do it
1377+
If you want to only run a specific group of benchmarks from a file, you can do it
13751378
using ``.`` as a separator. For example::
13761379

13771380
asv continuous -f 1.1 upstream/master HEAD -b groupby.GroupByMethods
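The documentation text in this hunk states that `-b` takes a regular expression. The following is a rough sketch of that kind of selection, assuming the pattern is searched against benchmark names of the form `module.Class.method`. The benchmark names are hypothetical, and this is not asv's actual implementation.

```python
import re

# Hypothetical benchmark names, for illustration only.
benchmarks = [
    "groupby.GroupByMethods.time_dtype_as_group",
    "groupby.Apply.time_scalar_function",
    "frame_methods.Iteration.time_items",
]


def select(pattern, names):
    """Return the names in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Anchoring with "^" keeps everything whose name starts with "groupby".
by_file = select(r"^groupby", benchmarks)

# Adding the class name after "." narrows the selection to one group.
by_group = select(r"groupby.GroupByMethods", benchmarks)

print(by_file)
print(by_group)
```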

doc/source/development/developer.rst

+1-1
@@ -184,4 +184,4 @@ As an example of fully-formed metadata:
184184
'creator': {
185185
'library': 'pyarrow',
186186
'version': '0.13.0'
187-
}}
187+
}}

doc/source/getting_started/intro_tutorials/03_subset_data.rst

+5-5
@@ -27,14 +27,14 @@ This tutorial uses the Titanic data set, stored as CSV. The data
2727
consists of the following data columns:
2828

2929
- PassengerId: Id of every passenger.
30-
- Survived: This feature have value 0 and 1. 0 for not survived and 1
30+
- Survived: This feature has value 0 and 1. 0 for not survived and 1
3131
for survived.
3232
- Pclass: There are 3 classes: Class 1, Class 2 and Class 3.
3333
- Name: Name of passenger.
3434
- Sex: Gender of passenger.
3535
- Age: Age of passenger.
36-
- SibSp: Indication that passenger have siblings and spouse.
37-
- Parch: Whether a passenger is alone or have family.
36+
- SibSp: Indication that passengers have siblings and spouses.
37+
- Parch: Whether a passenger is alone or has a family.
3838
- Ticket: Ticket number of passenger.
3939
- Fare: Indicating the fare.
4040
- Cabin: The cabin of passenger.
@@ -199,7 +199,7 @@ selection brackets ``[]``. Only rows for which the value is ``True``
199199
will be selected.
200200

201201
We know from before that the original Titanic ``DataFrame`` consists of
202-
891 rows. Let’s have a look at the amount of rows which satisfy the
202+
891 rows. Let’s have a look at the number of rows which satisfy the
203203
condition by checking the ``shape`` attribute of the resulting
204204
``DataFrame`` ``above_35``:
205205

@@ -398,7 +398,7 @@ See the user guide section on :ref:`different choices for indexing <indexing.cho
398398
<div class="d-flex flex-row gs-torefguide">
399399
<span class="badge badge-info">To user guide</span>
400400

401-
A full overview about indexing is provided in the user guide pages on :ref:`indexing and selecting data <indexing>`.
401+
A full overview of indexing is provided in the user guide pages on :ref:`indexing and selecting data <indexing>`.
402402

403403
.. raw:: html
404404

doc/source/getting_started/intro_tutorials/04_plotting.rst

+2-2
@@ -167,7 +167,7 @@ I want each of the columns in a separate subplot.
167167
@savefig 04_airqual_area_subplot.png
168168
axs = air_quality.plot.area(figsize=(12, 4), subplots=True)
169169
170-
Separate subplots for each of the data columns is supported by the ``subplots`` argument
170+
Separate subplots for each of the data columns are supported by the ``subplots`` argument
171171
of the ``plot`` functions. The builtin options available in each of the pandas plot
172172
functions that are worthwhile to have a look.
173173

@@ -214,7 +214,7 @@ I want to further customize, extend or save the resulting plot.
214214
</li>
215215
</ul>
216216

217-
Each of the plot objects created by pandas are a
217+
Each of the plot objects created by pandas is a
218218
`matplotlib <https://matplotlib.org/>`__ object. As Matplotlib provides
219219
plenty of options to customize plots, making the link between pandas and
220220
Matplotlib explicit enables all the power of matplotlib to the plot.

doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst

+4-1
@@ -123,7 +123,10 @@ aggregating statistics for given columns can be defined using the
123123
.. ipython:: python
124124
125125
titanic.agg(
126-
{"Age": ["min", "max", "median", "skew"], "Fare": ["min", "max", "median", "mean"]}
126+
{
127+
"Age": ["min", "max", "median", "skew"],
128+
"Fare": ["min", "max", "median", "mean"],
129+
}
127130
)
128131
129132
.. raw:: html

doc/source/getting_started/intro_tutorials/index.rst

-1
@@ -19,4 +19,3 @@ Getting started tutorials
1919
08_combine_dataframes
2020
09_timeseries
2121
10_text_data
22-

doc/source/getting_started/overview.rst

-1
@@ -174,4 +174,3 @@ License
174174
-------
175175

176176
.. literalinclude:: ../../../LICENSE
177-

doc/source/reference/general_utility_functions.rst

-1
@@ -122,4 +122,3 @@ Bug report function
122122
:toctree: api/
123123

124124
show_versions
125-
