Commit 6e2d063

Merge remote-tracking branch 'upstream/master' into bisect
2 parents b9f0344 + 5756039 commit 6e2d063

351 files changed, +15035 -10661 lines changed

.github/workflows/ci.yml

+1 -1
@@ -125,7 +125,7 @@ jobs:
     # This can be removed when the ipython directive fails when there are errors,
     # including the `tee sphinx.log` in te previous step (https://github.com/ipython/ipython/issues/11547)
     - name: Check ipython directive errors
-      run: "! grep -B1 \"^<<<-------------------------------------------------------------------------$\" sphinx.log"
+      run: "! grep -B10 \"^<<<-------------------------------------------------------------------------$\" sphinx.log"
 
     - name: Install ssh key
       run: |
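
Note on the `-B1` to `-B10` change: `grep -B N` prints up to N lines of leading context before each match, so the failing step now shows more of `sphinx.log` ahead of the error marker. The sketch below emulates that behaviour in Python for illustration only. The function name `context_before_matches`, the sample log, and the simplified pattern `^<<<-+$` (any number of dashes) are assumptions, and grep's `--` group separators are not reproduced.

```python
import re
from typing import List

# Simplified stand-in for the marker pattern used in the workflow (assumption).
MARKER = re.compile(r"^<<<-+$")


def context_before_matches(lines: List[str], before: int) -> List[str]:
    """Return each matching line preceded by up to `before` lines of context."""
    shown = set()
    for i, line in enumerate(lines):
        if MARKER.search(line):
            # Keep the match itself plus the `before` lines that precede it.
            shown.update(range(max(0, i - before), i + 1))
    return [lines[i] for i in sorted(shown)]


# Hypothetical log: twelve ordinary lines followed by one marker line.
log = [f"line {n}" for n in range(1, 13)] + ["<<<" + "-" * 73]

short_context = context_before_matches(log, before=1)
long_context = context_before_matches(log, before=10)
print(len(short_context), len(long_context))
```

With one line of context only the line directly above the marker is visible; with ten, the output starts ten lines earlier, which is usually enough to see the directive that produced the error.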

.github/workflows/stale-pr.yml

+3 -3
@@ -2,7 +2,7 @@ name: "Stale PRs"
 on:
   schedule:
     # * is a special character in YAML so you have to quote this string
-    - cron: "0 */6 * * *"
+    - cron: "0 0 * * *"
 
 jobs:
   stale:
@@ -11,8 +11,8 @@ jobs:
       - uses: actions/stale@v3
         with:
           repo-token: ${{ secrets.GITHUB_TOKEN }}
-          stale-pr-message: "This pull request is stale because it has been open for thirty days with no activity."
-          skip-stale-pr-message: true
+          stale-pr-message: "This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this."
+          skip-stale-pr-message: false
           stale-pr-label: "Stale"
           exempt-pr-labels: "Needs Review,Blocked,Needs Discussion"
           days-before-stale: 30
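
Note on the schedule change: the old expression `0 */6 * * *` fires at minute 0 of every sixth hour, while the new `0 0 * * *` fires once a day at midnight. The sketch below expands only the minute and hour fields to make the difference concrete. The helper names are hypothetical, and it handles just `*`, `*/n`, and single integers, not the full cron grammar.

```python
from typing import List, Tuple


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one simple cron field: '*', '*/n', or a single integer."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(low, high + 1, step))
    return [int(field)]


def daily_run_times(expr: str) -> List[Tuple[int, int]]:
    """Return the (hour, minute) pairs at which the expression fires in one day."""
    minute, hour, _day_of_month, _month, _day_of_week = expr.split()
    return [
        (h, m)
        for h in expand_cron_field(hour, 0, 23)
        for m in expand_cron_field(minute, 0, 59)
    ]


old_runs = daily_run_times("0 */6 * * *")
new_runs = daily_run_times("0 0 * * *")
print(old_runs)  # [(0, 0), (6, 0), (12, 0), (18, 0)]
print(new_runs)  # [(0, 0)]
```

So the job goes from four runs a day to one, which fits the other change in this file: the stale message is now actually posted (`skip-stale-pr-message: false`).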

.pre-commit-config.yaml

+5 -15
@@ -4,7 +4,7 @@ repos:
   hooks:
   - id: black
 - repo: https://gitlab.com/pycqa/flake8
-  rev: 3.8.3
+  rev: 3.8.4
   hooks:
   - id: flake8
     additional_dependencies: [flake8-comprehensions>=3.1.0]
@@ -34,20 +34,6 @@ repos:
   rev: v1.6.0
   hooks:
   - id: rst-backticks
-    # these exclusions should be removed and the files fixed
-    exclude: (?x)(
-        categorical\.rst|
-        contributing\.rst|
-        contributing_docstring\.rst|
-        extending\.rst|
-        ecosystem\.rst|
-        comparison_with_sql\.rst|
-        install\.rst|
-        calculate_statistics\.rst|
-        combine_dataframes\.rst|
-        v0\.|
-        v1\.0\.|
-        v1\.1\.[012])
 - repo: local
   hooks:
   - id: pip_to_conda
@@ -57,3 +43,7 @@ repos:
     entry: python -m scripts.generate_pip_deps_from_conda
     files: ^(environment.yml|requirements-dev.txt)$
     pass_filenames: false
+- repo: https://github.com/asottile/yesqa
+  rev: v1.2.2
+  hooks:
+  - id: yesqa

.travis.yml

+10 -4
@@ -46,16 +46,16 @@ matrix:
     - env:
       - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network and not clipboard)"
 
-    - arch: arm64
-      env:
-        - JOB="3.7, arm64" PYTEST_WORKERS=1 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard and not arm_slow)"
-
     - env:
       - JOB="3.7, locale" ENV_FILE="ci/deps/travis-37-locale.yaml" PATTERN="((not slow and not network and not clipboard) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
       services:
         - mysql
         - postgresql
 
+    - arch: arm64
+      env:
+        - JOB="3.7, arm64" PYTEST_WORKERS=1 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard and not arm_slow)"
+
     - env:
       # Enabling Deprecations when running tests
       # PANDAS_TESTING_MODE="deprecate" causes DeprecationWarning messages to be displayed in the logs
@@ -65,6 +65,12 @@ matrix:
         - mysql
         - postgresql
 
+  allow_failures:
+    # Moved to allowed_failures 2020-09-29 due to timeouts https://github.com/pandas-dev/pandas/issues/36719
+    - arch: arm64
+      env:
+        - JOB="3.7, arm64" PYTEST_WORKERS=1 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard and not arm_slow)"
+
 
 before_install:
   - echo "before_install"

asv_bench/benchmarks/dtypes.py

+57 -0
@@ -1,5 +1,9 @@
+import string
+
 import numpy as np
 
+from pandas import DataFrame
+import pandas._testing as tm
 from pandas.api.types import pandas_dtype
 
 from .pandas_vb_common import (
@@ -62,4 +66,57 @@ def time_infer(self, dtype):
         lib.infer_dtype(self.data_dict[dtype], skipna=False)
 
 
+class SelectDtypes:
+
+    params = [
+        tm.ALL_INT_DTYPES
+        + tm.ALL_EA_INT_DTYPES
+        + tm.FLOAT_DTYPES
+        + tm.COMPLEX_DTYPES
+        + tm.DATETIME64_DTYPES
+        + tm.TIMEDELTA64_DTYPES
+        + tm.BOOL_DTYPES
+    ]
+    param_names = ["dtype"]
+
+    def setup(self, dtype):
+        N, K = 5000, 50
+        self.index = tm.makeStringIndex(N)
+        self.columns = tm.makeStringIndex(K)
+
+        def create_df(data):
+            return DataFrame(data, index=self.index, columns=self.columns)
+
+        self.df_int = create_df(np.random.randint(low=100, size=(N, K)))
+        self.df_float = create_df(np.random.randn(N, K))
+        self.df_bool = create_df(np.random.choice([True, False], size=(N, K)))
+        self.df_string = create_df(
+            np.random.choice(list(string.ascii_letters), size=(N, K))
+        )
+
+    def time_select_dtype_int_include(self, dtype):
+        self.df_int.select_dtypes(include=dtype)
+
+    def time_select_dtype_int_exclude(self, dtype):
+        self.df_int.select_dtypes(exclude=dtype)
+
+    def time_select_dtype_float_include(self, dtype):
+        self.df_float.select_dtypes(include=dtype)
+
+    def time_select_dtype_float_exclude(self, dtype):
+        self.df_float.select_dtypes(exclude=dtype)
+
+    def time_select_dtype_bool_include(self, dtype):
+        self.df_bool.select_dtypes(include=dtype)
+
+    def time_select_dtype_bool_exclude(self, dtype):
+        self.df_bool.select_dtypes(exclude=dtype)
+
+    def time_select_dtype_string_include(self, dtype):
+        self.df_string.select_dtypes(include=dtype)
+
+    def time_select_dtype_string_exclude(self, dtype):
+        self.df_string.select_dtypes(exclude=dtype)
+
+
 from .pandas_vb_common import setup  # noqa: F401 isort:skip
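
Note on what the new `SelectDtypes` benchmark measures: each timing method calls `DataFrame.select_dtypes` with either `include=` or `exclude=` for a single dtype. A minimal sketch of that call on a small, made-up frame follows. The column names and values are illustrative assumptions, not taken from the benchmark.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with one column per dtype family.
df = pd.DataFrame(
    {
        "ints": np.array([1, 2, 3], dtype="int64"),
        "floats": np.array([0.5, 1.5, 2.5], dtype="float64"),
        "flags": np.array([True, False, True]),
    }
)

# include= keeps only the columns of the requested dtype ...
included = df.select_dtypes(include="int64")
# ... while exclude= keeps every column except those of that dtype.
excluded = df.select_dtypes(exclude="int64")

print(list(included.columns))  # ['ints']
print(list(excluded.columns))  # ['floats', 'flags']
```

The benchmark applies the same two calls to 5000 x 50 frames of a single dtype, so for most parameter values the result is either the whole frame or an empty one.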

asv_bench/benchmarks/indexing.py

+1 -1
@@ -191,7 +191,7 @@ def setup(self, index):
         }
         index = indexes[index]
         self.s = Series(np.random.rand(N), index=index)
-        self.indexer = [True, False, True, True, False] * 20000
+        self.indexer = np.random.randint(0, N, size=N)
 
     def time_take(self, index):
         self.s.take(self.indexer)
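
Note on the indexer change: `Series.take` selects elements by integer position, so an array of random positions in `[0, N)` exercises it as a genuine positional lookup. The sketch below shows the same pattern at a smaller, arbitrarily chosen size. The seed, `N`, and the index labels are assumptions for illustration.

```python
import numpy as np
import pandas as pd

N = 1_000  # hypothetical size, much smaller than a benchmark would use
np.random.seed(0)  # fixed seed only so the example is repeatable

s = pd.Series(np.arange(N) * 10, index=[f"row{i}" for i in range(N)])

# Random integer positions, possibly repeated, as in the updated benchmark.
indexer = np.random.randint(0, N, size=N)
taken = s.take(indexer)

# Position i of the result holds the value stored at position indexer[i].
print(len(taken), bool((taken.to_numpy() == indexer * 10).all()))
```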

asv_bench/benchmarks/pandas_vb_common.py

+1 -1
@@ -15,7 +15,7 @@
 
 # Compatibility import for the testing module
 try:
-    import pandas._testing as tm  # noqa
+    import pandas._testing as tm
 except ImportError:
     import pandas.util.testing as tm  # noqa
 

asv_bench/benchmarks/tslibs/offsets.py

+1 -1
@@ -9,7 +9,7 @@
 from pandas import offsets
 
 try:
-    import pandas.tseries.holiday  # noqa
+    import pandas.tseries.holiday
 except ImportError:
     pass
 

ci/code_checks.sh

+1 -1
@@ -335,7 +335,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
     RET=$(($RET + $?)) ; echo $MSG "DONE"
 
     MSG='Doctests strings.py' ; echo $MSG
-    pytest -q --doctest-modules pandas/core/strings.py
+    pytest -q --doctest-modules pandas/core/strings/
     RET=$(($RET + $?)) ; echo $MSG "DONE"
 
     # Directories

ci/deps/azure-37-minimum_versions.yaml

+1 -1
@@ -20,7 +20,7 @@ dependencies:
   - numexpr=2.6.8
   - numpy=1.16.5
   - openpyxl=2.6.0
-  - pytables=3.4.4
+  - pytables=3.5.1
   - python-dateutil=2.7.3
   - pytz=2017.3
   - pyarrow=0.15

ci/deps/travis-37-locale.yaml

+2 -2
@@ -13,7 +13,7 @@ dependencies:
 
   # pandas dependencies
   - beautifulsoup4
-  - blosc=1.14.3
+  - blosc=1.15.0
   - python-blosc
   - fastparquet=0.3.2
   - html5lib
@@ -30,7 +30,7 @@ dependencies:
   - pyarrow>=0.17
   - psycopg2=2.7
   - pymysql=0.7.11
-  - pytables
+  - pytables>=3.5.1
   - python-dateutil
   - pytz
   - scipy

codecov.yml

+4 -1
@@ -1,7 +1,7 @@
 codecov:
   branch: master
 
-comment: off
+comment: false
 
 coverage:
   status:
@@ -11,3 +11,6 @@ coverage:
     patch:
       default:
         target: '50'
+
+github_checks:
+  annotations: false

doc/source/conf.py

+5 -5
@@ -146,7 +146,7 @@
 # built documents.
 #
 # The short X.Y version.
-import pandas  # noqa: E402 isort:skip
+import pandas  # isort:skip
 
 # version = '%s r%s' % (pandas.__version__, svn_version())
 version = str(pandas.__version__)
@@ -441,14 +441,14 @@
 # Add custom Documenter to handle attributes/methods of an AccessorProperty
 # eg pandas.Series.str and pandas.Series.dt (see GH9322)
 
-import sphinx  # noqa: E402 isort:skip
-from sphinx.util import rpartition  # noqa: E402 isort:skip
-from sphinx.ext.autodoc import (  # noqa: E402 isort:skip
+import sphinx  # isort:skip
+from sphinx.util import rpartition  # isort:skip
+from sphinx.ext.autodoc import (  # isort:skip
     AttributeDocumenter,
     Documenter,
     MethodDocumenter,
 )
-from sphinx.ext.autosummary import Autosummary  # noqa: E402 isort:skip
+from sphinx.ext.autosummary import Autosummary  # isort:skip
 
 
 class AccessorDocumenter(MethodDocumenter):

doc/source/development/code_style.rst

+3 -2
@@ -9,7 +9,7 @@ pandas code style guide
 .. contents:: Table of contents:
    :local:
 
-*pandas* follows the `PEP8 <https://www.python.org/dev/peps/pep-0008/>`_
+pandas follows the `PEP8 <https://www.python.org/dev/peps/pep-0008/>`_
 standard and uses `Black <https://black.readthedocs.io/en/stable/>`_
 and `Flake8 <https://flake8.pycqa.org/en/latest/>`_ to ensure a
 consistent code format throughout the project. For details see the
@@ -172,5 +172,6 @@ Reading from a url
 .. code-block:: python
 
     from pandas.io.common import urlopen
-    with urlopen('http://www.google.com') as url:
+
+    with urlopen("http://www.google.com") as url:
         raw_text = url.read()

doc/source/development/contributing.rst

+12 -12
@@ -31,13 +31,13 @@ comment letting others know they are working on an issue. While this is ok, you
 check each issue individually, and it's not possible to find the unassigned ones.
 
 For this reason, we implemented a workaround consisting of adding a comment with the exact
-text `take`. When you do it, a GitHub action will automatically assign you the issue
+text ``take``. When you do it, a GitHub action will automatically assign you the issue
 (this will take seconds, and may require refreshing the page to see it).
 By doing this, it's possible to filter the list of issues and find only the unassigned ones.
 
 So, a good way to find an issue to start contributing to pandas is to check the list of
 `unassigned good first issues <https://github.com/pandas-dev/pandas/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22+no%3Aassignee>`_
-and assign yourself one you like by writing a comment with the exact text `take`.
+and assign yourself one you like by writing a comment with the exact text ``take``.
 
 If for whatever reason you are not able to continue working with the issue, please try to
 unassign it, so other people know it's available again. You can check the list of
@@ -133,7 +133,7 @@ want to clone your fork to your machine::
     cd pandas-yourname
     git remote add upstream https://github.com/pandas-dev/pandas.git
 
-This creates the directory `pandas-yourname` and connects your repository to
+This creates the directory ``pandas-yourname`` and connects your repository to
 the upstream (main project) *pandas* repository.
 
 Note that performing a shallow clone (with ``--depth==N``, for some ``N`` greater
@@ -155,12 +155,12 @@ Using a Docker container
 
 Instead of manually setting up a development environment, you can use `Docker
 <https://docs.docker.com/get-docker/>`_ to automatically create the environment with just several
-commands. Pandas provides a `DockerFile` in the root directory to build a Docker image
+commands. pandas provides a ``DockerFile`` in the root directory to build a Docker image
 with a full pandas development environment.
 
 **Docker Commands**
 
-Pass your GitHub username in the `DockerFile` to use your own fork::
+Pass your GitHub username in the ``DockerFile`` to use your own fork::
 
     # Build the image pandas-yourname-env
     docker build --tag pandas-yourname-env .
@@ -172,7 +172,7 @@ Even easier, you can integrate Docker with the following IDEs:
 **Visual Studio Code**
 
 You can use the DockerFile to launch a remote session with Visual Studio Code,
-a popular free IDE, using the `.devcontainer.json` file.
+a popular free IDE, using the ``.devcontainer.json`` file.
 See https://code.visualstudio.com/docs/remote/containers for details.
 
 **PyCharm (Professional)**
@@ -190,7 +190,7 @@ Note that you might need to rebuild the C extensions if/when you merge with upst
 Installing a C compiler
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-Pandas uses C extensions (mostly written using Cython) to speed up certain
+pandas uses C extensions (mostly written using Cython) to speed up certain
 operations. To install pandas from source, you need to compile these C
 extensions, which means you need a C compiler. This process depends on which
 platform you're using.
@@ -782,7 +782,7 @@ As part of :ref:`Continuous Integration <contributing.ci>` checks we run::
 
     isort --check-only pandas
 
-to check that imports are correctly formatted as per the `setup.cfg`.
+to check that imports are correctly formatted as per the ``setup.cfg``.
 
 If you see output like the below in :ref:`Continuous Integration <contributing.ci>` checks:
 
@@ -979,7 +979,7 @@ For example, quite a few functions in pandas accept a ``dtype`` argument. This c
    def as_type(dtype: Dtype) -> ...:
        ...
 
-This module will ultimately house types for repeatedly used concepts like "path-like", "array-like", "numeric", etc... and can also hold aliases for commonly appearing parameters like `axis`. Development of this module is active so be sure to refer to the source for the most up to date list of available types.
+This module will ultimately house types for repeatedly used concepts like "path-like", "array-like", "numeric", etc... and can also hold aliases for commonly appearing parameters like ``axis``. Development of this module is active so be sure to refer to the source for the most up to date list of available types.
 
 Validating type hints
 ~~~~~~~~~~~~~~~~~~~~~
@@ -1219,7 +1219,7 @@ This test shows off several useful features of Hypothesis, as well as
 demonstrating a good use-case: checking properties that should hold over
 a large or complicated domain of inputs.
 
-To keep the Pandas test suite running quickly, parametrized tests are
+To keep the pandas test suite running quickly, parametrized tests are
 preferred if the inputs or logic are simple, with Hypothesis tests reserved
 for cases with complex logic or where there are too many combinations of
 options or subtle interactions to test (or think of!) all of them.
@@ -1302,7 +1302,7 @@ Or with one of the following constructs::
 
 Using `pytest-xdist <https://pypi.org/project/pytest-xdist>`_, one can
 speed up local testing on multicore machines. To use this feature, you will
-need to install `pytest-xdist` via::
+need to install ``pytest-xdist`` via::
 
     pip install pytest-xdist
 
@@ -1465,7 +1465,7 @@ The following defines how a commit message should be structured. Please referen
 relevant GitHub issues in your commit message using GH1234 or #1234. Either style
 is fine, but the former is generally preferred:
 
-* a subject line with `< 80` chars.
+* a subject line with ``< 80`` chars.
 * One blank line.
 * Optionally, a commit message body.
