
Commit c0b1ab2

Merge branch 'master' of https://github.com/pandas-dev/pandas into 19356
2 parents: 1e4d572 + f34fe62

File tree: 194 files changed, +5310 -4571 lines


.github/workflows/ci.yml  (+2 -2)

@@ -18,7 +18,7 @@ jobs:
     steps:

     - name: Setting conda path
-      run: echo "::add-path::${HOME}/miniconda3/bin"
+      run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH

     - name: Checkout
       uses: actions/checkout@v1
@@ -98,7 +98,7 @@ jobs:
     steps:

     - name: Setting conda path
-      run: echo "::set-env name=PATH::${HOME}/miniconda3/bin:${PATH}"
+      run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH

     - name: Checkout
       uses: actions/checkout@v1

.travis.yml  (+1 -6)

@@ -35,11 +35,6 @@ matrix:
   fast_finish: true

   include:
-    - dist: bionic
-      python: 3.9-dev
-      env:
-        - JOB="3.9-dev" PATTERN="(not slow and not network and not clipboard)"
-
     - env:
         - JOB="3.8, slow" ENV_FILE="ci/deps/travis-38-slow.yaml" PATTERN="slow" SQL="1"
       services:
@@ -94,7 +89,7 @@ install:
 script:
   - echo "script start"
   - echo "$JOB"
-  - if [ "$JOB" != "3.9-dev" ]; then source activate pandas-dev; fi
+  - source activate pandas-dev
   - ci/run_tests.sh

 after_script:
New file  (+164)

@@ -0,0 +1,164 @@
import numpy as np

import pandas as pd


class IsinAlmostFullWithRandomInt:
    params = [
        [np.float64, np.int64, np.uint64, np.object],
        range(10, 21),
    ]
    param_names = ["dtype", "exponent"]

    def setup(self, dtype, exponent):
        M = 3 * 2 ** (exponent - 2)
        # 0.77 - the maximal share of occupied buckets
        np.random.seed(42)
        self.s = pd.Series(np.random.randint(0, M, M)).astype(dtype)
        self.values = np.random.randint(0, M, M).astype(dtype)
        self.values_outside = self.values + M

    def time_isin(self, dtype, exponent):
        self.s.isin(self.values)

    def time_isin_outside(self, dtype, exponent):
        self.s.isin(self.values_outside)


class IsinWithRandomFloat:
    params = [
        [np.float64, np.object],
        [
            1_300,
            2_000,
            7_000,
            8_000,
            70_000,
            80_000,
            750_000,
            900_000,
        ],
    ]
    param_names = ["dtype", "M"]

    def setup(self, dtype, M):
        np.random.seed(42)
        self.values = np.random.rand(M)
        self.s = pd.Series(self.values).astype(dtype)
        np.random.shuffle(self.values)
        self.values_outside = self.values + 0.1

    def time_isin(self, dtype, M):
        self.s.isin(self.values)

    def time_isin_outside(self, dtype, M):
        self.s.isin(self.values_outside)


class IsinWithArangeSorted:
    params = [
        [np.float64, np.int64, np.uint64, np.object],
        [
            1_000,
            2_000,
            8_000,
            100_000,
            1_000_000,
        ],
    ]
    param_names = ["dtype", "M"]

    def setup(self, dtype, M):
        self.s = pd.Series(np.arange(M)).astype(dtype)
        self.values = np.arange(M).astype(dtype)

    def time_isin(self, dtype, M):
        self.s.isin(self.values)


class IsinWithArange:
    params = [
        [np.float64, np.int64, np.uint64, np.object],
        [
            1_000,
            2_000,
            8_000,
        ],
        [-2, 0, 2],
    ]
    param_names = ["dtype", "M", "offset_factor"]

    def setup(self, dtype, M, offset_factor):
        offset = int(M * offset_factor)
        np.random.seed(42)
        tmp = pd.Series(np.random.randint(offset, M + offset, 10 ** 6))
        self.s = tmp.astype(dtype)
        self.values = np.arange(M).astype(dtype)

    def time_isin(self, dtype, M, offset_factor):
        self.s.isin(self.values)


class Float64GroupIndex:
    # GH28303
    def setup(self):
        self.df = pd.date_range(
            start="1/1/2018", end="1/2/2018", periods=1e6
        ).to_frame()
        self.group_index = np.round(self.df.index.astype(int) / 1e9)

    def time_groupby(self):
        self.df.groupby(self.group_index).last()


class UniqueAndFactorizeArange:
    params = range(4, 16)
    param_names = ["exponent"]

    def setup(self, exponent):
        a = np.arange(10 ** 4, dtype="float64")
        self.a2 = (a + 10 ** exponent).repeat(100)

    def time_factorize(self, exponent):
        pd.factorize(self.a2)

    def time_unique(self, exponent):
        pd.unique(self.a2)


class NumericSeriesIndexing:

    params = [
        (pd.Int64Index, pd.UInt64Index, pd.Float64Index),
        (10 ** 4, 10 ** 5, 5 * 10 ** 5, 10 ** 6, 5 * 10 ** 6),
    ]
    param_names = ["index_dtype", "N"]

    def setup(self, index, N):
        vals = np.array(list(range(55)) + [54] + list(range(55, N - 1)))
        indices = index(vals)
        self.data = pd.Series(np.arange(N), index=indices)

    def time_loc_slice(self, index, N):
        # trigger building of mapping
        self.data.loc[:800]


class NumericSeriesIndexingShuffled:

    params = [
        (pd.Int64Index, pd.UInt64Index, pd.Float64Index),
        (10 ** 4, 10 ** 5, 5 * 10 ** 5, 10 ** 6, 5 * 10 ** 6),
    ]
    param_names = ["index_dtype", "N"]

    def setup(self, index, N):
        vals = np.array(list(range(55)) + [54] + list(range(55, N - 1)))
        np.random.seed(42)
        np.random.shuffle(vals)
        indices = index(vals)
        self.data = pd.Series(np.arange(N), index=indices)

    def time_loc_slice(self, index, N):
        # trigger building of mapping
        self.data.loc[:800]
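
For orientation, the benchmark classes above follow the ASV convention: setup builds the data for each parameter combination and each time_* method is timed in isolation. A minimal standalone sketch of the operation the time_isin benchmarks measure (variable names here are illustrative, not taken from the commit) is:

import numpy as np
import pandas as pd

# Series.isin builds a hash table from `values` and probes it once per element,
# which is the code path the isin benchmarks above exercise.
np.random.seed(42)
M = 10_000
s = pd.Series(np.random.randint(0, M, M))

values_inside = np.random.randint(0, M, M)   # values that overlap with s
values_outside = values_inside + M           # values guaranteed absent from s

mask_inside = s.isin(values_inside)          # many True entries
mask_outside = s.isin(values_outside)        # all False
print(mask_inside.mean(), mask_outside.mean())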

ci/azure/posix.yml  (+5)

@@ -61,6 +61,11 @@ jobs:
         PANDAS_TESTING_MODE: "deprecate"
         EXTRA_APT: "xsel"

+      py39:
+        ENV_FILE: ci/deps/azure-39.yaml
+        CONDA_PY: "39"
+        PATTERN: "not slow and not network and not clipboard"
+
   steps:
   - script: |
       if [ "$(uname)" == "Linux" ]; then

ci/build39.sh  (-12)
This file was deleted.

ci/check_cache.sh  (-27)
This file was deleted.

ci/code_checks.sh  (+1 -1)

@@ -225,7 +225,7 @@ fi
 ### DOCSTRINGS ###
 if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then

-    MSG='Validate docstrings (GL03, GL04, GL05, GL06, GL07, GL09, GL10, SS02, SS04, SS05, PR03, PR04, PR05, PR10, EX04, RT01, RT04, RT05, SA02, SA03)' ; echo $MSG
+    MSG='Validate docstrings (GL03, GL04, GL05, GL06, GL07, GL09, GL10, SS01, SS02, SS04, SS05, PR03, PR04, PR05, PR10, EX04, RT01, RT04, RT05, SA02, SA03)' ; echo $MSG
     $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=GL03,GL04,GL05,GL06,GL07,GL09,GL10,SS02,SS04,SS05,PR03,PR04,PR05,PR10,EX04,RT01,RT04,RT05,SA02,SA03
     RET=$(($RET + $?)) ; echo $MSG "DONE"

ci/deps/azure-38-locale.yaml  (+1 -1)

@@ -34,7 +34,7 @@ dependencies:
   - xlsxwriter
   - xlwt
   - moto
-  - pyarrow>=0.15
+  - pyarrow=1.0.0
   - pip
   - pip:
     - pyxlsb

ci/deps/azure-39.yaml  (new file, +17)

@@ -0,0 +1,17 @@
name: pandas-dev
channels:
  - conda-forge
dependencies:
  - python=3.9.*

  # tools
  - cython>=0.29.21
  - pytest>=5.0.1
  - pytest-xdist>=1.21
  - hypothesis>=3.58.0
  - pytest-azurepipelines

  # pandas dependencies
  - numpy
  - python-dateutil
  - pytz

ci/setup_env.sh  (-5)

@@ -1,10 +1,5 @@
 #!/bin/bash -e

-if [ "$JOB" == "3.9-dev" ]; then
-    /bin/bash ci/build39.sh
-    exit 0
-fi
-
 # edit the locale file if needed
 if [[ "$(uname)" == "Linux" && -n "$LC_ALL" ]]; then
     echo "Adding locale to the first line of pandas/__init__.py"

doc/source/development/contributing.rst  (+1 -1)

@@ -442,7 +442,7 @@ Some other important things to know about the docs:

   contributing_docstring.rst

-* The tutorials make heavy use of the `ipython directive
+* The tutorials make heavy use of the `IPython directive
   <https://matplotlib.org/sampledoc/ipython_directive.html>`_ sphinx extension.
   This directive lets you put code in the documentation which will be run
   during the doc build. For example::

doc/source/development/contributing_docstring.rst  (+5 -5)

@@ -63,14 +63,14 @@ The first conventions every Python docstring should follow are defined in
 `PEP-257 <https://www.python.org/dev/peps/pep-0257/>`_.

 As PEP-257 is quite broad, other more specific standards also exist. In the
-case of pandas, the numpy docstring convention is followed. These conventions are
+case of pandas, the NumPy docstring convention is followed. These conventions are
 explained in this document:

 * `numpydoc docstring guide <https://numpydoc.readthedocs.io/en/latest/format.html>`_
   (which is based in the original `Guide to NumPy/SciPy documentation
   <https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_)

-numpydoc is a Sphinx extension to support the numpy docstring convention.
+numpydoc is a Sphinx extension to support the NumPy docstring convention.

 The standard uses reStructuredText (reST). reStructuredText is a markup
 language that allows encoding styles in plain text files. Documentation
@@ -401,7 +401,7 @@ DataFrame:
 * pandas.Categorical
 * pandas.arrays.SparseArray

-If the exact type is not relevant, but must be compatible with a numpy
+If the exact type is not relevant, but must be compatible with a NumPy
 array, array-like can be specified. If Any type that can be iterated is
 accepted, iterable can be used:

@@ -819,7 +819,7 @@ positional arguments ``head(3)``.
     """
     A sample DataFrame method.

-    Do not import numpy and pandas.
+    Do not import NumPy and pandas.

     Try to use meaningful data, when it makes the example easier
     to understand.
@@ -854,7 +854,7 @@ Tips for getting your examples pass the doctests
 Getting the examples pass the doctests in the validation script can sometimes
 be tricky. Here are some attention points:

-* Import all needed libraries (except for pandas and numpy, those are already
+* Import all needed libraries (except for pandas and NumPy, those are already
   imported as ``import pandas as pd`` and ``import numpy as np``) and define
   all variables you use in the example.
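
For context, a minimal sketch of a docstring that follows the NumPy convention described in this file might look like the following (the head method shown is illustrative only and is not part of this commit):

def head(self, n=5):
    """
    Return the first n rows.

    Parameters
    ----------
    n : int, default 5
        Number of rows to return.

    Returns
    -------
    DataFrame
        The first n rows of the caller.

    Examples
    --------
    >>> df = pd.DataFrame({"a": [1, 2, 3, 4]})
    >>> df.head(2)
       a
    0  1
    1  2
    """
    return self.iloc[:n]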

doc/source/development/extending.rst  (+1 -1)

@@ -219,7 +219,7 @@ and re-boxes it if necessary.

 If applicable, we highly recommend that you implement ``__array_ufunc__`` in your
 extension array to avoid coercion to an ndarray. See
-`the numpy documentation <https://numpy.org/doc/stable/reference/generated/numpy.lib.mixins.NDArrayOperatorsMixin.html>`__
+`the NumPy documentation <https://numpy.org/doc/stable/reference/generated/numpy.lib.mixins.NDArrayOperatorsMixin.html>`__
 for an example.

 As part of your implementation, we require that you defer to pandas when a pandas
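
As a rough illustration of that recommendation, a minimal sketch of an array type that implements __array_ufunc__ and defers to pandas containers might look like this (the class and attribute names are hypothetical and not taken from pandas):

import numpy as np
import pandas as pd


class MyExtensionArray(np.lib.mixins.NDArrayOperatorsMixin):
    """Hypothetical minimal array wrapper; not a real pandas ExtensionArray."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Defer to pandas containers so Series/DataFrame/Index stay in control
        # of the operation and can re-box the result themselves.
        if any(isinstance(x, (pd.Series, pd.DataFrame, pd.Index)) for x in inputs):
            return NotImplemented
        # Unbox our own instances and apply the ufunc to the underlying ndarrays.
        arrays = [x._data if isinstance(x, MyExtensionArray) else x for x in inputs]
        result = getattr(ufunc, method)(*arrays, **kwargs)
        if method == "__call__":
            return type(self)(result)
        return result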

doc/source/development/index.rst  (+1)

@@ -16,6 +16,7 @@ Development
    code_style
    maintaining
    internals
+   test_writing
    extending
    developer
    policies