
Commit 9814800

Merge branch 'master' into duplicate-cut

2 parents: bb70ece + c1aea79

171 files changed, +3159 −2095 lines changed


.github/workflows/ci.yml

+5-4
@@ -4,6 +4,7 @@ on:
   push:
     branches:
       - master
+      - 1.3.x
   pull_request:
     branches:
       - master
@@ -23,7 +24,7 @@ jobs:
 
     concurrency:
       group: ${{ github.ref }}-checks
-      cancel-in-progress: true
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
 
     steps:
     - name: Checkout
@@ -132,15 +133,15 @@ jobs:
         echo "${{ secrets.server_ssh_key }}" > ~/.ssh/id_rsa
         chmod 600 ~/.ssh/id_rsa
         echo "${{ secrets.server_ip }} ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBE1Kkopomm7FHG5enATf7SgnpICZ4W2bw+Ho+afqin+w7sMcrsa0je7sbztFAV8YchDkiBKnWTG4cRT+KZgZCaY=" > ~/.ssh/known_hosts
-      if: github.event_name == 'push'
+      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/master'}}
 
     - name: Upload web
       run: rsync -az --delete --exclude='pandas-docs' --exclude='docs' --exclude='Pandas_Cheat_Sheet*' web/build/ docs@${{ secrets.server_ip }}:/usr/share/nginx/pandas
-      if: github.event_name == 'push'
+      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/master'}}
 
     - name: Upload dev docs
       run: rsync -az --delete doc/build/html/ docs@${{ secrets.server_ip }}:/usr/share/nginx/pandas/pandas-docs/dev
-      if: github.event_name == 'push'
+      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/master'}}
 
     - name: Move docs into site directory
       run: mv doc/build/html web/build/docs

.github/workflows/database.yml

+1-1
@@ -31,7 +31,7 @@ jobs:
 
     concurrency:
       group: ${{ github.ref }}-${{ matrix.ENV_FILE }}
-      cancel-in-progress: true
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
 
     services:
       mysql:

.github/workflows/posix.yml

+2-1
@@ -4,6 +4,7 @@ on:
   push:
     branches:
       - master
+      - 1.3.x
   pull_request:
     branches:
       - master
@@ -45,7 +46,7 @@ jobs:
       TEST_ARGS: ${{ matrix.settings[6] }}
     concurrency:
       group: ${{ github.ref }}-${{ matrix.settings[0] }}
-      cancel-in-progress: true
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
 
     steps:
     - name: Checkout

.github/workflows/pre-commit.yml

+2-1
@@ -5,13 +5,14 @@ on:
   push:
     branches:
       - master
+      - 1.3.x
 
 jobs:
   pre-commit:
     runs-on: ubuntu-latest
     concurrency:
       group: ${{ github.ref }}-pre-commit
-      cancel-in-progress: true
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
     steps:
     - uses: actions/checkout@v2
     - uses: actions/setup-python@v2

.github/workflows/python-dev.yml

+2-1
@@ -8,6 +8,7 @@ on:
   pull_request:
     branches:
       - master
+      - 1.3.x
     paths-ignore:
       - "doc/**"
 
@@ -25,7 +26,7 @@ jobs:
 
     concurrency:
       group: ${{ github.ref }}-dev
-      cancel-in-progress: true
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
 
     steps:
     - uses: actions/checkout@v2

.github/workflows/sdist.yml

+4
@@ -4,6 +4,7 @@ on:
   push:
     branches:
       - master
+      - 1.3.x
   pull_request:
     branches:
       - master
@@ -23,6 +24,9 @@ jobs:
       fail-fast: false
       matrix:
         python-version: ["3.8", "3.9"]
+    concurrency:
+      group: ${{github.ref}}-${{matrix.python-version}}-sdist
+      cancel-in-progress: ${{github.event_name == 'pull_request'}}
 
     steps:
     - uses: actions/checkout@v2

.pre-commit-config.yaml

+2-2
@@ -53,11 +53,11 @@ repos:
         types: [text]
         args: [--append-config=flake8/cython-template.cfg]
 -   repo: https://github.com/PyCQA/isort
-    rev: 5.9.1
+    rev: 5.9.2
     hooks:
     -   id: isort
 -   repo: https://github.com/asottile/pyupgrade
-    rev: v2.20.0
+    rev: v2.21.0
     hooks:
     -   id: pyupgrade
         args: [--py38-plus]

asv_bench/benchmarks/algos/isin.py

+3-15
@@ -1,10 +1,5 @@
 import numpy as np
 
-try:
-    from pandas.compat import np_version_under1p20
-except ImportError:
-    from pandas.compat.numpy import _np_version_under1p20 as np_version_under1p20
-
 from pandas import (
     Categorical,
     NaT,
@@ -283,10 +278,6 @@ class IsInLongSeriesLookUpDominates:
     def setup(self, dtype, MaxNumber, series_type):
         N = 10 ** 7
 
-        # https://github.com/pandas-dev/pandas/issues/39844
-        if not np_version_under1p20 and dtype in ("Int64", "Float64"):
-            raise NotImplementedError
-
         if series_type == "random_hits":
             array = np.random.randint(0, MaxNumber, N)
         if series_type == "random_misses":
@@ -297,7 +288,8 @@ def setup(self, dtype, MaxNumber, series_type):
             array = np.arange(N) + MaxNumber
 
         self.series = Series(array).astype(dtype)
-        self.values = np.arange(MaxNumber).astype(dtype)
+
+        self.values = np.arange(MaxNumber).astype(dtype.lower())
 
     def time_isin(self, dtypes, MaxNumber, series_type):
         self.series.isin(self.values)
@@ -313,16 +305,12 @@ class IsInLongSeriesValuesDominate:
     def setup(self, dtype, series_type):
         N = 10 ** 7
 
-        # https://github.com/pandas-dev/pandas/issues/39844
-        if not np_version_under1p20 and dtype in ("Int64", "Float64"):
-            raise NotImplementedError
-
         if series_type == "random":
             vals = np.random.randint(0, 10 * N, N)
         if series_type == "monotone":
             vals = np.arange(N)
 
-        self.values = vals.astype(dtype)
+        self.values = vals.astype(dtype.lower())
         M = 10 ** 6 + 1
         self.series = Series(np.arange(M)).astype(dtype)
 

asv_bench/benchmarks/frame_methods.py

+16
@@ -232,6 +232,22 @@ def time_to_html_mixed(self):
         self.df2.to_html()
 
 
+class ToDict:
+    params = [["dict", "list", "series", "split", "records", "index"]]
+    param_names = ["orient"]
+
+    def setup(self, orient):
+        data = np.random.randint(0, 1000, size=(10000, 4))
+        self.int_df = DataFrame(data)
+        self.datetimelike_df = self.int_df.astype("timedelta64[ns]")
+
+    def time_to_dict_ints(self, orient):
+        self.int_df.to_dict(orient=orient)
+
+    def time_to_dict_datetimelike(self, orient):
+        self.datetimelike_df.to_dict(orient=orient)
+
+
 class ToNumpy:
     def setup(self):
         N = 10000

asv_bench/benchmarks/io/csv.py

+3-2
@@ -291,7 +291,8 @@ class ReadCSVFloatPrecision(StringIORewind):
 
     def setup(self, sep, decimal, float_precision):
         floats = [
-            "".join(random.choice(string.digits) for _ in range(28)) for _ in range(15)
+            "".join([random.choice(string.digits) for _ in range(28)])
+            for _ in range(15)
         ]
         rows = sep.join([f"0{decimal}" + "{}"] * 3) + "\n"
         data = rows * 5
@@ -395,7 +396,7 @@ class ReadCSVCachedParseDates(StringIORewind):
     param_names = ["do_cache", "engine"]
 
     def setup(self, do_cache, engine):
-        data = ("\n".join(f"10/{year}" for year in range(2000, 2100)) + "\n") * 10
+        data = ("\n".join([f"10/{year}" for year in range(2000, 2100)]) + "\n") * 10
         self.StringIO_input = StringIO(data)
 
     def time_read_csv_cached(self, do_cache, engine):

azure-pipelines.yml

+5-3
@@ -9,9 +9,11 @@ trigger:
     - 'doc/*'
 
 pr:
-- master
-- 1.2.x
-- 1.3.x
+  autoCancel: true
+  branches:
+    include:
+      - master
+      - 1.3.x
 
 variables:
   PYTEST_WORKERS: auto

ci/code_checks.sh

+4-28
@@ -107,44 +107,20 @@ fi
 ### DOCTESTS ###
 if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
 
-    MSG='Doctests for individual files' ; echo $MSG
-    pytest -q --doctest-modules \
-        pandas/core/accessor.py \
-        pandas/core/aggregation.py \
-        pandas/core/algorithms.py \
-        pandas/core/base.py \
-        pandas/core/construction.py \
-        pandas/core/frame.py \
-        pandas/core/generic.py \
-        pandas/core/indexers.py \
-        pandas/core/nanops.py \
-        pandas/core/series.py \
-        pandas/io/sql.py
-    RET=$(($RET + $?)) ; echo $MSG "DONE"
-
-    MSG='Doctests for directories' ; echo $MSG
-    pytest -q --doctest-modules \
+    MSG='Doctests' ; echo $MSG
+    python -m pytest --doctest-modules \
         pandas/_libs/ \
         pandas/api/ \
         pandas/arrays/ \
         pandas/compat/ \
-        pandas/core/array_algos/ \
-        pandas/core/arrays/ \
-        pandas/core/computation/ \
-        pandas/core/dtypes/ \
-        pandas/core/groupby/ \
-        pandas/core/indexes/ \
-        pandas/core/ops/ \
-        pandas/core/reshape/ \
-        pandas/core/strings/ \
-        pandas/core/tools/ \
-        pandas/core/window/ \
+        pandas/core \
         pandas/errors/ \
         pandas/io/clipboard/ \
         pandas/io/json/ \
         pandas/io/excel/ \
         pandas/io/parsers/ \
         pandas/io/sas/ \
+        pandas/io/sql.py \
         pandas/tseries/
     RET=$(($RET + $?)) ; echo $MSG "DONE"

ci/deps/actions-38-db.yaml

+1-1
@@ -15,7 +15,7 @@ dependencies:
   - beautifulsoup4
   - botocore>=1.11
   - dask
-  - fastparquet>=0.4.0
+  - fastparquet>=0.4.0, < 0.7.0
   - fsspec>=0.7.4, <2021.6.0
   - gcsfs>=0.6.0
   - geopandas

ci/deps/azure-windows-38.yaml

+1-1
@@ -15,7 +15,7 @@ dependencies:
   # pandas dependencies
   - blosc
   - bottleneck
-  - fastparquet>=0.4.0
+  - fastparquet>=0.4.0, <0.7.0
   - flask
   - fsspec>=0.8.0, <2021.6.0
   - matplotlib=3.3.2

doc/source/development/contributing_environment.rst

+1-1
@@ -72,7 +72,7 @@ These packages will automatically be installed by using the ``pandas``
 
 **Windows**
 
-You will need `Build Tools for Visual Studio 2017
+You will need `Build Tools for Visual Studio 2019
 <https://visualstudio.microsoft.com/downloads/>`_.
 
 .. warning::

doc/source/ecosystem.rst

+6
@@ -445,6 +445,12 @@ provides a familiar ``DataFrame`` interface for out-of-core, parallel and distri
 
 Dask-ML enables parallel and distributed machine learning using Dask alongside existing machine learning libraries like Scikit-Learn, XGBoost, and TensorFlow.
 
+`Ibis <https://ibis-project.org/docs/>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Ibis offers a standard way to write analytics code, that can be run in multiple engines. It helps in bridging the gap between local Python environments (like pandas) and remote storage and execution systems like Hadoop components (like HDFS, Impala, Hive, Spark) and SQL databases (Postgres, etc.).
+
+
 `Koalas <https://koalas.readthedocs.io/en/latest/>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
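
For readers unfamiliar with Ibis (the ecosystem entry added in the hunk above), here is a minimal sketch of the deferred, engine-agnostic expressions it enables. This example is not part of this commit; it assumes ibis-framework with the pandas backend is installed, and the connection API has varied across Ibis versions.

import ibis
import pandas as pd

# Register a small pandas DataFrame with Ibis' pandas backend (assumed entry point).
df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})
con = ibis.pandas.connect({"t": df})

# Expressions are built lazily; nothing is computed until .execute().
t = con.table("t")
expr = t.group_by("group").aggregate(total=t.value.sum())

# The same expression could target another engine (e.g. a SQL database) by
# swapping the connection; here it executes against pandas and returns a DataFrame.
print(expr.execute())

The point of the entry is portability: the expression above is engine-agnostic, which is what "bridging the gap" between local Python environments and remote execution systems refers to.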

doc/source/getting_started/tutorials.rst

+13
@@ -18,6 +18,19 @@ entails.
 For the table of contents, see the `pandas-cookbook GitHub
 repository <https://github.com/jvns/pandas-cookbook>`_.
 
+pandas workshop by Stefanie Molin
+---------------------------------
+
+An introductory workshop by `Stefanie Molin <https://github.com/stefmolin>`_
+designed to quickly get you up to speed with pandas using real-world datasets.
+It covers getting started with pandas, data wrangling, and data visualization
+(with some exposure to matplotlib and seaborn). The
+`pandas-workshop GitHub repository <https://github.com/stefmolin/pandas-workshop>`_
+features detailed environment setup instructions (including a Binder environment),
+slides and notebooks for following along, and exercises to practice the concepts.
+There is also a lab with new exercises on a dataset not covered in the workshop for
+additional practice.
+
 Learn pandas by Hernan Rojas
 ----------------------------
 
doc/source/user_guide/boolean.rst

+5
@@ -12,6 +12,11 @@
 Nullable Boolean data type
 **************************
 
+.. note::
+
+   BooleanArray is currently experimental. Its API or implementation may
+   change without warning.
+
 .. versionadded:: 1.0.0
 
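
Since the note added above only states that BooleanArray is experimental, a short illustration of the nullable boolean dtype it refers to may help. This is a sketch using documented pandas behavior, not part of this commit.

import pandas as pd

# "boolean" (lowercase) selects the nullable, BooleanArray-backed dtype;
# missing values are represented by pd.NA rather than float NaN.
s = pd.Series([True, False, None], dtype="boolean")
print(s.dtype)   # boolean
print(s[2])      # <NA>

# Logical operations follow three-valued (Kleene) logic:
# True | NA is True, False | NA is NA, False & NA is False.
print(s | pd.NA)
print(s & pd.NA)

Because the dtype is experimental (as the note warns), details of these semantics could still change in future releases.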

doc/source/user_guide/categorical.rst

+4-4
@@ -777,8 +777,8 @@ value is included in the ``categories``:
     df
     try:
         df.iloc[2:4, :] = [["c", 3], ["c", 3]]
-    except ValueError as e:
-        print("ValueError:", str(e))
+    except TypeError as e:
+        print("TypeError:", str(e))
 
 Setting values by assigning categorical data will also check that the ``categories`` match:
 
@@ -788,8 +788,8 @@ Setting values by assigning categorical data will also check that the ``categori
     df
     try:
         df.loc["j":"k", "cats"] = pd.Categorical(["b", "b"], categories=["a", "b", "c"])
-    except ValueError as e:
-        print("ValueError:", str(e))
+    except TypeError as e:
+        print("TypeError:", str(e))
 
 Assigning a ``Categorical`` to parts of a column of other types will use the values:

doc/source/user_guide/cookbook.rst

+1-1
@@ -1300,7 +1300,7 @@ is closed.
 
 .. ipython:: python
 
-    store = pd.HDFStore("test.h5", "w", diver="H5FD_CORE")
+    store = pd.HDFStore("test.h5", "w", driver="H5FD_CORE")
 
     df = pd.DataFrame(np.random.randn(8, 3))
     store["test"] = df
