Commit 756abdb

Merge branch 'pandas-dev:master' into fillna-other-missing-values-not-modified
Author: Mike Phung
2 parents 0af57f1 + 6636b36 commit 756abdb

278 files changed, +7869 -4314 lines changed

.circleci/config.yml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+version: 2.1
+
+jobs:
+  test-arm:
+    machine:
+      image: ubuntu-2004:202101-01
+    resource_class: arm.medium
+    environment:
+      ENV_FILE: ci/deps/circle-38-arm64.yaml
+      PYTEST_WORKERS: auto
+      PATTERN: "not slow and not network and not clipboard and not arm_slow"
+      PYTEST_TARGET: "pandas"
+    steps:
+      - checkout
+      - run: ci/setup_env.sh
+      - run: PATH=$HOME/miniconda3/envs/pandas-dev/bin:$HOME/miniconda3/condabin:$PATH ci/run_tests.sh
+
+workflows:
+  test:
+    jobs:
+      - test-arm

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 - [ ] closes #xxxx
 - [ ] tests added / passed
-- [ ] Ensure all linting tests pass, see [here](https://pandas.pydata.org/pandas-docs/dev/development/contributing.html#code-standards) for how to run them
+- [ ] Ensure all linting tests pass, see [here](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit) for how to run them
 - [ ] whatsnew entry

.github/workflows/ci.yml

Lines changed: 10 additions & 0 deletions
@@ -45,6 +45,15 @@ jobs:
         environment-file: ${{ env.ENV_FILE }}
         use-only-tar-bz2: true
 
+    - name: Install node.js (for pyright)
+      uses: actions/setup-node@v2
+      with:
+        node-version: "16"
+
+    - name: Install pyright
+      # note: keep version in sync with .pre-commit-config.yaml
+      run: npm install -g [email protected]
+
     - name: Build Pandas
       uses: ./.github/actions/build_pandas
 
@@ -168,6 +177,7 @@ jobs:
         PANDAS_DATA_MANAGER: array
         PATTERN: ${{ matrix.pattern }}
         PYTEST_WORKERS: "auto"
+        PYTEST_TARGET: pandas
       run: |
         source activate pandas-dev
         ci/run_tests.sh

.github/workflows/database.yml

Lines changed: 2 additions & 1 deletion
@@ -4,6 +4,7 @@ on:
   push:
     branches:
       - master
+      - 1.3.x
   pull_request:
     branches:
       - master
@@ -79,7 +80,7 @@ jobs:
     - uses: conda-incubator/setup-miniconda@v2
       with:
         activate-environment: pandas-dev
-        channel-priority: flexible
+        channel-priority: strict
         environment-file: ${{ matrix.ENV_FILE }}
         use-only-tar-bz2: true
 

.github/workflows/posix.yml

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ jobs:
       LC_ALL: ${{ matrix.settings[4] }}
       PANDAS_TESTING_MODE: ${{ matrix.settings[5] }}
       TEST_ARGS: ${{ matrix.settings[6] }}
+      PYTEST_TARGET: pandas
     concurrency:
       group: ${{ github.ref }}-${{ matrix.settings[0] }}
       cancel-in-progress: ${{github.event_name == 'pull_request'}}

.github/workflows/pre-commit.yml

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@ jobs:
     concurrency:
       group: ${{ github.ref }}-pre-commit
       cancel-in-progress: ${{github.event_name == 'pull_request'}}
+    env:
+      SKIP: pyright
     steps:
     - uses: actions/checkout@v2
     - uses: actions/setup-python@v2

.github/workflows/python-dev.yml

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ env:
   PANDAS_CI: 1
   PATTERN: "not slow and not network and not clipboard"
   COVERAGE: true
+  PYTEST_TARGET: pandas
 
 jobs:
   build:

.pre-commit-config.yaml

Lines changed: 10 additions & 0 deletions
@@ -81,6 +81,16 @@ repos:
     - flake8-comprehensions==3.1.0
     - flake8-bugbear==21.3.2
     - pandas-dev-flaker==0.2.0
+- repo: local
+  hooks:
+  - id: pyright
+    name: pyright
+    entry: pyright
+    language: node
+    pass_filenames: false
+    types: [python]
+    # note: keep version in sync with .github/workflows/ci.yml
+    additional_dependencies: ['[email protected]']
 - repo: local
   hooks:
   - id: flake8-rst

README.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 [![License](https://img.shields.io/pypi/l/pandas.svg)](https://github.com/pandas-dev/pandas/blob/master/LICENSE)
 [![Azure Build Status](https://dev.azure.com/pandas-dev/pandas/_apis/build/status/pandas-dev.pandas?branch=master)](https://dev.azure.com/pandas-dev/pandas/_build/latest?definitionId=1&branch=master)
 [![Coverage](https://codecov.io/github/pandas-dev/pandas/coverage.svg?branch=master)](https://codecov.io/gh/pandas-dev/pandas)
-[![Downloads](https://anaconda.org/conda-forge/pandas/badges/downloads.svg)](https://pandas.pydata.org)
+[![Downloads](https://static.pepy.tech/personalized-badge/pandas?period=month&units=international_system&left_color=black&right_color=orange&left_text=PyPI%20downloads%20per%20month)](https://pepy.tech/project/pandas)
 [![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas)
 [![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://numfocus.org)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

asv_bench/benchmarks/algorithms.py

Lines changed: 6 additions & 6 deletions
@@ -44,9 +44,9 @@ def setup(self, unique, sort, dtype):
             raise NotImplementedError
 
         data = {
-            "int": pd.Int64Index(np.arange(N)),
-            "uint": pd.UInt64Index(np.arange(N)),
-            "float": pd.Float64Index(np.random.randn(N)),
+            "int": pd.Index(np.arange(N), dtype="int64"),
+            "uint": pd.Index(np.arange(N), dtype="uint64"),
+            "float": pd.Index(np.random.randn(N), dtype="float64"),
             "object": string_index,
             "datetime64[ns]": pd.date_range("2011-01-01", freq="H", periods=N),
             "datetime64[ns, tz]": pd.date_range(
@@ -76,9 +76,9 @@ class Duplicated:
     def setup(self, unique, keep, dtype):
         N = 10 ** 5
         data = {
-            "int": pd.Int64Index(np.arange(N)),
-            "uint": pd.UInt64Index(np.arange(N)),
-            "float": pd.Float64Index(np.random.randn(N)),
+            "int": pd.Index(np.arange(N), dtype="int64"),
+            "uint": pd.Index(np.arange(N), dtype="uint64"),
+            "float": pd.Index(np.random.randn(N), dtype="float64"),
             "string": tm.makeStringIndex(N),
             "datetime64[ns]": pd.date_range("2011-01-01", freq="H", periods=N),
             "datetime64[ns, tz]": pd.date_range(
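
Note on the hunks above: the benchmark data now builds its numeric indexes through the generic `pd.Index` constructor with an explicit `dtype` instead of the `Int64Index`, `UInt64Index`, and `Float64Index` subclasses. A minimal sketch of that pattern outside the benchmark follows; it is not part of the commit, and the small `N` is a hypothetical size chosen for illustration.

```python
import numpy as np
import pandas as pd

N = 1_000  # hypothetical size; the benchmarks define their own N

# Build each numeric index through pd.Index with an explicit dtype,
# mirroring the replacement lines in the diff.
int_idx = pd.Index(np.arange(N), dtype="int64")
uint_idx = pd.Index(np.arange(N), dtype="uint64")
float_idx = pd.Index(np.random.randn(N), dtype="float64")

print(int_idx.dtype, uint_idx.dtype, float_idx.dtype)
```

The requested dtype is preserved on the resulting index, so the benchmark keys ("int", "uint", "float") still map to the same underlying data types.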

asv_bench/benchmarks/groupby.py

Lines changed: 43 additions & 1 deletion
@@ -454,6 +454,16 @@ def setup(self, dtype, method, application, ncols):
             # DataFrameGroupBy doesn't have these methods
             raise NotImplementedError
 
+        if application == "transformation" and method in [
+            "head",
+            "tail",
+            "unique",
+            "value_counts",
+            "size",
+        ]:
+            # DataFrameGroupBy doesn't have these methods
+            raise NotImplementedError
+
         ngroups = 1000
         size = ngroups * 2
         rng = np.arange(ngroups).reshape(-1, 1)
@@ -480,7 +490,7 @@ def setup(self, dtype, method, application, ncols):
         if len(cols) == 1:
             cols = cols[0]
 
-        if application == "transform":
+        if application == "transformation":
             if method == "describe":
                 raise NotImplementedError
 
@@ -593,6 +603,38 @@ def time_sum(self):
         self.df.groupby(["a"])["b"].sum()
 
 
+class String:
+    # GH#41596
+    param_names = ["dtype", "method"]
+    params = [
+        ["str", "string[python]"],
+        [
+            "sum",
+            "prod",
+            "min",
+            "max",
+            "mean",
+            "median",
+            "var",
+            "first",
+            "last",
+            "any",
+            "all",
+        ],
+    ]
+
+    def setup(self, dtype, method):
+        cols = list("abcdefghjkl")
+        self.df = DataFrame(
+            np.random.randint(0, 100, size=(1_000_000, len(cols))),
+            columns=cols,
+            dtype=dtype,
+        )
+
+    def time_str_func(self, dtype, method):
+        self.df.groupby("a")[self.df.columns[1:]].agg(method)
+
+
 class Categories:
     def setup(self):
         N = 10 ** 5

asv_bench/benchmarks/indexing_engines.py

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ def setup(self, engine_and_dtype, index_type):
             "non_monotonic": np.array([1, 2, 3] * N, dtype=dtype),
         }[index_type]
 
-        self.data = engine(lambda: arr, len(arr))
+        self.data = engine(arr)
         # code belows avoids populating the mapping etc. while timing.
         self.data.get_loc(2)
 
@@ -70,7 +70,7 @@ def setup(self, index_type):
             "non_monotonic": np.array(list("abc") * N, dtype=object),
         }[index_type]
 
-        self.data = libindex.ObjectEngine(lambda: arr, len(arr))
+        self.data = libindex.ObjectEngine(arr)
         # code belows avoids populating the mapping etc. while timing.
         self.data.get_loc("b")
 

asv_bench/benchmarks/inference.py

Lines changed: 5 additions & 0 deletions
@@ -173,6 +173,7 @@ def setup(self):
         self.strings_tz_space = [
             x.strftime("%Y-%m-%d %H:%M:%S") + " -0800" for x in rng
         ]
+        self.strings_zero_tz = [x.strftime("%Y-%m-%d %H:%M:%S") + "Z" for x in rng]
 
     def time_iso8601(self):
         to_datetime(self.strings)
@@ -189,6 +190,10 @@ def time_iso8601_format_no_sep(self):
     def time_iso8601_tz_spaceformat(self):
         to_datetime(self.strings_tz_space)
 
+    def time_iso8601_infer_zero_tz_fromat(self):
+        # GH 41047
+        to_datetime(self.strings_zero_tz, infer_datetime_format=True)
+
 
 class ToDatetimeNONISO8601:
     def setup(self):
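
Note on the `strings_zero_tz` benchmark above: it times `to_datetime` on ISO 8601 strings that end in a literal `Z` (UTC). A rough sketch of that input and the parsed result follows. The three-day `rng` is a hypothetical stand-in for the benchmark's own range, and `infer_datetime_format=True` is left out here because that argument is deprecated in pandas 2.0 and later.

```python
import pandas as pd

# Hypothetical small date range; the benchmark uses its own `rng`.
rng = pd.date_range("2021-01-01", periods=3, freq="D")

# Same construction as the added benchmark line: ISO 8601 text plus a "Z" suffix.
strings_zero_tz = [x.strftime("%Y-%m-%d %H:%M:%S") + "Z" for x in rng]

# Parsing the "Z" suffix yields a timezone-aware (UTC) DatetimeIndex.
parsed = pd.to_datetime(strings_zero_tz)
print(parsed.tz)
```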

asv_bench/benchmarks/io/style.py

Lines changed: 28 additions & 0 deletions
@@ -34,6 +34,14 @@ def peakmem_classes_render(self, cols, rows):
         self._style_classes()
         self.st._render_html(True, True)
 
+    def time_tooltips_render(self, cols, rows):
+        self._style_tooltips()
+        self.st._render_html(True, True)
+
+    def peakmem_tooltips_render(self, cols, rows):
+        self._style_tooltips()
+        self.st._render_html(True, True)
+
     def time_format_render(self, cols, rows):
         self._style_format()
         self.st._render_html(True, True)
@@ -42,6 +50,14 @@ def peakmem_format_render(self, cols, rows):
         self._style_format()
         self.st._render_html(True, True)
 
+    def time_apply_format_hide_render(self, cols, rows):
+        self._style_apply_format_hide()
+        self.st._render_html(True, True)
+
+    def peakmem_apply_format_hide_render(self, cols, rows):
+        self._style_apply_format_hide()
+        self.st._render_html(True, True)
+
     def _style_apply(self):
         def _apply_func(s):
             return [
@@ -63,3 +79,15 @@ def _style_format(self):
         self.st = self.df.style.format(
             "{:,.3f}", subset=IndexSlice["row_1":f"row_{ir}", "float_1":f"float_{ic}"]
         )
+
+    def _style_apply_format_hide(self):
+        self.st = self.df.style.applymap(lambda v: "color: red;")
+        self.st.format("{:.3f}")
+        self.st.hide_index(self.st.index[1:])
+        self.st.hide_columns(self.st.columns[1:])
+
+    def _style_tooltips(self):
+        ttips = DataFrame("abc", index=self.df.index[::2], columns=self.df.columns[::2])
+        self.st = self.df.style.set_tooltips(ttips)
+        self.st.hide_index(self.st.index[12:])
+        self.st.hide_columns(self.st.columns[12:])

asv_bench/benchmarks/rolling.py

Lines changed: 27 additions & 0 deletions
@@ -180,6 +180,33 @@ def time_quantile(self, constructor, window, dtype, percentile, interpolation):
         self.roll.quantile(percentile, interpolation=interpolation)
 
 
+class Rank:
+    params = (
+        ["DataFrame", "Series"],
+        [10, 1000],
+        ["int", "float"],
+        [True, False],
+        [True, False],
+        ["min", "max", "average"],
+    )
+    param_names = [
+        "constructor",
+        "window",
+        "dtype",
+        "percentile",
+        "ascending",
+        "method",
+    ]
+
+    def setup(self, constructor, window, dtype, percentile, ascending, method):
+        N = 10 ** 5
+        arr = np.random.random(N).astype(dtype)
+        self.roll = getattr(pd, constructor)(arr).rolling(window)
+
+    def time_rank(self, constructor, window, dtype, percentile, ascending, method):
+        self.roll.rank(pct=percentile, ascending=ascending, method=method)
+
+
 class PeakMemFixedWindowMinMax:
 
     params = ["min", "max"]
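
Note on the `Rank` benchmark above: it times `rolling(window).rank(...)`, passing the benchmark's `percentile` parameter as the `pct` argument. A small sketch of what that call computes follows, assuming a pandas version that provides `Rolling.rank` (1.4 or later). The sample values are illustrative and not taken from the commit.

```python
import pandas as pd

s = pd.Series([1, 4, 2, 3, 5, 3])

# Each output value is the rank of the window's last element
# among the values in that window (ties averaged by default).
ranks = s.rolling(3).rank()

# The same keyword arguments the benchmark varies.
ranks_min_desc = s.rolling(3).rank(pct=False, ascending=False, method="min")

print(ranks.tolist())
```

The first two positions are NaN because a window of 3 is not yet full there.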

asv_bench/benchmarks/series_methods.py

Lines changed: 13 additions & 0 deletions
@@ -28,6 +28,19 @@ def time_constructor(self, data):
         Series(data=self.data, index=self.idx)
 
 
+class ToFrame:
+    params = [["int64", "datetime64[ns]", "category", "Int64"], [None, "foo"]]
+    param_names = ["dtype", "name"]
+
+    def setup(self, dtype, name):
+        arr = np.arange(10 ** 5)
+        ser = Series(arr, dtype=dtype)
+        self.ser = ser
+
+    def time_to_frame(self, dtype, name):
+        self.ser.to_frame(name)
+
+
 class NSort:
 
     params = ["first", "last", "all"]
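
Note on the `ToFrame` benchmark above: it times `Series.to_frame` across several dtypes, with and without a column name. A minimal sketch of the call being measured follows; the short array and the `"Int64"` dtype are illustrative picks from the benchmark's parameter list, not the commit's own data.

```python
import numpy as np
import pandas as pd

# Small stand-in for the benchmark's 10 ** 5 element array.
ser = pd.Series(np.arange(5), dtype="Int64")

# With a name, the single resulting column takes that name.
named = ser.to_frame("foo")

# Without a name, an unnamed Series yields a column labelled 0.
unnamed = ser.to_frame()

print(list(named.columns), list(unnamed.columns))
```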
