Commit ca60804

Merge remote-tracking branch 'upstream/master' into bug/categorical-indexing-1row-df

* upstream/master: (185 commits)
  ENH: add BooleanArray extension array (pandas-dev#29555)
  DOC: Add link to dev calendar and meeting notes (pandas-dev#29737)
  ENH: Add built-in function for Styler to format the text displayed for missing values (pandas-dev#29118)
  DEPR: remove statsmodels/seaborn compat shims (pandas-dev#29822)
  DEPR: remove Index.summary (pandas-dev#29807)
  DEPR: passing an int to read_excel use_cols (pandas-dev#29795)
  STY: fstrings in io.pytables (pandas-dev#29758)
  BUG: Fix melt with mixed int/str columns (pandas-dev#29792)
  TST: add test for ffill/bfill for non unique multilevel (pandas-dev#29763)
  Changed description of parse_dates in read_excel(). (pandas-dev#29796)
  BUG: pivot_table not returning correct type when margin=True and aggfunc='mean' (pandas-dev#28248)
  REF: Create _lib/window directory (pandas-dev#29817)
  Fixed small mistake (pandas-dev#29815)
  minor cleanups (pandas-dev#29798)
  DEPR: enforce deprecations in core.internals (pandas-dev#29723)
  add test for unused level raises KeyError (pandas-dev#29760)
  Add documentation linking to sqlalchemy (pandas-dev#29373)
  io/parsers: ensure decimal is str on PythonParser (pandas-dev#29743)
  Reenabled no-unused-function (pandas-dev#29767)
  CLN: F-string in pandas/_libs/tslibs/*.pyx (pandas-dev#29775)
  ...

# Conflicts:
#	pandas/tests/frame/indexing/test_indexing.py

2 parents 3e847e9 + 7d7f885; commit ca60804

File tree

522 files changed (+9473 / -7183 lines)


.github/FUNDING.yml (+1)

@@ -1,2 +1,3 @@
 custom: https://pandas.pydata.org/donate.html
+github: [numfocus]
 tidelift: pypi/pandas

.github/workflows/assign.yml (+15)

@@ -0,0 +1,15 @@
+name: Assign
+on:
+  issue_comment:
+    types: created
+
+jobs:
+  one:
+    runs-on: ubuntu-latest
+    steps:
+      - name:
+        run: |
+          if [[ "${{ github.event.comment.body }}" == "take" ]]; then
+            echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}"
+            curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
+          fi
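The new workflow assigns an issue to whoever comments exactly "take" on it. A minimal stand-in for that trigger condition (the function name is illustrative, not part of the workflow; bash's quoted `[[ == ]]` comparison is an exact string match, so comments that merely contain the word do not trigger assignment):

```shell
# Hypothetical re-creation of the workflow's comment check.
comment_triggers_assign() {
  local body="$1"
  if [[ "$body" == "take" ]]; then
    echo "assign"
  else
    echo "ignore"
  fi
}

comment_triggers_assign "take"              # -> assign
comment_triggers_assign "please take this"  # -> ignore
```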

.github/workflows/ci.yml (+103)

@@ -0,0 +1,103 @@
+name: CI
+
+on:
+  push:
+    branches: master
+  pull_request:
+    branches: master
+
+env:
+  ENV_FILE: environment.yml
+  # TODO: remove export PATH=... in each step once this works
+  # PATH: $HOME/miniconda3/bin:$PATH
+
+jobs:
+  checks:
+    name: Checks
+    runs-on: ubuntu-latest
+    steps:
+
+    - name: Checkout
+      uses: actions/checkout@v1
+
+    - name: Looking for unwanted patterns
+      run: ci/code_checks.sh patterns
+      if: true
+
+    - name: Setup environment and build pandas
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        ci/setup_env.sh
+      if: true
+
+    - name: Linting
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh lint
+      if: true
+
+    - name: Dependencies consistency
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh dependencies
+      if: true
+
+    - name: Checks on imported code
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh code
+      if: true
+
+    - name: Running doctests
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh doctests
+      if: true
+
+    - name: Docstring validation
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh docstrings
+      if: true
+
+    - name: Typing validation
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        ci/code_checks.sh typing
+      if: true
+
+    - name: Testing docstring validation script
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        pytest --capture=no --strict scripts
+      if: true
+
+    - name: Running benchmarks
+      run: |
+        export PATH=$HOME/miniconda3/bin:$PATH
+        source activate pandas-dev
+        cd asv_bench
+        asv check -E existing
+        git remote add upstream https://github.com/pandas-dev/pandas.git
+        git fetch upstream
+        if git diff upstream/master --name-only | grep -q "^asv_bench/"; then
+          asv machine --yes
+          ASV_OUTPUT="$(asv dev)"
+          if [[ $(echo "$ASV_OUTPUT" | grep "failed") ]]; then
+            echo "##vso[task.logissue type=error]Benchmarks run with errors"
+            echo "$ASV_OUTPUT"
+            exit 1
+          else
+            echo "Benchmarks run without errors"
+          fi
+        else
+          echo "Benchmarks did not run, no changes detected"
+        fi
+      if: true
.pre-commit-config.yaml (+2 -2)

@@ -1,6 +1,6 @@
 repos:
 - repo: https://github.com/python/black
-  rev: stable
+  rev: 19.10b0
   hooks:
   - id: black
     language_version: python3.7
@@ -9,7 +9,7 @@ repos:
   hooks:
   - id: flake8
     language: python_venv
-    additional_dependencies: [flake8-comprehensions]
+    additional_dependencies: [flake8-comprehensions>=3.1.0]
 - repo: https://github.com/pre-commit/mirrors-isort
   rev: v4.3.20
   hooks:

.travis.yml (+3 -14)

@@ -30,11 +30,9 @@ matrix:
   - python: 3.5

   include:
-    - dist: bionic
-      # 18.04
-      python: 3.8.0
+    - dist: trusty
       env:
-        - JOB="3.8-dev" PATTERN="(not slow and not network)"
+        - JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network)"

     - dist: trusty
       env:
@@ -85,19 +83,10 @@ install:
   - ci/submit_cython_cache.sh
   - echo "install done"

-
-before_script:
-  # display server (for clipboard functionality) needs to be started here,
-  # does not work if done in install:setup_env.sh (GH-26103)
-  - export DISPLAY=":99.0"
-  - echo "sh -e /etc/init.d/xvfb start"
-  - if [ "$JOB" != "3.8-dev" ]; then sh -e /etc/init.d/xvfb start; fi
-  - sleep 3
-
 script:
   - echo "script start"
   - echo "$JOB"
-  - if [ "$JOB" != "3.8-dev" ]; then source activate pandas-dev; fi
+  - source activate pandas-dev
   - ci/run_tests.sh

 after_script:

Makefile (+1 -1)

@@ -15,7 +15,7 @@ lint-diff:
 	git diff upstream/master --name-only -- "*.py" | xargs flake8

 black:
-	black . --exclude '(asv_bench/env|\.egg|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|_build|buck-out|build|dist|setup.py)'
+	black .

 develop: build
 	python -m pip install --no-build-isolation -e .

asv_bench/benchmarks/categoricals.py (+6 -6)

@@ -84,7 +84,7 @@ class ValueCounts:

     def setup(self, dropna):
         n = 5 * 10 ** 5
-        arr = ["s{:04d}".format(i) for i in np.random.randint(0, n // 10, size=n)]
+        arr = [f"s{i:04d}" for i in np.random.randint(0, n // 10, size=n)]
         self.ts = pd.Series(arr).astype("category")

     def time_value_counts(self, dropna):
@@ -102,7 +102,7 @@ def time_rendering(self):
 class SetCategories:
     def setup(self):
         n = 5 * 10 ** 5
-        arr = ["s{:04d}".format(i) for i in np.random.randint(0, n // 10, size=n)]
+        arr = [f"s{i:04d}" for i in np.random.randint(0, n // 10, size=n)]
         self.ts = pd.Series(arr).astype("category")

     def time_set_categories(self):
@@ -112,7 +112,7 @@ def time_set_categories(self):
 class RemoveCategories:
     def setup(self):
         n = 5 * 10 ** 5
-        arr = ["s{:04d}".format(i) for i in np.random.randint(0, n // 10, size=n)]
+        arr = [f"s{i:04d}" for i in np.random.randint(0, n // 10, size=n)]
         self.ts = pd.Series(arr).astype("category")

     def time_remove_categories(self):
@@ -164,9 +164,9 @@ def setup(self, dtype):
         np.random.seed(1234)
         n = 5 * 10 ** 5
         sample_size = 100
-        arr = [i for i in np.random.randint(0, n // 10, size=n)]
+        arr = list(np.random.randint(0, n // 10, size=n))
         if dtype == "object":
-            arr = ["s{:04d}".format(i) for i in arr]
+            arr = [f"s{i:04d}" for i in arr]
         self.sample = np.random.choice(arr, sample_size)
         self.series = pd.Series(arr).astype("category")

@@ -225,7 +225,7 @@ def setup(self, index):
         elif index == "non_monotonic":
             self.data = pd.Categorical.from_codes([0, 1, 2] * N, categories=categories)
         else:
-            raise ValueError("Invalid index param: {}".format(index))
+            raise ValueError(f"Invalid index param: {index}")

         self.scalar = 10000
         self.list = list(range(10000))
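Most hunks in this file (and the `io/` benchmarks below) are mechanical `str.format` → f-string conversions; format specs such as `:04d` carry over unchanged, so the rewrites are behavior-preserving. A quick equivalence check with illustrative values:

```python
# The same format spec works in both styles.
i = 42
assert "s{:04d}".format(i) == f"s{i:04d}" == "s0042"

# Plain substitution is likewise identical.
index = "bad"
assert "Invalid index param: {}".format(index) == f"Invalid index param: {index}"
```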

asv_bench/benchmarks/frame_ctor.py (+1 -1)

@@ -99,7 +99,7 @@ class FromLists:
     def setup(self):
         N = 1000
         M = 100
-        self.data = [[j for j in range(M)] for i in range(N)]
+        self.data = [list(range(M)) for i in range(N)]

     def time_frame_from_lists(self):
         self.df = DataFrame(self.data)
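`[j for j in range(M)]` and `list(range(M))` build the same list, so this hunk just drops a redundant comprehension; the outer comprehension is kept because it must produce N independent lists:

```python
M = 100
assert [j for j in range(M)] == list(range(M))

# The outer comprehension still matters: each row is a distinct list object.
N = 3
data = [list(range(M)) for i in range(N)]
assert data[0] == data[1] and data[0] is not data[1]
```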

asv_bench/benchmarks/gil.py (+3 -5)

@@ -37,7 +37,7 @@ def wrapper(fname):
         return wrapper


-from .pandas_vb_common import BaseIO  # noqa: E402 isort:skip
+from .pandas_vb_common import BaseIO  # isort:skip


 class ParallelGroupbyMethods:
@@ -250,13 +250,11 @@ def setup(self, dtype):
                 np.random.randn(rows, cols), index=date_range("1/1/2000", periods=rows)
             ),
             "object": DataFrame(
-                "foo",
-                index=range(rows),
-                columns=["object%03d".format(i) for i in range(5)],
+                "foo", index=range(rows), columns=["object%03d" for _ in range(5)]
             ),
         }

-        self.fname = "__test_{}__.csv".format(dtype)
+        self.fname = f"__test_{dtype}__.csv"
         df = data[dtype]
         df.to_csv(self.fname)
asv_bench/benchmarks/index_object.py (+1 -1)

@@ -146,7 +146,7 @@ class Indexing:

     def setup(self, dtype):
         N = 10 ** 6
-        self.idx = getattr(tm, "make{}Index".format(dtype))(N)
+        self.idx = getattr(tm, f"make{dtype}Index")(N)
         self.array_mask = (np.arange(N) % 3) == 0
         self.series_mask = Series(self.array_mask)
         self.sorted = self.idx.sort_values()
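The `getattr` call above builds a factory name such as `makeIntIndex` from the `dtype` parameter and then calls it. A self-contained sketch with a hypothetical stand-in for the `tm` (`pandas.util.testing`) module:

```python
class TM:
    # Hypothetical stand-ins for tm.make<dtype>Index factories.
    @staticmethod
    def makeIntIndex(n):
        return list(range(n))

    @staticmethod
    def makeStringIndex(n):
        return [f"s{i}" for i in range(n)]

tm = TM()
dtype = "Int"
# The f-string composes the attribute name; getattr looks it up dynamically.
assert getattr(tm, f"make{dtype}Index")(3) == [0, 1, 2]
assert getattr(tm, "makeStringIndex")(2) == ["s0", "s1"]
```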

asv_bench/benchmarks/io/csv.py (+5 -7)

@@ -132,7 +132,7 @@ class ReadCSVConcatDatetimeBadDateValue(StringIORewind):
     param_names = ["bad_date_value"]

     def setup(self, bad_date_value):
-        self.StringIO_input = StringIO(("%s,\n" % bad_date_value) * 50000)
+        self.StringIO_input = StringIO((f"{bad_date_value},\n") * 50000)

     def time_read_csv(self, bad_date_value):
         read_csv(
@@ -202,7 +202,7 @@ def setup(self, sep, thousands):
         data = np.random.randn(N, K) * np.random.randint(100, 10000, (N, K))
         df = DataFrame(data)
         if thousands is not None:
-            fmt = ":{}".format(thousands)
+            fmt = f":{thousands}"
             fmt = "{" + fmt + "}"
             df = df.applymap(lambda x: fmt.format(x))
         df.to_csv(self.fname, sep=sep)
@@ -231,7 +231,7 @@ def setup(self, sep, decimal, float_precision):
         floats = [
             "".join(random.choice(string.digits) for _ in range(28)) for _ in range(15)
         ]
-        rows = sep.join(["0{}".format(decimal) + "{}"] * 3) + "\n"
+        rows = sep.join([f"0{decimal}" + "{}"] * 3) + "\n"
         data = rows * 5
         data = data.format(*floats) * 200  # 1000 x 3 strings csv
         self.StringIO_input = StringIO(data)
@@ -309,9 +309,7 @@ class ReadCSVCachedParseDates(StringIORewind):
     param_names = ["do_cache"]

     def setup(self, do_cache):
-        data = (
-            "\n".join("10/{}".format(year) for year in range(2000, 2100)) + "\n"
-        ) * 10
+        data = ("\n".join(f"10/{year}" for year in range(2000, 2100)) + "\n") * 10
         self.StringIO_input = StringIO(data)

     def time_read_csv_cached(self, do_cache):
@@ -336,7 +334,7 @@ class ReadCSVMemoryGrowth(BaseIO):
     def setup(self):
         with open(self.fname, "w") as f:
             for i in range(self.num_rows):
-                f.write("{i}\n".format(i=i))
+                f.write(f"{i}\n")

     def mem_parser_chunks(self):
         # see gh-24805.
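The `thousands` hunk above builds a format template in two steps: first the spec body (e.g. `:,`), then the surrounding braces, yielding a complete `str.format` template that is applied per cell. A worked example with an illustrative value:

```python
thousands = ","
fmt = f":{thousands}"    # ":," - the format spec body
fmt = "{" + fmt + "}"    # "{:,}" - a complete format template
assert fmt == "{:,}"

# Applying the template inserts the thousands separator.
assert fmt.format(1234567) == "1,234,567"
```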

asv_bench/benchmarks/io/excel.py (+1 -1)

@@ -14,7 +14,7 @@ def _generate_dataframe():
     C = 5
     df = DataFrame(
         np.random.randn(N, C),
-        columns=["float{}".format(i) for i in range(C)],
+        columns=[f"float{i}" for i in range(C)],
         index=date_range("20000101", periods=N, freq="H"),
     )
     df["object"] = tm.makeStringIndex(N)

asv_bench/benchmarks/io/hdf.py (+1 -1)

@@ -115,7 +115,7 @@ def setup(self, format):
         C = 5
         self.df = DataFrame(
             np.random.randn(N, C),
-            columns=["float{}".format(i) for i in range(C)],
+            columns=[f"float{i}" for i in range(C)],
             index=date_range("20000101", periods=N, freq="H"),
         )
         self.df["object"] = tm.makeStringIndex(N)

asv_bench/benchmarks/io/json.py (+2 -2)

@@ -20,7 +20,7 @@ def setup(self, orient, index):
         }
         df = DataFrame(
             np.random.randn(N, 5),
-            columns=["float_{}".format(i) for i in range(5)],
+            columns=[f"float_{i}" for i in range(5)],
             index=indexes[index],
         )
         df.to_json(self.fname, orient=orient)
@@ -43,7 +43,7 @@ def setup(self, index):
         }
         df = DataFrame(
             np.random.randn(N, 5),
-            columns=["float_{}".format(i) for i in range(5)],
+            columns=[f"float_{i}" for i in range(5)],
             index=indexes[index],
         )
         df.to_json(self.fname, orient="records", lines=True)

asv_bench/benchmarks/io/msgpack.py (+1 -1)

@@ -15,7 +15,7 @@ def setup(self):
         C = 5
         self.df = DataFrame(
             np.random.randn(N, C),
-            columns=["float{}".format(i) for i in range(C)],
+            columns=[f"float{i}" for i in range(C)],
             index=date_range("20000101", periods=N, freq="H"),
         )
         self.df["object"] = tm.makeStringIndex(N)

asv_bench/benchmarks/io/pickle.py (+1 -1)

@@ -13,7 +13,7 @@ def setup(self):
         C = 5
         self.df = DataFrame(
             np.random.randn(N, C),
-            columns=["float{}".format(i) for i in range(C)],
+            columns=[f"float{i}" for i in range(C)],
             index=date_range("20000101", periods=N, freq="H"),
         )
         self.df["object"] = tm.makeStringIndex(N)

asv_bench/benchmarks/io/sql.py (+2 -2)

@@ -19,7 +19,7 @@ def setup(self, connection):
             "sqlite": sqlite3.connect(":memory:"),
         }
         self.table_name = "test_type"
-        self.query_all = "SELECT * FROM {}".format(self.table_name)
+        self.query_all = f"SELECT * FROM {self.table_name}"
         self.con = con[connection]
         self.df = DataFrame(
             {
@@ -58,7 +58,7 @@ def setup(self, connection, dtype):
             "sqlite": sqlite3.connect(":memory:"),
         }
         self.table_name = "test_type"
-        self.query_col = "SELECT {} FROM {}".format(dtype, self.table_name)
+        self.query_col = f"SELECT {dtype} FROM {self.table_name}"
         self.con = con[connection]
         self.df = DataFrame(
             {

0 commit comments
