Commit 13b1531

merge master
2 parents 9e23d20 + 5fdf642 commit 13b1531

855 files changed, +48670 -39595 lines changed

@@ -0,0 +1,33 @@
+name: "Update pre-commit config"
+
+on:
+  schedule:
+    - cron: "0 7 * * 1" # At 07:00 on each Monday.
+  workflow_dispatch:
+
+jobs:
+  update-pre-commit:
+    if: github.repository_owner == 'pandas-dev'
+    name: Autoupdate pre-commit config
+    runs-on: ubuntu-latest
+    steps:
+      - name: Set up Python
+        uses: actions/setup-python@v2
+      - name: Cache multiple paths
+        uses: actions/cache@v2
+        with:
+          path: |
+            ~/.cache/pre-commit
+            ~/.cache/pip
+          key: pre-commit-autoupdate-${{ runner.os }}-build
+      - name: Update pre-commit config packages
+        uses: technote-space/create-pr-action@v2
+        with:
+          GITHUB_TOKEN: ${{ secrets.ACTION_TRIGGER_TOKEN }}
+          EXECUTE_COMMANDS: |
+            pip install pre-commit
+            pre-commit autoupdate || (exit 0);
+            pre-commit run -a || (exit 0);
+          COMMIT_MESSAGE: "⬆️ UPGRADE: Autoupdate pre-commit config"
+          PR_BRANCH_NAME: "pre-commit-config-update-${PR_ID}"
+          PR_TITLE: "⬆️ UPGRADE: Autoupdate pre-commit config"
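
Note on the schedule: the `cron: "0 7 * * 1"` entry in this workflow fires at 07:00 on Mondays, and GitHub Actions evaluates scheduled workflows in UTC. As a rough illustration only, the sketch below computes the next such time with the standard library. The helper name `next_monday_0700_utc` is made up for this note and is not part of the commit.

```python
from datetime import datetime, timedelta, timezone


def next_monday_0700_utc(now: datetime) -> datetime:
    """Return the first Monday 07:00 UTC strictly after `now` (hypothetical helper)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=7, minute=0, second=0, microsecond=0)
    # weekday() is 0 for Monday, so this steps back to this week's Monday at 07:00.
    candidate -= timedelta(days=candidate.weekday())
    if candidate <= now:
        # This week's slot has already passed (or is exactly now): use next week's.
        candidate += timedelta(weeks=1)
    return candidate


if __name__ == "__main__":
    wednesday = datetime(2020, 11, 11, 12, 0, tzinfo=timezone.utc)
    print(next_monday_0700_utc(wednesday))  # 2020-11-16 07:00:00+00:00
```

The workflow can also be started by hand through `workflow_dispatch`, so the schedule is only the default trigger.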

.github/workflows/ci.yml

+2 -8
@@ -18,7 +18,7 @@ jobs:
     steps:

     - name: Setting conda path
-      run: echo "::add-path::${HOME}/miniconda3/bin"
+      run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH

     - name: Checkout
       uses: actions/checkout@v1
@@ -37,12 +37,6 @@ jobs:
         ci/code_checks.sh lint
       if: always()

-    - name: Dependencies consistency
-      run: |
-        source activate pandas-dev
-        ci/code_checks.sh dependencies
-      if: always()
-
     - name: Checks on imported code
       run: |
         source activate pandas-dev
@@ -104,7 +98,7 @@ jobs:
     steps:

     - name: Setting conda path
-      run: echo "::set-env name=PATH::${HOME}/miniconda3/bin:${PATH}"
+      run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH

     - name: Checkout
       uses: actions/checkout@v1
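
Note on the `Setting conda path` change: both hunks replace a workflow command printed to stdout (`::add-path::` and `::set-env`) with an append to the file named by `$GITHUB_PATH`. A minimal sketch of the same append from Python, assuming the runner exposes that file path in the `GITHUB_PATH` environment variable; `add_to_github_path` is a hypothetical helper and does not appear in this commit.

```python
import os
import tempfile
from pathlib import Path


def add_to_github_path(directory: str) -> None:
    """Append `directory` to the file named by $GITHUB_PATH (hypothetical helper).

    Mirrors `echo "<dir>" >> $GITHUB_PATH`: one directory per line.
    """
    path_file = os.environ["GITHUB_PATH"]  # set by the Actions runner
    with open(path_file, "a", encoding="utf-8") as fh:
        fh.write(directory + "\n")


if __name__ == "__main__":
    # Outside a runner, point GITHUB_PATH at a scratch file to try it out.
    with tempfile.TemporaryDirectory() as tmp:
        os.environ["GITHUB_PATH"] = str(Path(tmp) / "github_path")
        add_to_github_path(str(Path.home() / "miniconda3" / "bin"))
        print(Path(os.environ["GITHUB_PATH"]).read_text())
```

As documented by GitHub, directories added this way take effect for later steps in the job, not for the step that writes them, which is why the workflow sets the path in its own step before anything uses conda.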

.gitignore

+1
@@ -12,6 +12,7 @@
 *.log
 *.swp
 *.pdb
+*.zip
 .project
 .pydevproject
 .settings

.pre-commit-config.yaml

+112 -8
@@ -15,33 +15,36 @@ repos:
   - id: flake8
     name: flake8 (cython template)
     files: \.pxi\.in$
-    types:
-      - file
+    types: [text]
     args: [--append-config=flake8/cython-template.cfg]
 - repo: https://github.com/PyCQA/isort
-  rev: 5.6.3
+  rev: 5.6.4
   hooks:
   - id: isort
     name: isort (python)
   - id: isort
     name: isort (cython)
     types: [cython]
 - repo: https://github.com/asottile/pyupgrade
-  rev: v2.7.2
+  rev: v2.7.4
   hooks:
   - id: pyupgrade
     args: [--py37-plus]
 - repo: https://github.com/pre-commit/pygrep-hooks
-  rev: v1.6.0
+  rev: v1.7.0
   hooks:
   - id: rst-backticks
+  - id: rst-directive-colons
+    types: [text]
+  - id: rst-inline-touching-normal
+    types: [text]
 - repo: local
   hooks:
   - id: pip_to_conda
     name: Generate pip dependency from conda
     description: This hook checks if the conda environment.yml and requirements-dev.txt are equal
     language: python
-    entry: python -m scripts.generate_pip_deps_from_conda
+    entry: python scripts/generate_pip_deps_from_conda.py
     files: ^(environment.yml|requirements-dev.txt)$
     pass_filenames: false
     additional_dependencies: [pyyaml]
@@ -53,12 +56,113 @@ repos:
     types: [rst]
     args: [--filename=*.rst]
     additional_dependencies: [flake8-rst==0.7.0, flake8==3.7.9]
+  - id: non-standard-imports
+    name: Check for non-standard imports
+    language: pygrep
+    entry: |
+      (?x)
+      # Check for imports from pandas.core.common instead of `import pandas.core.common as com`
+      from\ pandas\.core\.common\ import|
+      from\ pandas\.core\ import\ common|
+
+      # Check for imports from collections.abc instead of `from collections import abc`
+      from\ collections\.abc\ import
+
+  - id: non-standard-numpy.random-related-imports
+    name: Check for non-standard numpy.random-related imports excluding pandas/_testing.py
+    language: pygrep
+    exclude: pandas/_testing.py
+    entry: |
+      (?x)
+      # Check for imports from np.random.<method> instead of `from numpy import random` or `from numpy.random import <method>`
+      from\ numpy\ import\ random|
+      from\ numpy.random\ import
+    types: [python]
+  - id: non-standard-imports-in-tests
+    name: Check for non-standard imports in test suite
+    language: pygrep
+    entry: |
+      (?x)
+      # Check for imports from pandas._testing instead of `import pandas._testing as tm`
+      from\ pandas\._testing\ import|
+      from\ pandas\ import\ _testing\ as\ tm|
+
+      # No direct imports from conftest
+      conftest\ import|
+      import\ conftest
+    types: [python]
+    files: ^pandas/tests/
+  - id: incorrect-code-directives
+    name: Check for incorrect code block or IPython directives
+    language: pygrep
+    entry: (\.\. code-block ::|\.\. ipython ::)
+    files: \.(py|pyx|rst)$
+  - id: unwanted-patterns-strings-to-concatenate
+    name: Check for use of not concatenated strings
+    language: python
+    entry: python scripts/validate_unwanted_patterns.py --validation-type="strings_to_concatenate"
+    files: \.(py|pyx|pxd|pxi)$
+  - id: unwanted-patterns-strings-with-wrong-placed-whitespace
+    name: Check for strings with wrong placed spaces
+    language: python
+    entry: python scripts/validate_unwanted_patterns.py --validation-type="strings_with_wrong_placed_whitespace"
+    files: \.(py|pyx|pxd|pxi)$
+  - id: unwanted-patterns-private-import-across-module
+    name: Check for import of private attributes across modules
+    language: python
+    entry: python scripts/validate_unwanted_patterns.py --validation-type="private_import_across_module"
+    types: [python]
+    exclude: ^(asv_bench|pandas/tests|doc)/
+  - id: unwanted-patterns-private-function-across-module
+    name: Check for use of private functions across modules
+    language: python
+    entry: python scripts/validate_unwanted_patterns.py --validation-type="private_function_across_module"
+    types: [python]
+    exclude: ^(asv_bench|pandas/tests|doc)/
+  - id: inconsistent-namespace-usage
+    name: 'Check for inconsistent use of pandas namespace in tests'
+    entry: python scripts/check_for_inconsistent_pandas_namespace.py
+    language: python
+    types: [python]
+    files: ^pandas/tests/
+  - id: FrameOrSeriesUnion
+    name: Check for use of Union[Series, DataFrame] instead of FrameOrSeriesUnion alias
+    entry: Union\[.*(Series.*DataFrame|DataFrame.*Series).*\]
+    language: pygrep
+    types: [python]
+    exclude: ^pandas/_typing\.py$
+  - id: type-not-class
+    name: Check for use of foo.__class__ instead of type(foo)
+    entry: \.__class__
+    language: pygrep
+    files: \.(py|pyx)$
+  - id: unwanted-typing
+    name: Check for use of comment-based annotation syntax and missing error codes
+    entry: |
+      (?x)
+      \#\ type:\ (?!ignore)|
+      \#\ type:\s?ignore(?!\[)
+    language: pygrep
+    types: [python]
+  - id: no-os-remove
+    name: Check code for instances of os.remove
+    entry: os\.remove
+    language: pygrep
+    types: [python]
+    files: ^pandas/tests/
+    exclude: |
+      (?x)^
+      pandas/tests/io/excel/test_writers\.py|
+      pandas/tests/io/pytables/common\.py|
+      pandas/tests/io/pytables/test_store\.py$
 - repo: https://github.com/asottile/yesqa
   rev: v1.2.2
   hooks:
   - id: yesqa
 - repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v3.2.0
+  rev: v3.3.0
   hooks:
   - id: end-of-file-fixer
-    exclude: '.html$|^LICENSES/|.csv$|.txt$|.svg$|.py$'
+    exclude: ^LICENSES/|\.(html|csv|txt|svg|py)$
+  - id: trailing-whitespace
+    exclude: \.(html|svg)$
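
Note on the `end-of-file-fixer` exclude: the new pattern escapes the dots that the old one left bare. pre-commit applies `exclude` with `re.search`, so an unescaped `.py$` also excludes any path that ends in some character followed by `py`. A small check of both patterns follows; the sample paths are invented for illustration and are not taken from the repository.

```python
import re

OLD_EXCLUDE = re.compile(r".html$|^LICENSES/|.csv$|.txt$|.svg$|.py$")
NEW_EXCLUDE = re.compile(r"^LICENSES/|\.(html|csv|txt|svg|py)$")

# Invented sample paths; only the first three are meant to be excluded.
paths = [
    "pandas/core/frame.py",
    "LICENSES/NUMPY_LICENSE",
    "doc/data/tips.csv",
    "ci/deps/numpy",     # no extension, but ends in "<char>py"
    "web/pandas/xhtml",  # no extension, but ends in "<char>html"
]

for path in paths:
    old = bool(OLD_EXCLUDE.search(path))
    new = bool(NEW_EXCLUDE.search(path))
    print(f"{path:25} old={old!s:5} new={new}")
```

The last two paths are excluded by the old pattern only, which is the over-matching the rewrite removes.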

.travis.yml

+1 -6
@@ -35,11 +35,6 @@ matrix:
   fast_finish: true

   include:
-    - dist: bionic
-      python: 3.9-dev
-      env:
-        - JOB="3.9-dev" PATTERN="(not slow and not network and not clipboard)"
-
     - env:
         - JOB="3.8, slow" ENV_FILE="ci/deps/travis-38-slow.yaml" PATTERN="slow" SQL="1"
       services:
@@ -94,7 +89,7 @@ install:
 script:
   - echo "script start"
   - echo "$JOB"
-  - if [ "$JOB" != "3.9-dev" ]; then source activate pandas-dev; fi
+  - source activate pandas-dev
   - ci/run_tests.sh

 after_script:

Dockerfile

+9 -8
@@ -1,4 +1,4 @@
-FROM continuumio/miniconda3
+FROM quay.io/condaforge/miniforge3

 # if you forked pandas, you can pass in your own GitHub username to use your fork
 # i.e. gh_username=myname
@@ -15,10 +15,6 @@ RUN apt-get update \
     # Verify git, process tools, lsb-release (common in install instructions for CLIs) installed
     && apt-get -y install git iproute2 procps iproute2 lsb-release \
     #
-    # Install C compilers (gcc not enough, so just went with build-essential which admittedly might be overkill),
-    # needed to build pandas C extensions
-    && apt-get -y install build-essential \
-    #
     # cleanup
     && apt-get autoremove -y \
     && apt-get clean -y \
@@ -39,9 +35,14 @@ RUN mkdir "$pandas_home" \
 # we just update the base/root one from the 'environment.yml' file instead of creating a new one.
 #
 # Set up environment
-RUN conda env update -n base -f "$pandas_home/environment.yml"
+RUN conda install -y mamba
+RUN mamba env update -n base -f "$pandas_home/environment.yml"

 # Build C extensions and pandas
-RUN cd "$pandas_home" \
-    && python setup.py build_ext --inplace -j 4 \
+SHELL ["/bin/bash", "-c"]
+RUN . /opt/conda/etc/profile.d/conda.sh \
+    && conda activate base \
+    && cd "$pandas_home" \
+    && export \
+    && python setup.py build_ext -j 4 \
     && python -m pip install -e .

Makefile

+3 -3
@@ -9,7 +9,7 @@ clean_pyc:
     -find . -name '*.py[co]' -exec rm {} \;

 build: clean_pyc
-    python setup.py build_ext --inplace
+    python setup.py build_ext

 lint-diff:
     git diff upstream/master --name-only -- "*.py" | xargs flake8
@@ -30,11 +30,11 @@ check:
     python3 scripts/validate_unwanted_patterns.py \
         --validation-type="private_function_across_module" \
         --included-file-extensions="py" \
-        --excluded-file-paths=pandas/tests,asv_bench/,pandas/_vendored \
+        --excluded-file-paths=pandas/tests,asv_bench/ \
         pandas/

     python3 scripts/validate_unwanted_patterns.py \
         --validation-type="private_import_across_module" \
         --included-file-extensions="py" \
-        --excluded-file-paths=pandas/tests,asv_bench/,pandas/_vendored,doc/
+        --excluded-file-paths=pandas/tests,asv_bench/,doc/
         pandas/

README.md

+22 -22
@@ -60,27 +60,27 @@ Here are just a few of the things that pandas does well:
     and saving/loading data from the ultrafast [**HDF5 format**][hdfstore]
   - [**Time series**][timeseries]-specific functionality: date range
     generation and frequency conversion, moving window statistics,
-    date shifting and lagging.
-
-
-[missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data
-[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
-[alignment]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html?highlight=alignment#intro-to-data-structures
-[groupby]: https://pandas.pydata.org/pandas-docs/stable/groupby.html#group-by-split-apply-combine
-[conversion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe
-[slicing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
-[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#advanced-indexing-with-ix
-[subsetting]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
-[merging]: https://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
-[joining]: https://pandas.pydata.org/pandas-docs/stable/merging.html#joining-on-index
-[reshape]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables
-[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
-[mi]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#hierarchical-indexing-multiindex
-[flat-files]: https://pandas.pydata.org/pandas-docs/stable/io.html#csv-text-files
-[excel]: https://pandas.pydata.org/pandas-docs/stable/io.html#excel-files
-[db]: https://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
-[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables
-[timeseries]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-series-date-functionality
+    date shifting and lagging
+
+
+[missing-data]: https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html
+[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#column-selection-addition-deletion
+[alignment]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html?highlight=alignment#intro-to-data-structures
+[groupby]: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#group-by-split-apply-combine
+[conversion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#dataframe
+[slicing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges
+[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced
+[subsetting]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
+[merging]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#database-style-dataframe-or-named-series-joining-merging
+[joining]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#joining-on-index
+[reshape]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
+[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
+[mi]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#hierarchical-indexing-multiindex
+[flat-files]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#csv-text-files
+[excel]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#excel-files
+[db]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#sql-queries
+[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#hdf5-pytables
+[timeseries]: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#time-series-date-functionality

 ## Where to get it
 The source code is currently hosted on GitHub at:
@@ -154,7 +154,7 @@ For usage questions, the best place to go to is [StackOverflow](https://stackove
 Further, general questions and discussions can also take place on the [pydata mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata).

 ## Discussion and Development
-Most development discussions take place on github in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.
+Most development discussions take place on GitHub in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.

 ## Contributing to pandas [![Open Source Helpers](https://www.codetriage.com/pandas-dev/pandas/badges/users.svg)](https://www.codetriage.com/pandas-dev/pandas)

asv_bench/benchmarks/algorithms.py

+12
@@ -5,6 +5,7 @@
 from pandas._libs import lib

 import pandas as pd
+from pandas.core.algorithms import make_duplicates_of_left_unique_in_right

 from .pandas_vb_common import tm

@@ -174,4 +175,15 @@ def time_argsort(self, N):
         self.array.argsort()


+class RemoveDuplicates:
+    def setup(self):
+        N = 10 ** 5
+        na = np.arange(int(N / 2))
+        self.left = np.concatenate([na[: int(N / 4)], na[: int(N / 4)]])
+        self.right = np.concatenate([na, na])
+
+    def time_make_duplicates_of_left_unique_in_right(self):
+        make_duplicates_of_left_unique_in_right(self.left, self.right)
+
+
 from .pandas_vb_common import setup  # noqa: F401 isort:skip
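
Note on the `RemoveDuplicates` inputs: to make the arrays built in `setup` concrete, the sketch below rebuilds `left` and `right` at a reduced size, with plain Python lists standing in for the NumPy arrays (an assumption made so the snippet runs without NumPy). It does not call `make_duplicates_of_left_unique_in_right`; it only shows the shape of the data the benchmark is timed on.

```python
from collections import Counter

N = 20  # the benchmark uses 10 ** 5; scaled down here for readability

na = list(range(N // 2))            # stands in for np.arange(int(N / 2))
left = na[: N // 4] + na[: N // 4]  # first quarter of the values, repeated twice
right = na + na                     # every value appears exactly twice

print(left)
print(right)

# Every value in `left` is duplicated, and each of them also occurs twice in `right`.
left_counts = Counter(left)
right_counts = Counter(right)
print(all(count == 2 for count in left_counts.values()))
print(all(right_counts[value] == 2 for value in left_counts))
```

At the benchmark's real size this gives 50,000 elements in `left` (25,000 distinct values) and 100,000 in `right` (50,000 distinct values).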
