Sync Fork from Upstream Repo #206

Merged
merged 68 commits into from
Jun 15, 2021
68 commits
bcae446
TYP: use annotations in strprime.pyi and timestamps.pyi (#41841)
fangchenli Jun 8, 2021
a76132e
docstrings: typos, clarity (#41845)
stragu Jun 8, 2021
de07087
DOC/CLN versionadded etc for 1.3 -> 1.3.0 (#41852)
simonjayhawkins Jun 8, 2021
e4dedca
API: EA._can_hold_na -> EDtype.can_hold_na (#41654)
jbrockmendel Jun 8, 2021
672c468
DOC/BLD: update README.md, remove setuptools from dependency (#41818)
fangchenli Jun 8, 2021
9d77a56
[ArrowStringArray] API: StringDtype parameterized by storage (python …
simonjayhawkins Jun 8, 2021
b73c38e
BUG: Series[int].loc setitem with Series[int] results in Series[float…
simonjayhawkins Jun 8, 2021
017ff7c
PERF: is_bool_indexer (#41861)
jbrockmendel Jun 8, 2021
3f28015
Remove unused imports (#41860)
mzeitlin11 Jun 8, 2021
1e266f2
cleaned up ujson initialization (#41864)
WillAyd Jun 8, 2021
5abb02e
TST: add ASV check for groupby with uint64 col (#41859)
smithto1 Jun 8, 2021
9936902
DOC: remove versionadded/changed:: 0.24 (#41851)
simonjayhawkins Jun 8, 2021
e34338a
API: make construct_array_type non-optional (#41862)
jbrockmendel Jun 8, 2021
94756fe
REF: de-duplicate IntervalIndex setops (#41832)
jbrockmendel Jun 8, 2021
1ff3b9a
REF: de-duplicate Index._intersection + MultiIndex._intersection (#41…
jbrockmendel Jun 8, 2021
2c1e656
REF: more explicit dtypes in strings.accessor (#41727)
jbrockmendel Jun 9, 2021
63c20d2
PERF: nancorr_spearman (#41857)
mzeitlin11 Jun 9, 2021
4d10194
BUG: IntervalIndex.intersection (#41883)
jbrockmendel Jun 9, 2021
cecc3a1
whatsnew 1.3.0 (#41747)
rhshadrach Jun 9, 2021
6f953a8
PERF: clipping with scalar (#41869)
TLouf Jun 9, 2021
4a00fcc
BENCH: Remove unnecessary random seeds (#41889)
mroeschke Jun 9, 2021
bf72b70
REF: de-duplicate symmetric_difference, _union (#41833)
jbrockmendel Jun 9, 2021
3ca84fc
CLN/PERF: no need for kahan for int group_cumsum (#41874)
mzeitlin11 Jun 9, 2021
a0e79d2
REF: remove Index._convert_arr_indexer (#41884)
jbrockmendel Jun 9, 2021
8b9b1a1
Tst interval index NaN uniqueness (#41870)
r-toroxel Jun 9, 2021
db1db7e
TYP: use from __future__ import annotations more - batch 2 (#41894)
simonjayhawkins Jun 9, 2021
e34603d
TYP: use from __future__ import annotations more - batch 3 (#41895)
simonjayhawkins Jun 9, 2021
6d322e2
TYP: use from __future__ import annotations more - batch 4 (#41896)
simonjayhawkins Jun 9, 2021
52f04db
TYP: use from __future__ import annotations more - batch 5 (#41898)
simonjayhawkins Jun 9, 2021
7bac17b
batch 6 (#41900)
simonjayhawkins Jun 9, 2021
1f3e646
PERF: nancorr_spearman fastpath (#41885)
mzeitlin11 Jun 9, 2021
384f414
REF: split out grouped masked cummin/max algo (#41853)
mzeitlin11 Jun 9, 2021
ce3bac9
TYP: use from __future__ import annotations more (#41892)
simonjayhawkins Jun 9, 2021
eccfe90
Add py39 target in Black's configuration, bump Black to 21.5b2 (#38376)
fangchenli Jun 9, 2021
618cc0e
Revert "CI: Pin jinja2 to version lower than 3.0 (#41452)" (#41913)
lithomas1 Jun 10, 2021
daa23d6
TST/CLN: test_cov_corr (#41886)
mzeitlin11 Jun 10, 2021
499ef8c
REF: split out sorted_rank algo (#41910)
mzeitlin11 Jun 10, 2021
19da1ec
clean up positional-args deprecation warnings in whatsnew (#41908)
MarcoGorelli Jun 10, 2021
c8e23d2
PERF: reductions (#41911)
jbrockmendel Jun 10, 2021
1739199
Revert "PERF: reductions (#41911)" (#41923)
jorisvandenbossche Jun 10, 2021
90099df
REF: simplify indexes.base._maybe_cast_data_without_dtype (#41881)
jbrockmendel Jun 10, 2021
85aa89d
BUG: IntervalIndex is_monotonic, get_loc, get_indexer_for, contains w…
jbrockmendel Jun 10, 2021
44b9244
CLN: remove references of Travis (#41928)
fangchenli Jun 10, 2021
9ef6e9c
BUG: Categorical.fillna with non-category tuple (#41914)
jbrockmendel Jun 10, 2021
4d549cb
TYP: to_csv accepts IO[bytes] and fix FilePathOrBuffer (#41903)
twoertwein Jun 10, 2021
5940c9c
REF: de-duplicate IntervalIndex._intersection (#41929)
jbrockmendel Jun 10, 2021
ba86e19
BUG: inconsistent validation for get_indexer (#41918)
jbrockmendel Jun 10, 2021
da12db8
TYP/BUG: fix CI (#41938)
twoertwein Jun 11, 2021
d265e21
DOC: reverse rolling window (#38627) (#41842)
mdhsieh Jun 11, 2021
b4dcf3b
PERF: fix regression in reductions for boolean/integer data (#41924)
jorisvandenbossche Jun 11, 2021
969688a
DOC: fix header level in 1.3 release notes (#41925)
simonjayhawkins Jun 11, 2021
bf15751
DOC: add contributors to 1.3 release notes (#41922)
simonjayhawkins Jun 11, 2021
deaa922
TYP: datetimelike (#41830)
jbrockmendel Jun 11, 2021
58a6bc1
DOC: update the msgpack IO section to not refer to pyarrow.serialize …
jorisvandenbossche Jun 11, 2021
05552d3
[ArrayManager] Enable pytables IO by falling back to BlockManager (#4…
jorisvandenbossche Jun 11, 2021
3653ddd
DOC: fix link to pickle in msgpack section (#41950)
jorisvandenbossche Jun 11, 2021
f949788
CI: use pyarrow 0.17.1 instead of 0.17.0 for py37_macos build (#41948)
jorisvandenbossche Jun 11, 2021
8f70ed5
ENH: Add online operations for EWM.mean (#41888)
mroeschke Jun 12, 2021
9d2f1bf
CI: skip tests when only files in doc/web changes (github actions) (#…
ShaharNaveh Jun 12, 2021
a1412a0
DOC: Validator + converting array_like to array-like in docstrings (#…
kemcbride Jun 12, 2021
3765b20
RLS: 1.3.0rc0
Jun 12, 2021
cff206b
Start 1.4.0
simonjayhawkins Jun 12, 2021
8c6e865
DOC: start v1.4.0 release notes (#41926)
simonjayhawkins Jun 12, 2021
c791592
BLD: ignore multiple types of file in wheel (#41977)
fangchenli Jun 13, 2021
db44d4a
CI: activate azure pipelines/github actions on 1.3.x (#41966)
simonjayhawkins Jun 13, 2021
0b68d87
BLD: Update MANIFEST.in (#41981)
simonjayhawkins Jun 13, 2021
e3d57c9
TST: Un-xfail tests on numpy-dev (#41987)
lithomas1 Jun 14, 2021
6f18ef6
CI: mark online test slow (#41971)
mzeitlin11 Jun 15, 2021
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -7,6 +7,7 @@ on:
branches:
- master
- 1.2.x
- 1.3.x

env:
ENV_FILE: environment.yml
3 changes: 3 additions & 0 deletions .github/workflows/database.yml
@@ -7,6 +7,9 @@ on:
branches:
- master
- 1.2.x
- 1.3.x
paths-ignore:
- "doc/**"

env:
PYTEST_WORKERS: "auto"
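
Reviewer note on `paths-ignore`: as I understand the GitHub Actions rule, a workflow is skipped only when every changed path matches an ignore pattern, so a change that touches both `doc/` and code still runs the tests. The sketch below models that rule in Python. It is an illustration, not GitHub's matcher: the helper names are made up, and `"doc/**"` is approximated by a plain prefix test.

```python
from typing import Iterable, Sequence


def is_ignored(path: str, ignored_prefixes: Sequence[str] = ("doc/",)) -> bool:
    """Simplified stand-in for the "doc/**" pattern: anything under doc/ is ignored."""
    return any(path.startswith(prefix) for prefix in ignored_prefixes)


def workflow_should_run(changed_paths: Iterable[str]) -> bool:
    """Run unless every changed path is ignored (the assumed paths-ignore rule)."""
    return not all(is_ignored(path) for path in changed_paths)


if __name__ == "__main__":
    # Docs-only change: every path is ignored, so the workflow is skipped.
    print(workflow_should_run(["doc/source/whatsnew/v1.3.0.rst"]))
    # Mixed change: one path is not ignored, so the workflow runs.
    print(workflow_should_run(["doc/source/index.rst", "pandas/core/frame.py"]))
```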
3 changes: 3 additions & 0 deletions .github/workflows/posix.yml
@@ -7,6 +7,9 @@ on:
branches:
- master
- 1.2.x
- 1.3.x
paths-ignore:
- "doc/**"

env:
PYTEST_WORKERS: "auto"
2 changes: 2 additions & 0 deletions .github/workflows/python-dev.yml
@@ -7,6 +7,8 @@ on:
pull_request:
branches:
- master
paths-ignore:
- "doc/**"

jobs:
build:
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -9,7 +9,7 @@ repos:
- id: absolufy-imports
files: ^pandas/
- repo: https://github.com/python/black
rev: 20.8b1
rev: 21.5b2
hooks:
- id: black
- repo: https://github.com/codespell-project/codespell
15 changes: 13 additions & 2 deletions MANIFEST.in
@@ -17,18 +17,19 @@ global-exclude *.h5
global-exclude *.html
global-exclude *.json
global-exclude *.jsonl
global-exclude *.msgpack
global-exclude *.pdf
global-exclude *.pickle
global-exclude *.png
global-exclude *.pptx
global-exclude *.pyc
global-exclude *.pyd
global-exclude *.ods
global-exclude *.odt
global-exclude *.orc
global-exclude *.sas7bdat
global-exclude *.sav
global-exclude *.so
global-exclude *.xls
global-exclude *.xlsb
global-exclude *.xlsm
global-exclude *.xlsx
global-exclude *.xpt
@@ -39,6 +40,13 @@ global-exclude .DS_Store
global-exclude .git*
global-exclude \#*

global-exclude *.c
global-exclude *.cpp
global-exclude *.h

global-exclude *.py[ocd]
global-exclude *.pxi

# GH 39321
# csv_dir_path fixture checks the existence of the directory
# exclude the whole directory to avoid running related tests in sdist
@@ -47,3 +55,6 @@ prune pandas/tests/io/parser/data
include versioneer.py
include pandas/_version.py
include pandas/io/formats/templates/*.tpl

graft pandas/_libs/src
graft pandas/_libs/tslibs/src
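
Reviewer note on the new `global-exclude` lines: the patterns are shell-style globs matched against file names, so `*.py[ocd]` covers `.pyo`, `.pyc` and `.pyd` but not `.py`. The sketch below checks a few names with the standard-library `fnmatch` module. It only approximates how the sdist tooling matches, and the sample file names are made up. Since MANIFEST.in commands are applied in order, the trailing `graft` lines presumably bring back the sources under `pandas/_libs/src` and `pandas/_libs/tslibs/src` after the excludes; the sketch does not model that.

```python
from fnmatch import fnmatchcase

# Patterns added by this hunk (copied from the diff above).
NEW_EXCLUDES = ["*.c", "*.cpp", "*.h", "*.py[ocd]", "*.pxi"]


def is_excluded(filename: str) -> bool:
    """Return True if the bare file name matches one of the new exclude globs."""
    return any(fnmatchcase(filename, pattern) for pattern in NEW_EXCLUDES)


if __name__ == "__main__":
    # Hypothetical file names, chosen only to exercise each pattern.
    for name in ["algos.c", "parser.cpp", "khash.h", "util.pxi",
                 "frame.pyc", "lib.pyd", "frame.py", "algos.pyx"]:
        print(f"{name:12} excluded={is_excluded(name)}")
```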
8 changes: 4 additions & 4 deletions README.md
@@ -10,13 +10,13 @@
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3509134.svg)](https://doi.org/10.5281/zenodo.3509134)
[![Package Status](https://img.shields.io/pypi/status/pandas.svg)](https://pypi.org/project/pandas/)
[![License](https://img.shields.io/pypi/l/pandas.svg)](https://github.com/pandas-dev/pandas/blob/master/LICENSE)
[![Travis Build Status](https://travis-ci.org/pandas-dev/pandas.svg?branch=master)](https://travis-ci.org/pandas-dev/pandas)
[![Azure Build Status](https://dev.azure.com/pandas-dev/pandas/_apis/build/status/pandas-dev.pandas?branch=master)](https://dev.azure.com/pandas-dev/pandas/_build/latest?definitionId=1&branch=master)
[![Coverage](https://codecov.io/github/pandas-dev/pandas/coverage.svg?branch=master)](https://codecov.io/gh/pandas-dev/pandas)
[![Downloads](https://anaconda.org/conda-forge/pandas/badges/downloads.svg)](https://pandas.pydata.org)
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas)
[![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://numfocus.org)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)

## What is it?

@@ -101,8 +101,8 @@ pip install pandas

## Dependencies
- [NumPy - Adds support for large, multi-dimensional arrays, matrices and high-level mathematical functions to operate on these arrays](https://www.numpy.org)
- [python-dateutil - Provides powerful extensions to the standard datetime module](https://labix.org/python-dateutil)
- [pytz - Brings the Olson tz database into Python which allows accurate and cross platform timezone calculations](https://pythonhosted.org/pytz)
- [python-dateutil - Provides powerful extensions to the standard datetime module](https://dateutil.readthedocs.io/en/stable/index.html)
- [pytz - Brings the Olson tz database into Python which allows accurate and cross platform timezone calculations](https://github.com/stub42/pytz)

See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.

@@ -121,7 +121,7 @@ cloning the git repo), execute:
python setup.py install
```

or for installing in [development mode](https://pip.pypa.io/en/latest/reference/pip_install.html#editable-installs):
or for installing in [development mode](https://pip.pypa.io/en/latest/cli/pip_install/#install-editable):


```sh
23 changes: 10 additions & 13 deletions asv_bench/benchmarks/algorithms.py
@@ -23,41 +23,38 @@ class Factorize:
"int",
"uint",
"float",
"string",
"object",
"datetime64[ns]",
"datetime64[ns, tz]",
"Int64",
"boolean",
"string_arrow",
"string[pyarrow]",
],
]
param_names = ["unique", "sort", "dtype"]

def setup(self, unique, sort, dtype):
N = 10 ** 5
string_index = tm.makeStringIndex(N)
try:
from pandas.core.arrays.string_arrow import ArrowStringDtype

string_arrow = pd.array(string_index, dtype=ArrowStringDtype())
except ImportError:
string_arrow = None

if dtype == "string_arrow" and not string_arrow:
raise NotImplementedError
string_arrow = None
if dtype == "string[pyarrow]":
try:
string_arrow = pd.array(string_index, dtype="string[pyarrow]")
except ImportError:
raise NotImplementedError

data = {
"int": pd.Int64Index(np.arange(N)),
"uint": pd.UInt64Index(np.arange(N)),
"float": pd.Float64Index(np.random.randn(N)),
"string": string_index,
"object": string_index,
"datetime64[ns]": pd.date_range("2011-01-01", freq="H", periods=N),
"datetime64[ns, tz]": pd.date_range(
"2011-01-01", freq="H", periods=N, tz="Asia/Tokyo"
),
"Int64": pd.array(np.arange(N), dtype="Int64"),
"boolean": pd.array(np.random.randint(0, 2, N), dtype="boolean"),
"string_arrow": string_arrow,
"string[pyarrow]": string_arrow,
}[dtype]
if not unique:
data = data.repeat(5)
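
Reviewer note on the `raise NotImplementedError` in `setup`: asv treats a `NotImplementedError` raised from `setup` as "skip this parameter combination", which is why the hunk turns the `ImportError` from a missing pyarrow into that exception. The sketch below shows the same pattern without pandas. The class, the parameter values and the module name are all hypothetical and stand in for an optional dependency.

```python
import importlib


class OptionalBackendSuite:
    """Hypothetical asv-style benchmark; nothing here is taken from pandas."""

    params = ["builtin", "optional"]
    param_names = ["backend"]

    # Hypothetical module name standing in for an optional dependency.
    optional_module = "hypothetical_optional_backend_xyz"

    def setup(self, backend):
        if backend == "optional":
            try:
                importlib.import_module(self.optional_module)
            except ImportError:
                # asv reports the benchmark as skipped instead of failed.
                raise NotImplementedError
        self.data = list(range(1000))

    def time_sum(self, backend):
        sum(self.data)


if __name__ == "__main__":
    suite = OptionalBackendSuite()
    for backend in suite.params:
        try:
            suite.setup(backend)
            suite.time_sum(backend)
            print(backend, "ran")
        except NotImplementedError:
            print(backend, "skipped")
```

Moving the import inside the `dtype` check, as the hunk does, means the optional dependency is only probed for the parameter that needs it.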
16 changes: 3 additions & 13 deletions asv_bench/benchmarks/algos/isin.py
@@ -25,8 +25,8 @@ class IsIn:
"category[object]",
"category[int]",
"str",
"string",
"arrow_string",
"string[python]",
"string[pyarrow]",
]
param_names = ["dtype"]

@@ -50,8 +50,6 @@ def setup(self, dtype):

elif dtype in ["category[object]", "category[int]"]:
# Note: sizes are different in this case than others
np.random.seed(1234)

n = 5 * 10 ** 5
sample_size = 100

@@ -62,9 +60,7 @@ def setup(self, dtype):
self.values = np.random.choice(arr, sample_size)
self.series = Series(arr).astype("category")

elif dtype in ["str", "string", "arrow_string"]:
from pandas.core.arrays.string_arrow import ArrowStringDtype # noqa: F401

elif dtype in ["str", "string[python]", "string[pyarrow]"]:
try:
self.series = Series(tm.makeStringIndex(N), dtype=dtype)
except ImportError:
@@ -101,7 +97,6 @@ class IsinAlmostFullWithRandomInt:
def setup(self, dtype, exponent, title):
M = 3 * 2 ** (exponent - 2)
# 0.77-the maximal share of occupied buckets
np.random.seed(42)
self.series = Series(np.random.randint(0, M, M)).astype(dtype)

values = np.random.randint(0, M, M).astype(dtype)
@@ -134,7 +129,6 @@ class IsinWithRandomFloat:
param_names = ["dtype", "size", "title"]

def setup(self, dtype, size, title):
np.random.seed(42)
self.values = np.random.rand(size)
self.series = Series(self.values).astype(dtype)
np.random.shuffle(self.values)
@@ -181,7 +175,6 @@ class IsinWithArange:

def setup(self, dtype, M, offset_factor):
offset = int(M * offset_factor)
np.random.seed(42)
tmp = Series(np.random.randint(offset, M + offset, 10 ** 6))
self.series = tmp.astype(dtype)
self.values = np.arange(M).astype(dtype)
@@ -292,10 +285,8 @@ def setup(self, dtype, MaxNumber, series_type):
raise NotImplementedError

if series_type == "random_hits":
np.random.seed(42)
array = np.random.randint(0, MaxNumber, N)
if series_type == "random_misses":
np.random.seed(42)
array = np.random.randint(0, MaxNumber, N) + MaxNumber
if series_type == "monotone_hits":
array = np.repeat(np.arange(MaxNumber), N // MaxNumber)
@@ -324,7 +315,6 @@ def setup(self, dtype, series_type):
raise NotImplementedError

if series_type == "random":
np.random.seed(42)
vals = np.random.randint(0, 10 * N, N)
if series_type == "monotone":
vals = np.arange(N)
1 change: 0 additions & 1 deletion asv_bench/benchmarks/frame_ctor.py
@@ -67,7 +67,6 @@ class FromDictwithTimestamp:

def setup(self, offset):
N = 10 ** 3
np.random.seed(1234)
idx = date_range(Timestamp("1/1/1900"), freq=offset, periods=N)
df = DataFrame(np.random.randn(N, 10), index=idx)
self.d = df.to_dict()
8 changes: 5 additions & 3 deletions asv_bench/benchmarks/groupby.py
@@ -393,7 +393,7 @@ class GroupByMethods:

param_names = ["dtype", "method", "application"]
params = [
["int", "float", "object", "datetime"],
["int", "float", "object", "datetime", "uint"],
[
"all",
"any",
@@ -442,6 +442,8 @@ def setup(self, dtype, method, application):
values = rng.take(np.random.randint(0, ngroups, size=size))
if dtype == "int":
key = np.random.randint(0, size, size=size)
elif dtype == "uint":
key = np.random.randint(0, size, size=size, dtype=dtype)
elif dtype == "float":
key = np.concatenate(
[np.random.random(ngroups) * 0.1, np.random.random(ngroups) * 10.0]
@@ -505,11 +507,11 @@ def time_frame_agg(self, dtype, method):
self.df.groupby("key").agg(method)


class CumminMax:
class Cumulative:
param_names = ["dtype", "method"]
params = [
["float64", "int64", "Float64", "Int64"],
["cummin", "cummax"],
["cummin", "cummax", "cumsum"],
]

def setup(self, dtype, method):
1 change: 0 additions & 1 deletion asv_bench/benchmarks/hash_functions.py
@@ -67,7 +67,6 @@ class NumericSeriesIndexingShuffled:

def setup(self, index, N):
vals = np.array(list(range(55)) + [54] + list(range(55, N - 1)))
np.random.seed(42)
np.random.shuffle(vals)
indices = index(vals)
self.data = pd.Series(np.arange(N), index=indices)
3 changes: 0 additions & 3 deletions asv_bench/benchmarks/indexing.py
@@ -368,17 +368,14 @@ def setup(self):
self.df = DataFrame(index=range(self.N))

def time_insert(self):
np.random.seed(1234)
for i in range(100):
self.df.insert(0, i, np.random.randn(self.N), allow_duplicates=True)

def time_assign_with_setitem(self):
np.random.seed(1234)
for i in range(100):
self.df[i] = np.random.randn(self.N)

def time_assign_list_like_with_setitem(self):
np.random.seed(1234)
self.df[list(range(100))] = np.random.randn(self.N, 100)

def time_assign_list_of_columns_concat(self):
1 change: 0 additions & 1 deletion asv_bench/benchmarks/series_methods.py
@@ -145,7 +145,6 @@ class Mode:
param_names = ["N", "dtype"]

def setup(self, N, dtype):
np.random.seed(42)
self.s = Series(np.random.randint(0, N, size=10 * N)).astype(dtype)

def time_mode(self, N, dtype):
4 changes: 1 addition & 3 deletions asv_bench/benchmarks/strings.py
@@ -12,12 +12,10 @@


class Dtypes:
params = ["str", "string", "arrow_string"]
params = ["str", "string[python]", "string[pyarrow]"]
param_names = ["dtype"]

def setup(self, dtype):
from pandas.core.arrays.string_arrow import ArrowStringDtype # noqa: F401

try:
self.s = Series(tm.makeStringIndex(10 ** 5), dtype=dtype)
except ImportError:
11 changes: 9 additions & 2 deletions azure-pipelines.yml
@@ -1,11 +1,18 @@
# Adapted from https://github.com/numba/numba/blob/master/azure-pipelines.yml
trigger:
- master
- 1.2.x
branches:
include:
- master
- 1.2.x
- 1.3.x
paths:
exclude:
- 'doc/*'

pr:
- master
- 1.2.x
- 1.3.x

variables:
PYTEST_WORKERS: auto
4 changes: 4 additions & 0 deletions ci/code_checks.sh
@@ -77,6 +77,10 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
invgrep -R --include="*.rst" -E "[a-zA-Z0-9]\`\`?[a-zA-Z0-9]" doc/source/
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Check for unnecessary random seeds in asv benchmarks' ; echo $MSG
invgrep -R --exclude pandas_vb_common.py -E 'np.random.seed' asv_bench/benchmarks/
RET=$(($RET + $?)) ; echo $MSG "DONE"

fi

### CODE ###
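
Reviewer note on the new seed check: the `invgrep` line searches `asv_bench/benchmarks/` for `np.random.seed`, skips `pandas_vb_common.py`, and (assuming `invgrep` fails when the pattern is found) breaks the build on any hit. A rough Python equivalent, handy for running the same scan locally, might look like the sketch below. The function name and the returned tuple layout are my own choices, not part of the CI script.

```python
import re
from pathlib import Path

# Same regex as the shell check; note that "." matches any character there too.
SEED_PATTERN = re.compile(r"np.random.seed")
EXCLUDED_NAME = "pandas_vb_common.py"


def find_seed_calls(root):
    """Return (path, line number, line) for each match under root, skipping the excluded file."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.name == EXCLUDED_NAME:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if SEED_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    found = find_seed_calls("asv_bench/benchmarks")
    for path, lineno, line in found:
        print(f"{path}:{lineno}: {line}")
    # Mirror the inverted exit status: non-zero when the pattern is present.
    raise SystemExit(1 if found else 0)
```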
2 changes: 1 addition & 1 deletion ci/deps/azure-macos-37.yaml
@@ -22,7 +22,7 @@ dependencies:
- numexpr
- numpy=1.17.3
- openpyxl
- pyarrow=0.17.0
- pyarrow=0.17
- pytables
- python-dateutil==2.7.3
- pytz
Binary file modified doc/source/_static/ci.png