Commit d0b6962

Merge branch 'main' of https://github.com/pandas-dev/pandas into numeric_only_window
2 parents 93b15c5 + aa305f3 commit d0b6962

137 files changed

+1780
-928
lines changed

.circleci/config.yml

+1-1
@@ -13,7 +13,7 @@ jobs:
       PANDAS_CI: "1"
     steps:
       - checkout
-      - run: ci/setup_env.sh
+      - run: .circleci/setup_env.sh
       - run: PATH=$HOME/miniconda3/envs/pandas-dev/bin:$HOME/miniconda3/condabin:$PATH ci/run_tests.sh
 
 workflows:
File renamed without changes.

.github/PULL_REQUEST_TEMPLATE.md

+1
@@ -1,4 +1,5 @@
 - [ ] closes #xxxx (Replace xxxx with the Github issue number)
 - [ ] [Tests added and passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#writing-tests) if fixing a bug or adding a new feature
 - [ ] All [code checks passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit).
+- [ ] Added [type annotations](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#type-hints) to new arguments/methods/functions.
 - [ ] Added an entry in the latest `doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.

.github/actions/setup-conda/action.yml

+27
@@ -0,0 +1,27 @@
+name: Set up Conda environment
+inputs:
+  environment-file:
+    description: Conda environment file to use.
+    default: environment.yml
+  pyarrow-version:
+    description: If set, overrides the PyArrow version in the Conda environment to the given string.
+    required: false
+runs:
+  using: composite
+  steps:
+    - name: Set Arrow version in ${{ inputs.environment-file }} to ${{ inputs.pyarrow-version }}
+      run: |
+        grep -q ' - pyarrow' ${{ inputs.environment-file }}
+        sed -i"" -e "s/ - pyarrow/ - pyarrow=${{ inputs.pyarrow-version }}/" ${{ inputs.environment-file }}
+        cat ${{ inputs.environment-file }}
+      shell: bash
+      if: ${{ inputs.pyarrow-version }}
+
+    - name: Install ${{ inputs.environment-file }}
+      uses: conda-incubator/setup-miniconda@v2
+      with:
+        environment-file: ${{ inputs.environment-file }}
+        channel-priority: ${{ runner.os == 'macOS' && 'flexible' || 'strict' }}
+        channels: conda-forge
+        mamba-version: "0.23"
+        use-mamba: true
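
Reviewer note: the first step's `grep`/`sed` pair amounts to a guarded text substitution. The step fails if the environment file has no ` - pyarrow` entry; otherwise `=<version>` is appended to that entry. The following is a minimal Python sketch of the same logic, for illustration only. The function name `pin_pyarrow` and the sample file contents are hypothetical and are not part of this commit.

```python
def pin_pyarrow(env_text: str, version: str) -> str:
    """Mimic `grep -q ' - pyarrow'` followed by `sed "s/ - pyarrow/ - pyarrow=<version>/"`."""
    needle = " - pyarrow"
    if needle not in env_text:
        # `grep -q` exits non-zero in this case, which fails the workflow step.
        raise ValueError("no pyarrow entry found in environment file")
    # sed's s/// without the `g` flag replaces only the first match on each line.
    return "\n".join(
        line.replace(needle, f"{needle}={version}", 1)
        for line in env_text.split("\n")
    )


# Hypothetical environment file contents, used only for this illustration.
sample = "dependencies:\n  - numpy\n  - pyarrow\n  - pytz\n"
pinned = pin_pyarrow(sample, "6")
print(pinned)
```

Because the pattern is a plain prefix match, an entry that already carries a version specifier would simply get `=<version>` inserted after the package name, so the approach presumably relies on the environment files listing `pyarrow` unpinned.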

.github/actions/setup/action.yml

-12
This file was deleted.

.github/workflows/32-bit-linux.yml

+6
@@ -23,9 +23,15 @@ jobs:
23 23
24 24       - name: Run 32-bit manylinux2014 Docker Build / Tests
25 25         run: |
   26 +         # Without this (line 34), versioneer will not be able to determine the pandas version.
   27 +         # This is because of a security update to git that blocks it from reading the config folder if
   28 +         # it is not owned by the current user. We hit this since the "mounted" folder is not hit by the
   29 +         # Docker container.
   30 +         # xref https://github.com/pypa/manylinux/issues/1309
26 31           docker pull quay.io/pypa/manylinux2014_i686
27 32           docker run --platform linux/386 -v $(pwd):/pandas quay.io/pypa/manylinux2014_i686 \
28 33           /bin/bash -xc "cd pandas && \
   34 +         git config --global --add safe.directory /pandas && \
29 35           /opt/python/cp38-cp38/bin/python -m venv ~/virtualenvs/pandas-dev && \
30 36           . ~/virtualenvs/pandas-dev/bin/activate && \
31 37           python -m pip install --no-deps -U pip wheel 'setuptools<60.0.0' && \

.github/workflows/docbuild-and-upload.yml

+14-11
@@ -24,43 +24,46 @@ jobs:
       group: ${{ github.event_name == 'push' && github.run_number || github.ref }}-web-docs
       cancel-in-progress: true
 
+    defaults:
+      run:
+        shell: bash -el {0}
+
     steps:
     - name: Checkout
       uses: actions/checkout@v3
       with:
         fetch-depth: 0
 
-    - name: Set up pandas
-      uses: ./.github/actions/setup
+    - name: Set up Conda
+      uses: ./.github/actions/setup-conda
+
+    - name: Build Pandas
+      uses: ./.github/actions/build_pandas
 
     - name: Build website
-      run: |
-        source activate pandas-dev
-        python web/pandas_web.py web/pandas --target-path=web/build
+      run: python web/pandas_web.py web/pandas --target-path=web/build
 
     - name: Build documentation
-      run: |
-        source activate pandas-dev
-        doc/make.py --warnings-are-errors
+      run: doc/make.py --warnings-are-errors
 
     - name: Install ssh key
       run: |
         mkdir -m 700 -p ~/.ssh
         echo "${{ secrets.server_ssh_key }}" > ~/.ssh/id_rsa
         chmod 600 ~/.ssh/id_rsa
         echo "${{ secrets.server_ip }} ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBE1Kkopomm7FHG5enATf7SgnpICZ4W2bw+Ho+afqin+w7sMcrsa0je7sbztFAV8YchDkiBKnWTG4cRT+KZgZCaY=" > ~/.ssh/known_hosts
-      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/main'}}
+      if: github.event_name == 'push' && github.ref == 'refs/heads/main'
 
     - name: Copy cheatsheets into site directory
       run: cp doc/cheatsheet/Pandas_Cheat_Sheet* web/build/
 
     - name: Upload web
       run: rsync -az --delete --exclude='pandas-docs' --exclude='docs' web/build/ docs@${{ secrets.server_ip }}:/usr/share/nginx/pandas
-      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/main'}}
+      if: github.event_name == 'push' && github.ref == 'refs/heads/main'
 
     - name: Upload dev docs
       run: rsync -az --delete doc/build/html/ docs@${{ secrets.server_ip }}:/usr/share/nginx/pandas/pandas-docs/dev
-      if: ${{github.event_name == 'push' && github.ref == 'refs/heads/main'}}
+      if: github.event_name == 'push' && github.ref == 'refs/heads/main'
 
     - name: Move docs into site directory
       run: mv doc/build/html web/build/docs

.github/workflows/macos-windows.yml

+3-17
@@ -43,32 +43,18 @@ jobs:
         with:
           fetch-depth: 0
 
-      - name: Install Dependencies
-        uses: conda-incubator/setup-[email protected]
+      - name: Set up Conda
+        uses: ./.github/actions/setup-conda
         with:
-          mamba-version: "*"
-          channels: conda-forge
-          activate-environment: pandas-dev
-          channel-priority: ${{ matrix.os == 'macos-latest' && 'flexible' || 'strict' }}
           environment-file: ci/deps/${{ matrix.env_file }}
-          use-only-tar-bz2: true
-
-      # ImportError: 2): Library not loaded: @rpath/libssl.1.1.dylib
-      # Referenced from: /Users/runner/miniconda3/envs/pandas-dev/lib/libthrift.0.13.0.dylib
-      # Reason: image not found
-      - name: Upgrade pyarrow on MacOS
-        run: conda install -n pandas-dev -c conda-forge --no-update-deps pyarrow=6
-        if: ${{ matrix.os == 'macos-latest' }}
+          pyarrow-version: ${{ matrix.os == 'macos-latest' && '6' || '' }}
 
       - name: Build Pandas
         uses: ./.github/actions/build_pandas
 
       - name: Test
         run: ci/run_tests.sh
 
-      - name: Build Version
-        run: conda list
-
       - name: Publish test results
         uses: actions/upload-artifact@v3
         with:
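
Reviewer note: the new `pyarrow-version` input uses the `cond && a || b` idiom, which GitHub Actions expressions offer in place of a ternary operator. Python's `and`/`or` short-circuit in the same value-returning way, so the selection can be sketched as below. The function names are hypothetical and exist only for this illustration.

```python
def pyarrow_version_for(os_name: str) -> str:
    # Same shape as `matrix.os == 'macos-latest' && '6' || ''`:
    # a true condition yields "6"; a false one falls through to "".
    return (os_name == "macos-latest" and "6") or ""


def pin_step_runs(version: str) -> bool:
    # The composite action guards its pinning step with
    # `if: ${{ inputs.pyarrow-version }}`; an empty string is falsy there.
    return bool(version)


for os_name in ("macos-latest", "windows-latest"):
    version = pyarrow_version_for(os_name)
    print(os_name, repr(version), pin_step_runs(version))
```

The idiom only behaves like a ternary when the middle value is truthy. That holds here because the middle value is `'6'`, and the empty-string fallback is what disables the pinning step on other operating systems.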

.pre-commit-config.yaml

+3-3
@@ -11,7 +11,7 @@ repos:
   - id: absolufy-imports
     files: ^pandas/
 - repo: https://github.com/jendrikseipp/vulture
-  rev: 'v2.3'
+  rev: 'v2.4'
   hooks:
   - id: vulture
     entry: python scripts/run_vulture.py
@@ -60,7 +60,7 @@ repos:
   hooks:
   - id: isort
 - repo: https://github.com/asottile/pyupgrade
-  rev: v2.32.0
+  rev: v2.32.1
   hooks:
   - id: pyupgrade
     args: [--py38-plus]
@@ -75,7 +75,7 @@ repos:
     types: [text] # overwrite types: [rst]
     types_or: [python, rst]
 - repo: https://github.com/sphinx-contrib/sphinx-lint
-  rev: v0.4.1
+  rev: v0.6
   hooks:
   - id: sphinx-lint
 - repo: https://github.com/asottile/yesqa

asv_bench/asv.conf.json

+1-1
@@ -42,7 +42,7 @@
     // followed by the pip installed packages).
     "matrix": {
         "numpy": [],
-        "Cython": ["0.29.24"],
+        "Cython": ["0.29.30"],
         "matplotlib": [],
         "sqlalchemy": [],
         "scipy": [],

asv_bench/benchmarks/groupby.py

+13
@@ -7,6 +7,7 @@
 from pandas import (
     Categorical,
     DataFrame,
+    Index,
     MultiIndex,
     Series,
     Timestamp,
@@ -111,6 +112,18 @@ def time_copy_overhead_single_col(self, factor):
         self.df.groupby("key").apply(self.df_copy_function)
 
 
+class ApplyNonUniqueUnsortedIndex:
+    def setup(self):
+        # GH 46527
+        # unsorted and non-unique index
+        idx = np.arange(100)[::-1]
+        idx = Index(np.repeat(idx, 200), name="key")
+        self.df = DataFrame(np.random.randn(len(idx), 10), index=idx)
+
+    def time_groupby_apply_non_unique_unsorted_index(self):
+        self.df.groupby("key", group_keys=False).apply(lambda x: x)
+
+
 class Groups:
 
     param_names = ["key"]
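
Reviewer note: the index built in `ApplyNonUniqueUnsortedIndex.setup` can be checked without NumPy. The sketch below is a plain-Python stand-in for `np.repeat(np.arange(100)[::-1], 200)`, written only to show why the benchmark's index is both non-unique and unsorted. The variable name `keys` is hypothetical.

```python
# Each value from 99 down to 0 is repeated 200 times in a row,
# matching np.repeat applied to the reversed range.
keys = [k for k in range(99, -1, -1) for _ in range(200)]

print(len(keys))             # total number of rows in the benchmark frame
print(len(set(keys)))        # number of distinct group keys
print(keys == sorted(keys))  # whether the index is in ascending order
```

This gives 20000 rows spread over 100 distinct keys in descending order, so `groupby("key")` has to handle 100 groups of 200 rows each on an index that is not monotonically increasing.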

asv_bench/benchmarks/io/excel.py

+11
@@ -86,4 +86,15 @@ def time_read_excel(self, engine):
         read_excel(fname, engine=engine)
 
 
+class ReadExcelNRows(ReadExcel):
+    def time_read_excel(self, engine):
+        if engine == "xlrd":
+            fname = self.fname_excel_xls
+        elif engine == "odf":
+            fname = self.fname_odf
+        else:
+            fname = self.fname_excel
+        read_excel(fname, engine=engine, nrows=10)
+
+
 from ..pandas_vb_common import setup # noqa: F401 isort:skip

asv_bench/benchmarks/reindex.py

+8
@@ -30,6 +30,10 @@ def setup(self):
         self.s_subset = self.s[::2]
         self.s_subset_no_cache = self.s[::2].copy()
 
+        mi = MultiIndex.from_product([rng, range(100)])
+        self.s2 = Series(np.random.randn(len(mi)), index=mi)
+        self.s2_subset = self.s2[::2].copy()
+
     def time_reindex_dates(self):
         self.df.reindex(self.rng_subset)
 
@@ -44,6 +48,10 @@ def time_reindex_multiindex_no_cache(self):
         # Copy to avoid MultiIndex._values getting cached
         self.s.reindex(self.s_subset_no_cache.index.copy())
 
+    def time_reindex_multiindex_no_cache_dates(self):
+        # Copy to avoid MultiIndex._values getting cached
+        self.s2_subset.reindex(self.s2.index.copy())
+
 
 class ReindexMethod:
 
ci/deps/actions-38-minimum_versions.yaml

+1-1
@@ -17,7 +17,7 @@ dependencies:
 
   # required dependencies
   - python-dateutil=2.8.1
-  - numpy=1.18.5
+  - numpy=1.19.5
   - pytz=2020.1
 
   # optional dependencies

doc/source/development/contributing_documentation.rst

+5-1
@@ -12,7 +12,11 @@ you don't have to be an expert on pandas to do so! In fact,
 there are sections of the docs that are worse off after being written by
 experts. If something in the docs doesn't make sense to you, updating the
 relevant section after you figure it out is a great way to ensure it will help
-the next person.
+the next person. Please visit the `issues page <https://github.com/pandas-dev/pandas/issues?page=1&q=is%3Aopen+sort%3Aupdated-desc+label%3ADocs>`__
+for a full list of issues that are currently open regarding the
+Pandas documentation.
+
+
 
 .. contents:: Documentation:
    :local:

doc/source/getting_started/install.rst

+1-1
@@ -235,7 +235,7 @@ Dependencies
 ================================================================ ==========================
 Package                                                          Minimum supported version
 ================================================================ ==========================
-`NumPy <https://numpy.org>`__                                    1.18.5
+`NumPy <https://numpy.org>`__                                    1.19.5
 `python-dateutil <https://dateutil.readthedocs.io/en/stable/>`__ 2.8.1
 `pytz <https://pypi.org/project/pytz/>`__                        2020.1
 ================================================================ ==========================

doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst

+2-2
@@ -143,8 +143,8 @@ returned.
 
 Calculating a given statistic (e.g. ``mean`` age) *for each category in
 a column* (e.g. male/female in the ``Sex`` column) is a common pattern.
-The ``groupby`` method is used to support this type of operations. More
-general, this fits in the more general ``split-apply-combine`` pattern:
+The ``groupby`` method is used to support this type of operations. This
+fits in the more general ``split-apply-combine`` pattern:
 
 - **Split** the data into groups
 - **Apply** a function to each group independently

doc/source/reference/testing.rst

+1
@@ -42,6 +42,7 @@ Exceptions and warnings
    errors.ParserWarning
    errors.PerformanceWarning
    errors.SettingWithCopyError
+   errors.SettingWithCopyWarning
    errors.SpecificationError
    errors.UnsortedIndexError
    errors.UnsupportedFunctionCall

doc/source/whatsnew/v1.4.3.rst

+4
@@ -15,13 +15,16 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed regression in :meth:`DataFrame.replace` when the replacement value was explicitly ``None`` when passed in a dictionary to ``to_replace`` also casting other columns to object dtype even when there were no values to replace (:issue:`46634`)
+- Fixed regression when setting values with :meth:`DataFrame.loc` updating :class:`RangeIndex` when index was set as new column and column was updated afterwards (:issue:`47128`)
 - Fixed regression in :meth:`DataFrame.nsmallest` led to wrong results when ``np.nan`` in the sorting column (:issue:`46589`)
 - Fixed regression in :func:`read_fwf` raising ``ValueError`` when ``widths`` was specified with ``usecols`` (:issue:`46580`)
+- Fixed regression in :func:`concat` not sorting columns for mixed column names (:issue:`47127`)
 - Fixed regression in :meth:`.Groupby.transform` and :meth:`.Groupby.agg` failing with ``engine="numba"`` when the index was a :class:`MultiIndex` (:issue:`46867`)
 - Fixed regression is :meth:`.Styler.to_latex` and :meth:`.Styler.to_html` where ``buf`` failed in combination with ``encoding`` (:issue:`47053`)
 - Fixed regression in :func:`read_csv` with ``index_col=False`` identifying first row as index names when ``header=None`` (:issue:`46955`)
 - Fixed regression in :meth:`.DataFrameGroupBy.agg` when used with list-likes or dict-likes and ``axis=1`` that would give incorrect results; now raises ``NotImplementedError`` (:issue:`46995`)
 - Fixed regression in :meth:`DataFrame.resample` and :meth:`DataFrame.rolling` when used with list-likes or dict-likes and ``axis=1`` that would raise an unintuitive error message; now raises ``NotImplementedError`` (:issue:`46904`)
+- Fixed regression in :func:`read_excel` returning ints as floats on certain input sheets (:issue:`46988`)
 - Fixed regression in :meth:`DataFrame.shift` when ``axis`` is ``columns`` and ``fill_value`` is absent, ``freq`` is ignored (:issue:`47039`)
 
 .. ---------------------------------------------------------------------------
@@ -30,6 +33,7 @@ Fixed regressions
 
 Bug fixes
 ~~~~~~~~~
+- Bug in :meth:`pd.eval`, :meth:`DataFrame.eval` and :meth:`DataFrame.query` where passing empty ``local_dict`` or ``global_dict`` was treated as passing ``None`` (:issue:`47084`)
 - Most I/O methods do no longer suppress ``OSError`` and ``ValueError`` when closing file handles (:issue:`47136`)
 -
 
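
Reviewer note: the `local_dict`/`global_dict` entry describes an empty dict being treated like `None`. The release note does not say how the bug arose. One common source of this class of bug is a truthiness check where an identity check was intended, which the sketch below illustrates. The function names and the `default_scope` contents are hypothetical, and this is not pandas' actual implementation.

```python
def resolve_scope_buggy(local_dict=None):
    default_scope = {"x": 1}
    # Truthiness check: {} is falsy, so an explicitly passed empty dict
    # is silently replaced by the default scope.
    return local_dict or default_scope


def resolve_scope_fixed(local_dict=None):
    default_scope = {"x": 1}
    # Identity check: only a missing argument falls back to the default.
    return default_scope if local_dict is None else local_dict


print(resolve_scope_buggy({}))  # the empty dict is ignored
print(resolve_scope_fixed({}))  # the empty dict is respected
print(resolve_scope_fixed())    # None still falls back to the default
```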
