Commit 4fc9217

Merge remote-tracking branch 'upstream/master' into bisect
2 parents e16418a + dbdc55c commit 4fc9217

229 files changed: +3815 -2226 lines changed

Dockerfile

+8 -7
@@ -1,4 +1,4 @@
-FROM continuumio/miniconda3
+FROM quay.io/condaforge/miniforge3

 # if you forked pandas, you can pass in your own GitHub username to use your fork
 # i.e. gh_username=myname
@@ -15,10 +15,6 @@ RUN apt-get update \
     # Verify git, process tools, lsb-release (common in install instructions for CLIs) installed
     && apt-get -y install git iproute2 procps iproute2 lsb-release \
     #
-    # Install C compilers (gcc not enough, so just went with build-essential which admittedly might be overkill),
-    # needed to build pandas C extensions
-    && apt-get -y install build-essential \
-    #
     # cleanup
     && apt-get autoremove -y \
     && apt-get clean -y \
@@ -39,9 +35,14 @@ RUN mkdir "$pandas_home" \
 # we just update the base/root one from the 'environment.yml' file instead of creating a new one.
 #
 # Set up environment
-RUN conda env update -n base -f "$pandas_home/environment.yml"
+RUN conda install -y mamba
+RUN mamba env update -n base -f "$pandas_home/environment.yml"

 # Build C extensions and pandas
-RUN cd "$pandas_home" \
+SHELL ["/bin/bash", "-c"]
+RUN . /opt/conda/etc/profile.d/conda.sh \
+    && conda activate base \
+    && cd "$pandas_home" \
+    && export \
     && python setup.py build_ext -j 4 \
     && python -m pip install -e .

README.md

+19 -19
@@ -63,24 +63,24 @@ Here are just a few of the things that pandas does well:
     date shifting and lagging


-[missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data
-[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
-[alignment]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html?highlight=alignment#intro-to-data-structures
-[groupby]: https://pandas.pydata.org/pandas-docs/stable/groupby.html#group-by-split-apply-combine
-[conversion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe
-[slicing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
-[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#advanced-indexing-with-ix
-[subsetting]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
-[merging]: https://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
-[joining]: https://pandas.pydata.org/pandas-docs/stable/merging.html#joining-on-index
-[reshape]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables
-[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
-[mi]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#hierarchical-indexing-multiindex
-[flat-files]: https://pandas.pydata.org/pandas-docs/stable/io.html#csv-text-files
-[excel]: https://pandas.pydata.org/pandas-docs/stable/io.html#excel-files
-[db]: https://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
-[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables
-[timeseries]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-series-date-functionality
+[missing-data]: https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html
+[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#column-selection-addition-deletion
+[alignment]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html?highlight=alignment#intro-to-data-structures
+[groupby]: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#group-by-split-apply-combine
+[conversion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#dataframe
+[slicing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges
+[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced
+[subsetting]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
+[merging]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#database-style-dataframe-or-named-series-joining-merging
+[joining]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#joining-on-index
+[reshape]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
+[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
+[mi]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#hierarchical-indexing-multiindex
+[flat-files]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#csv-text-files
+[excel]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#excel-files
+[db]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#sql-queries
+[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#hdf5-pytables
+[timeseries]: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#time-series-date-functionality

 ## Where to get it
 The source code is currently hosted on GitHub at:
@@ -154,7 +154,7 @@ For usage questions, the best place to go to is [StackOverflow](https://stackove
 Further, general questions and discussions can also take place on the [pydata mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata).

 ## Discussion and Development
-Most development discussions take place on github in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.
+Most development discussions take place on GitHub in this repo. Further, the [pandas-dev mailing list](https://mail.python.org/mailman/listinfo/pandas-dev) can also be used for specialized discussions or design issues, and a [Gitter channel](https://gitter.im/pydata/pandas) is available for quick development related questions.

 ## Contributing to pandas [![Open Source Helpers](https://www.codetriage.com/pandas-dev/pandas/badges/users.svg)](https://www.codetriage.com/pandas-dev/pandas)

asv_bench/benchmarks/indexing.py

+8
@@ -358,6 +358,14 @@ def time_assign_with_setitem(self):
         for i in range(100):
             self.df[i] = np.random.randn(self.N)

+    def time_assign_list_like_with_setitem(self):
+        np.random.seed(1234)
+        self.df[list(range(100))] = np.random.randn(self.N, 100)
+
+    def time_assign_list_of_columns_concat(self):
+        df = DataFrame(np.random.randn(self.N, 100))
+        concat([self.df, df], axis=1)
+

 class ChainIndexing:
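
Note on the two benchmarks added above: they time two ways of attaching a block of new columns to a frame, a list-like `__setitem__` and a `concat` along `axis=1`. The following is a minimal sketch, not part of the commit, showing that the two produce the same data. The frame size, the three starting columns, and the variable names are illustrative assumptions, and list-like assignment of new columns assumes pandas 1.2 or later.

```python
import numpy as np
import pandas as pd

N = 1_000       # hypothetical row count, not the benchmark's self.N
N_NEW = 10      # hypothetical number of new columns (the benchmark uses 100)

rng = np.random.default_rng(1234)
base = pd.DataFrame(rng.standard_normal((N, 3)), columns=["a", "b", "c"])
new_values = rng.standard_normal((N, N_NEW))

# Variant 1: assign all new columns at once through a list-like key.
via_setitem = base.copy()
via_setitem[list(range(N_NEW))] = new_values

# Variant 2: build a second frame and concatenate along the columns.
via_concat = pd.concat([base, pd.DataFrame(new_values)], axis=1)

# Both variants end up with the same labels and the same values.
same_labels = list(via_setitem.columns) == list(via_concat.columns)
same_values = np.array_equal(via_setitem.to_numpy(), via_concat.to_numpy())
print(same_labels, same_values)
```

Because the results match, the benchmarks differ only in the code path they exercise, which is what makes their timings comparable.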

asv_bench/benchmarks/rolling.py

+14
@@ -225,6 +225,20 @@ def time_rolling_offset(self, method):
         getattr(self.groupby_roll_offset, method)()


+class GroupbyLargeGroups:
+    # https://github.com/pandas-dev/pandas/issues/38038
+    # specific example where the rolling operation on a larger dataframe
+    # is relatively cheap (few but large groups), but creation of
+    # MultiIndex of result can be expensive
+
+    def setup(self):
+        N = 100000
+        self.df = pd.DataFrame({"A": [1, 2] * int(N / 2), "B": np.random.randn(N)})
+
+    def time_rolling_multiindex_creation(self):
+        self.df.groupby("A").rolling(3).mean()
+
+
 class GroupbyEWM:

     params = ["cython", "numba"]
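
Note on `GroupbyLargeGroups`: the benchmark comment says the rolling computation is cheap while building the result's `MultiIndex` can be expensive. The sketch below is not part of the commit. It uses a deliberately tiny frame (the size and the deterministic `B` values are assumptions chosen so the output is easy to check) to show the index shape that the benchmark exercises: one level for the group key and one for the original row label.

```python
import numpy as np
import pandas as pd

N = 10  # hypothetical, far smaller than the benchmark's 100000
df = pd.DataFrame({"A": [1, 2] * (N // 2), "B": np.arange(N, dtype=float)})

result = df.groupby("A").rolling(3).mean()

# The result is indexed by (group key, original row label).
print(type(result.index).__name__)  # MultiIndex
print(result.index.nlevels)         # 2

# Rolling means for group 1, whose B values are 0, 2, 4, 6, 8.
group_one = result["B"].loc[1].dropna().tolist()
print(group_one)                    # [2.0, 4.0, 6.0]
```

With only two groups, each covering half of the rows, the per-group rolling work stays small, so the benchmark's run time is dominated by assembling this two-level index.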

ci/deps/azure-38-numpydev.yaml

+1 -1
@@ -12,7 +12,7 @@ dependencies:

   # pandas dependencies
   - pytz
-  - pip
+  - pip=20.2
   - pip:
     - cython==0.29.21 # GH#34014
     - "git+git://github.com/dateutil/dateutil.git"

ci/setup_env.sh

+6
@@ -108,6 +108,12 @@ fi
 echo "activate pandas-dev"
 source activate pandas-dev

+# Explicitly set an environment variable indicating that this is pandas' CI environment.
+#
+# This allows us to enable things like -Werror that shouldn't be activated in
+# downstream CI jobs that may also build pandas from source.
+export PANDAS_CI=1
+
 echo
 echo "remove any installed pandas package"
 echo "w/o removing anything else"
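
Note on `PANDAS_CI`: the consumer of this variable is not visible in this part of the diff. As a rough sketch of how a build script could gate `-Werror` on it, assuming a hypothetical helper named `compile_flags` and a hypothetical base flag list (neither is taken from the commit):

```python
import os


def compile_flags(environ=None):
    """Return compiler flags, adding -Werror only when PANDAS_CI is "1".

    `compile_flags` and the base flag list are illustrative assumptions;
    only the PANDAS_CI variable name comes from the diff above.
    """
    env = os.environ if environ is None else environ
    flags = ["-O2"]  # hypothetical base flags
    if env.get("PANDAS_CI", "0") == "1":
        # Treat warnings as errors only inside pandas' own CI.
        flags.append("-Werror")
    return flags


print(compile_flags({"PANDAS_CI": "1"}))  # ['-O2', '-Werror']
print(compile_flags({}))                  # ['-O2']
```

Defaulting to "off" when the variable is unset matches the intent stated in the script comment: downstream projects that build pandas from source never see `-Werror` unless they opt in.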

doc/source/development/contributing.rst

+3 -2
@@ -147,8 +147,9 @@ Creating a development environment

 To test out code changes, you'll need to build pandas from source, which
 requires a C/C++ compiler and Python environment. If you're making documentation
-changes, you can skip to :ref:`contributing.documentation` but you won't be able
-to build the documentation locally before pushing your changes.
+changes, you can skip to :ref:`contributing.documentation` but if you skip
+creating the development environment you won't be able to build the documentation
+locally before pushing your changes.

 Using a Docker container
 ~~~~~~~~~~~~~~~~~~~~~~~~

doc/source/reference/series.rst

-1
@@ -252,7 +252,6 @@ Combining / comparing / joining / merging

    Series.append
    Series.compare
-   Series.replace
    Series.update

 Time Series-related

doc/source/whatsnew/v1.1.5.rst

+13
@@ -19,9 +19,14 @@ Fixed regressions
 - Fixed regression in :meth:`DataFrame.loc` and :meth:`Series.loc` for ``__setitem__`` when one-dimensional tuple was given to select from :class:`MultiIndex` (:issue:`37711`)
 - Fixed regression in inplace operations on :class:`Series` with ``ExtensionDtype`` with NumPy dtyped operand (:issue:`37910`)
 - Fixed regression in metadata propagation for ``groupby`` iterator (:issue:`37343`)
+- Fixed regression in :class:`MultiIndex` constructed from a :class:`DatetimeIndex` not retaining frequency (:issue:`35563`)
+- Fixed regression in :meth:`DataFrame.unstack` with columns with integer dtype (:issue:`37115`)
 - Fixed regression in indexing on a :class:`Series` with ``CategoricalDtype`` after unpickling (:issue:`37631`)
+- Fixed regression in :meth:`DataFrame.groupby` aggregation with out-of-bounds datetime objects in an object-dtype column (:issue:`36003`)
 - Fixed regression in ``df.groupby(..).rolling(..)`` with the resulting :class:`MultiIndex` when grouping by a label that is in the index (:issue:`37641`)
 - Fixed regression in :meth:`DataFrame.fillna` not filling ``NaN`` after other operations such as :meth:`DataFrame.pivot` (:issue:`36495`).
+- Fixed performance regression in ``df.groupby(..).rolling(..)`` (:issue:`38038`)
+- Fixed regression in :meth:`MultiIndex.intersection` returning duplicates when at least one of the indexes had duplicates (:issue:`36915`)

 .. ---------------------------------------------------------------------------

@@ -33,6 +38,14 @@ Bug fixes

 .. ---------------------------------------------------------------------------

+.. _whatsnew_115.other:
+
+Other
+~~~~~
+- Only set ``-Werror`` as a compiler flag in the CI jobs (:issue:`33315`, :issue:`33314`)
+
+.. ---------------------------------------------------------------------------
+
 .. _whatsnew_115.contributors:

 Contributors
