Commit f3cfe4d

Merge remote-tracking branch 'upstream/master' into bisect
2 parents 00e3ddc + 6ff2e7c commit f3cfe4d

135 files changed: +1752 -1028 lines changed

.github/workflows/ci.yml

+34 -28
@@ -2,74 +2,81 @@ name: CI

 on:
   push:
-    branches: master
+    branches: [master]
   pull_request:
     branches:
       - master
       - 1.2.x

 env:
   ENV_FILE: environment.yml
+  PANDAS_CI: 1

 jobs:
   checks:
     name: Checks
     runs-on: ubuntu-latest
-    steps:
-
-    - name: Setting conda path
-      run: echo "${HOME}/miniconda3/bin" >> $GITHUB_PATH
+    defaults:
+      run:
+        shell: bash -l {0}

+    steps:
     - name: Checkout
       uses: actions/checkout@v1

     - name: Looking for unwanted patterns
       run: ci/code_checks.sh patterns
       if: always()

-    - name: Setup environment and build pandas
-      run: ci/setup_env.sh
-      if: always()
+    - name: Cache conda
+      uses: actions/cache@v2
+      with:
+        path: ~/conda_pkgs_dir
+        key: ${{ runner.os }}-conda-${{ hashFiles('${{ env.ENV_FILE }}') }}

-    - name: Linting
+    - uses: conda-incubator/setup-miniconda@v2
+      with:
+        activate-environment: pandas-dev
+        channel-priority: strict
+        environment-file: ${{ env.ENV_FILE }}
+        use-only-tar-bz2: true
+
+    - name: Environment Detail
       run: |
-        source activate pandas-dev
-        ci/code_checks.sh lint
+        conda info
+        conda list
+
+    - name: Build Pandas
+      run: |
+        python setup.py build_ext -j 2
+        python -m pip install -e . --no-build-isolation --no-use-pep517
+
+    - name: Linting
+      run: ci/code_checks.sh lint
       if: always()

     - name: Checks on imported code
-      run: |
-        source activate pandas-dev
-        ci/code_checks.sh code
+      run: ci/code_checks.sh code
       if: always()

     - name: Running doctests
-      run: |
-        source activate pandas-dev
-        ci/code_checks.sh doctests
+      run: ci/code_checks.sh doctests
       if: always()

     - name: Docstring validation
-      run: |
-        source activate pandas-dev
-        ci/code_checks.sh docstrings
+      run: ci/code_checks.sh docstrings
       if: always()

     - name: Typing validation
-      run: |
-        source activate pandas-dev
-        ci/code_checks.sh typing
+      run: ci/code_checks.sh typing
       if: always()

     - name: Testing docstring validation script
-      run: |
-        source activate pandas-dev
-        pytest --capture=no --strict-markers scripts
+      run: pytest --capture=no --strict-markers scripts
       if: always()

     - name: Running benchmarks
       run: |
-        source activate pandas-dev
         cd asv_bench
         asv check -E existing
         git remote add upstream https://github.com/pandas-dev/pandas.git
@@ -106,7 +113,6 @@ jobs:
       run: |
         source activate pandas-dev
         python web/pandas_web.py web/pandas --target-path=web/build
-
     - name: Build documentation
       run: |
         source activate pandas-dev

.pre-commit-config.yaml

+3 -2
@@ -145,11 +145,12 @@ repos:
     language: pygrep
     types_or: [python, cython]
   - id: unwanted-typing
-    name: Check for use of comment-based annotation syntax and missing error codes
+    name: Check for outdated annotation syntax and missing error codes
     entry: |
       (?x)
       \#\ type:\ (?!ignore)|
-      \#\ type:\s?ignore(?!\[)
+      \#\ type:\s?ignore(?!\[)|
+      \)\ ->\ \"
     language: pygrep
     types: [python]
   - id: np-bool
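
Note on the `unwanted-typing` hook above: the extended pattern now has three alternatives. A minimal sketch of what each one flags, assuming the hook applies the expression to each line with Python's `re.search` (the helper name `is_flagged` and the sample lines are illustrative only, not part of the commit):

```python
import re

# The three alternatives from the hook's `entry`, compiled in verbose mode.
# In verbose mode, "\#" is a literal hash and "\ " is a literal space.
UNWANTED_TYPING = re.compile(
    r"""(?x)
    \#\ type:\ (?!ignore)|
    \#\ type:\s?ignore(?!\[)|
    \)\ ->\ \"
    """
)


def is_flagged(line: str) -> bool:
    """Return True if the pattern matches anywhere in the line (hypothetical helper)."""
    return UNWANTED_TYPING.search(line) is not None


samples = [
    "x = []  # type: List[int]",          # comment-based annotation
    "x = f()  # type: ignore",            # ignore without an error code
    "x = f()  # type: ignore[arg-type]",  # ignore with an error code
    'def f(x) -> "Foo":',                 # quoted return annotation (new alternative)
    "def f(x) -> Foo:",                   # unquoted return annotation
]

for line in samples:
    print(f"{is_flagged(line)!s:5}  {line}")
```

Only the third and fifth samples pass; the fourth is caught by the alternative added in this commit.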

doc/source/reference/style.rst

+2
@@ -39,6 +39,8 @@ Style application
    Styler.set_td_classes
    Styler.set_table_styles
    Styler.set_table_attributes
+   Styler.set_tooltips
+   Styler.set_tooltips_class
    Styler.set_caption
    Styler.set_properties
    Styler.set_uuid

doc/source/user_guide/enhancingperf.rst

+1 -1
@@ -247,7 +247,7 @@ We've gotten another big improvement. Let's check again where the time is spent:

 .. ipython:: python

-   %%prun -l 4 apply_integrate_f(df["a"].to_numpy(), df["b"].to_numpy(), df["N"].to_numpy())
+   %prun -l 4 apply_integrate_f(df["a"].to_numpy(), df["b"].to_numpy(), df["N"].to_numpy())

 As one might expect, the majority of the time is now spent in ``apply_integrate_f``,
 so if we wanted to make anymore efficiencies we must continue to concentrate our

doc/source/whatsnew/v0.8.0.rst

+1 -1
@@ -176,7 +176,7 @@ New plotting methods
 Vytautas Jancauskas, the 2012 GSOC participant, has added many new plot
 types. For example, ``'kde'`` is a new option:

-.. ipython:: python
+.. code-block:: python

    s = pd.Series(
        np.concatenate((np.random.randn(1000), np.random.randn(1000) * 0.5 + 3))

doc/source/whatsnew/v1.2.0.rst

+1 -1
@@ -751,7 +751,7 @@ Plotting

 - Bug in :meth:`DataFrame.plot` was rotating xticklabels when ``subplots=True``, even if the x-axis wasn't an irregular time series (:issue:`29460`)
 - Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes caused a ``ValueError`` (:issue:`21003`)
-- Bug in :meth:`DataFrame.plot.bar` and :meth:`Series.plot.bar` where ticks positions were assigned by value order instead of using the actual value for numeric or a smart ordering for string (:issue:`26186`, :issue:`11465`)
+- Bug in :meth:`DataFrame.plot.bar` and :meth:`Series.plot.bar` where ticks positions were assigned by value order instead of using the actual value for numeric or a smart ordering for string (:issue:`26186`, :issue:`11465`). This fix has been reverted in pandas 1.2.1, see :doc:`v1.2.1`
 - Twinned axes were losing their tick labels which should only happen to all but the last row or column of 'externally' shared axes (:issue:`33819`)
 - Bug in :meth:`Series.plot` and :meth:`DataFrame.plot` was throwing a :exc:`ValueError` when the Series or DataFrame was
   indexed by a :class:`.TimedeltaIndex` with a fixed frequency and the x-axis lower limit was greater than the upper limit (:issue:`37454`)

doc/source/whatsnew/v1.2.1.rst

+22 -18
@@ -1,6 +1,6 @@
 .. _whatsnew_121:

-What's new in 1.2.1 (January ??, 2021)
+What's new in 1.2.1 (January 18, 2021)
 --------------------------------------

 These are the changes in pandas 1.2.1. See :ref:`release` for a full changelog
@@ -14,23 +14,29 @@ including other versions of pandas.

 Fixed regressions
 ~~~~~~~~~~~~~~~~~
-- Fixed regression in :meth:`to_csv` that created corrupted zip files when there were more rows than ``chunksize`` (:issue:`38714`)
-- Fixed regression in ``groupby().rolling()`` where :class:`MultiIndex` levels were dropped (:issue:`38523`)
-- Fixed regression in repr of float-like strings of an ``object`` dtype having trailing 0's truncated after the decimal (:issue:`38708`)
-- Fixed regression in :meth:`DataFrame.groupby()` with :class:`Categorical` grouping column not showing unused categories for ``grouped.indices`` (:issue:`38642`)
-- Fixed regression in :meth:`DataFrame.any` and :meth:`DataFrame.all` not returning a result for tz-aware ``datetime64`` columns (:issue:`38723`)
-- Fixed regression in :meth:`DataFrame.__setitem__` raising ``ValueError`` when expanding :class:`DataFrame` and new column is from type ``"0 - name"`` (:issue:`39010`)
-- Fixed regression in :meth:`.GroupBy.sem` where the presence of non-numeric columns would cause an error instead of being dropped (:issue:`38774`)
-- Fixed regression in :meth:`DataFrame.loc.__setitem__` raising ``ValueError`` when :class:`DataFrame` has unsorted :class:`MultiIndex` columns and indexer is a scalar (:issue:`38601`)
-- Fixed regression in :func:`read_excel` with non-rawbyte file handles (:issue:`38788`)
-- Fixed regression in :meth:`Rolling.skew` and :meth:`Rolling.kurt` modifying the object inplace (:issue:`38908`)
+- Fixed regression in :meth:`~DataFrame.to_csv` that created corrupted zip files when there were more rows than ``chunksize`` (:issue:`38714`)
 - Fixed regression in :meth:`read_csv` and other read functions were the encoding error policy (``errors``) did not default to ``"replace"`` when no encoding was specified (:issue:`38989`)
+- Fixed regression in :func:`read_excel` with non-rawbyte file handles (:issue:`38788`)
+- Fixed regression in :meth:`DataFrame.to_stata` not removing the created file when an error occured (:issue:`39202`)
+- Fixed regression in ``DataFrame.__setitem__`` raising ``ValueError`` when expanding :class:`DataFrame` and new column is from type ``"0 - name"`` (:issue:`39010`)
+- Fixed regression in setting with :meth:`DataFrame.loc` raising ``ValueError`` when :class:`DataFrame` has unsorted :class:`MultiIndex` columns and indexer is a scalar (:issue:`38601`)
+- Fixed regression in setting with :meth:`DataFrame.loc` raising ``KeyError`` with :class:`MultiIndex` and list-like columns indexer enlarging :class:`DataFrame` (:issue:`39147`)
+- Fixed regression in :meth:`~DataFrame.groupby()` with :class:`Categorical` grouping column not showing unused categories for ``grouped.indices`` (:issue:`38642`)
+- Fixed regression in :meth:`.GroupBy.sem` where the presence of non-numeric columns would cause an error instead of being dropped (:issue:`38774`)
+- Fixed regression in :meth:`.DataFrameGroupBy.diff` raising for ``int8`` and ``int16`` columns (:issue:`39050`)
+- Fixed regression in :meth:`DataFrame.groupby` when aggregating an ``ExtensionDType`` that could fail for non-numeric values (:issue:`38980`)
+- Fixed regression in :meth:`.Rolling.skew` and :meth:`.Rolling.kurt` modifying the object inplace (:issue:`38908`)
+- Fixed regression in :meth:`DataFrame.any` and :meth:`DataFrame.all` not returning a result for tz-aware ``datetime64`` columns (:issue:`38723`)
+- Fixed regression in :meth:`DataFrame.apply` with ``axis=1`` using str accessor in apply function (:issue:`38979`)
 - Fixed regression in :meth:`DataFrame.replace` raising ``ValueError`` when :class:`DataFrame` has dtype ``bytes`` (:issue:`38900`)
-- Fixed regression in :meth:`DataFrameGroupBy.diff` raising for ``int8`` and ``int16`` columns (:issue:`39050`)
+- Fixed regression in :meth:`Series.fillna` that raised ``RecursionError`` with ``datetime64[ns, UTC]`` dtype (:issue:`38851`)
+- Fixed regression in comparisons between ``NaT`` and ``datetime.date`` objects incorrectly returning ``True`` (:issue:`39151`)
+- Fixed regression in repr of float-like strings of an ``object`` dtype having trailing 0's truncated after the decimal (:issue:`38708`)
 - Fixed regression that raised ``AttributeError`` with PyArrow versions [0.16.0, 1.0.0) (:issue:`38801`)
-- Fixed regression in :meth:`DataFrame.groupby` when aggregating an :class:`ExtensionDType` that could fail for non-numeric values (:issue:`38980`)
--
--
+- Fixed regression in :func:`pandas.testing.assert_frame_equal` raising ``TypeError`` with ``check_like=True`` when :class:`Index` or columns have mixed dtype (:issue:`39168`)
+
+We have reverted a commit that resulted in several plotting related regressions in pandas 1.2.0 (:issue:`38969`, :issue:`38736`, :issue:`38865`, :issue:`38947` and :issue:`39126`).
+As a result, bugs reported as fixed in pandas 1.2.0 related to inconsistent tick labeling in bar plots are again present (:issue:`26186` and :issue:`11465`)

 .. ---------------------------------------------------------------------------

@@ -41,7 +47,7 @@ Bug fixes

 - Bug in :meth:`read_csv` with ``float_precision="high"`` caused segfault or wrong parsing of long exponent strings. This resulted in a regression in some cases as the default for ``float_precision`` was changed in pandas 1.2.0 (:issue:`38753`)
 - Bug in :func:`read_csv` not closing an opened file handle when a ``csv.Error`` or ``UnicodeDecodeError`` occurred while initializing (:issue:`39024`)
--
+- Bug in :func:`pandas.testing.assert_index_equal` raising ``TypeError`` with ``check_order=False`` when :class:`Index` has mixed dtype (:issue:`39168`)

 .. ---------------------------------------------------------------------------

@@ -55,8 +61,6 @@ Other
 - Bumped minimum pymysql version to 0.8.1 to avoid test failures (:issue:`38344`)
 - Fixed build failure on MacOS 11 in Python 3.9.1 (:issue:`38766`)
 - Added reference to backwards incompatible ``check_freq`` arg of :func:`testing.assert_frame_equal` and :func:`testing.assert_series_equal` in :ref:`pandas 1.1.0 whats new <whatsnew_110.api_breaking.testing.check_freq>` (:issue:`34050`)
--
--

 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.3.0.rst

+37 -1
@@ -50,7 +50,9 @@ Other enhancements
 - :func:`pandas.read_excel` can now auto detect .xlsb files (:issue:`35416`)
 - :meth:`.Rolling.sum`, :meth:`.Expanding.sum`, :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.Rolling.median`, :meth:`.Expanding.median`, :meth:`.Rolling.max`, :meth:`.Expanding.max`, :meth:`.Rolling.min`, and :meth:`.Expanding.min` now support ``Numba`` execution with the ``engine`` keyword (:issue:`38895`)
 - :meth:`DataFrame.apply` can now accept NumPy unary operators as strings, e.g. ``df.apply("sqrt")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
-- :meth:`DataFrame.apply` can now accept non-callable :class:`DataFrame` properties as strings, e.g. ``df.apply("size")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
+- :meth:`DataFrame.apply` can now accept non-callable DataFrame properties as strings, e.g. ``df.apply("size")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
+- :meth:`Series.apply` can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ``ser.apply(np.array(["sum", "mean"]))``, which was already the case for :meth:`DataFrame.apply` (:issue:`39140`)
+- :meth:`.Styler.set_tooltips` allows on hover tooltips to be added to styled HTML dataframes.

 .. ---------------------------------------------------------------------------

@@ -62,6 +64,36 @@ Notable bug fixes
 These are bug fixes that might have notable behavior changes.


+Preserve dtypes in :meth:`~pandas.DataFrame.combine_first`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:meth:`~pandas.DataFrame.combine_first` will now preserve dtypes (:issue:`7509`)
+
+.. ipython:: python
+
+   df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
+   df1
+   df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
+   df2
+   combined = df1.combine_first(df2)
+
+*pandas 1.2.x*
+
+.. code-block:: ipython
+
+   In [1]: combined.dtypes
+   Out[2]:
+   A    float64
+   B    float64
+   C    float64
+   dtype: object
+
+*pandas 1.3.0*
+
+.. ipython:: python
+
+   combined.dtypes
+

 .. _whatsnew_130.api_breaking.deps:

@@ -186,6 +218,8 @@ Categorical
 - Bug in ``CategoricalIndex.reindex`` failed when ``Index`` passed with elements all in category (:issue:`28690`)
 - Bug where constructing a :class:`Categorical` from an object-dtype array of ``date`` objects did not round-trip correctly with ``astype`` (:issue:`38552`)
 - Bug in constructing a :class:`DataFrame` from an ``ndarray`` and a :class:`CategoricalDtype` (:issue:`38857`)
+- Bug in :meth:`DataFrame.reindex` was throwing ``IndexError`` when new index contained duplicates and old index was :class:`CategoricalIndex` (:issue:`38906`)
+- Bug in setting categorical values into an object-dtype column in a :class:`DataFrame` (:issue:`39136`)
 - Bug in :meth:`DataFrame.reindex` was raising ``IndexError`` when new index contained duplicates and old index was :class:`CategoricalIndex` (:issue:`38906`)

 Datetimelike
@@ -246,6 +280,7 @@ Indexing
 - Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
 - Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
 - Bug in setting ``timedelta64`` values into numeric :class:`Series` failing to cast to object dtype (:issue:`39086`)
+- Bug in setting :class:`Interval` values into a :class:`Series` or :class:`DataFrame` with mismatched :class:`IntervalDtype` incorrectly casting the new values to the existing dtype (:issue:`39120`)

 Missing
 ^^^^^^^
@@ -281,6 +316,7 @@ I/O
 - :func:`read_excel` now respects :func:`set_option` (:issue:`34252`)
 - Bug in :func:`read_csv` not switching ``true_values`` and ``false_values`` for nullable ``boolean`` dtype (:issue:`34655`)
 - Bug in :func:`read_json` when ``orient="split"`` does not maintain numeric string index (:issue:`28556`)
+- :meth:`read_sql` returned an empty generator if ``chunksize`` was no-zero and the query returned no results. Now returns a generator with a single empty dataframe (:issue:`34411`)

 Period
 ^^^^^^

environment.yml

+1 -1
@@ -68,7 +68,7 @@ dependencies:

   # unused (required indirectly may be?)
   - ipywidgets
-  - nbformat
+  - nbformat=5.0.8
   - notebook>=5.7.5
   - pip

pandas/_libs/tslibs/nattype.pyx

+18
@@ -1,4 +1,7 @@
+import warnings
+
 from cpython.datetime cimport (
+    PyDate_Check,
     PyDateTime_Check,
     PyDateTime_IMPORT,
     PyDelta_Check,
@@ -125,6 +128,21 @@ cdef class _NaT(datetime):
                 return NotImplemented
             return result

+        elif PyDate_Check(other):
+            # GH#39151 don't defer to datetime.date object
+            if op == Py_EQ:
+                return False
+            if op == Py_NE:
+                return True
+            warnings.warn(
+                "Comparison of NaT with datetime.date is deprecated in "
+                "order to match the standard library behavior. "
+                "In a future version these will be considered non-comparable.",
+                FutureWarning,
+                stacklevel=1,
+            )
+            return False
+
         return NotImplemented

     def __add__(self, other):
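
Note on the new `PyDate_Check` branch above: it can be read as a small decision table. The sketch below is a pure-Python model of that branch only, not the Cython method itself; the function name `nat_compare_with_date` is hypothetical, and `operator` functions stand in for the `Py_EQ` / `Py_NE` op codes.

```python
import operator
import warnings
from datetime import date, datetime


def nat_compare_with_date(op, other):
    """Model the result of comparing NaT with a plain datetime.date (sketch).

    `op` is a function from the `operator` module standing in for the
    rich-comparison op code used by the Cython implementation.
    """
    # Only plain dates are modeled here; datetime objects (a date subclass)
    # are handled by an earlier branch in the real method.
    if not isinstance(other, date) or isinstance(other, datetime):
        return NotImplemented
    if op is operator.eq:
        return False  # NaT == date is always False
    if op is operator.ne:
        return True   # NaT != date is always True
    # Ordering comparisons still return False, but now warn first.
    warnings.warn(
        "Comparison of NaT with datetime.date is deprecated in "
        "order to match the standard library behavior. "
        "In a future version these will be considered non-comparable.",
        FutureWarning,
        stacklevel=1,
    )
    return False


day = date(2021, 1, 18)
print(nat_compare_with_date(operator.eq, day))
print(nat_compare_with_date(operator.ne, day))
```

Equality and inequality stay silent, while `<`, `<=`, `>` and `>=` emit a `FutureWarning` before returning `False`.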

pandas/_testing/__init__.py

+8
@@ -977,3 +977,11 @@ def loc(x):

 def iloc(x):
     return x.iloc
+
+
+def at(x):
+    return x.at
+
+
+def iat(x):
+    return x.iat

pandas/_testing/asserters.py

+3 -3
@@ -29,7 +29,7 @@
     Series,
     TimedeltaIndex,
 )
-from pandas.core.algorithms import take_1d
+from pandas.core.algorithms import safe_sort, take_1d
 from pandas.core.arrays import (
     DatetimeArray,
     ExtensionArray,
@@ -344,8 +344,8 @@ def _get_ilevel_values(index, level):

     # If order doesn't matter then sort the index entries
     if not check_order:
-        left = left.sort_values()
-        right = right.sort_values()
+        left = Index(safe_sort(left))
+        right = Index(safe_sort(right))

     # MultiIndex special comparison for little-friendly error messages
     if left.nlevels > 1:
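
Note on the `sort_values` to `safe_sort` change above: the whatsnew entry for :issue:`39168` ties it to a `TypeError` on mixed-dtype indexes. The sketch below shows the underlying problem with plain Python lists and one way to get a total order over mixed values. `sort_mixed` is a simplified, hypothetical stand-in and not pandas' `safe_sort`; it assumes the data contains only numbers and strings.

```python
def sort_mixed(values):
    """Sort numbers before strings so mixed int/str data has a total order.

    Simplified illustration only; assumes every element is a number or a str.
    """
    # The key is (is_string, value): the first element separates the two
    # groups, so an int is never compared directly with a str.
    return sorted(values, key=lambda v: (isinstance(v, str), v))


mixed = [2, "b", 1, "a"]

# A plain sort cannot order ints against strs.
try:
    sorted(mixed)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

print(plain_sort_failed)
print(sort_mixed(mixed))
```

With both sides put into a deterministic order this way, an order-insensitive comparison can proceed even when an index mixes integers and strings.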
