Commit f67f6bb

Merge remote-tracking branch 'upstream/master' into styler_column_style_enh
# Conflicts:
#	doc/source/whatsnew/v1.2.0.rst
#	pandas/tests/io/formats/test_style.py
2 parents a042654 + fa92ece commit f67f6bb

102 files changed (+1361, -591 lines changed)

ci/code_checks.sh

+5
Original file line number | Diff line number | Diff line change
@@ -230,6 +230,11 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
230230
invgrep -R --include="*.py" -P '# type: (?!ignore)' pandas
231231
RET=$(($RET + $?)) ; echo $MSG "DONE"
232232

233+
# https://github.com/python/mypy/issues/7384
234+
# MSG='Check for missing error codes with # type: ignore' ; echo $MSG
235+
# invgrep -R --include="*.py" -P '# type: ignore(?!\[)' pandas
236+
# RET=$(($RET + $?)) ; echo $MSG "DONE"
237+
233238
MSG='Check for use of foo.__class__ instead of type(foo)' ; echo $MSG
234239
invgrep -R --include=*.{py,pyx} '\.__class__' pandas
235240
RET=$(($RET + $?)) ; echo $MSG "DONE"
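
The commented-out check above relies on a negative lookahead to flag ``# type: ignore`` comments that carry no error code. The following is a minimal sketch of that same pattern using Python's ``re`` module instead of ``invgrep``; the sample lines and the names ``BARE_IGNORE``, ``sample_lines``, and ``flagged`` are illustrative assumptions, not part of the commit.

```python
import re

# Same pattern as the commented-out invgrep call: "# type: ignore" that is
# NOT immediately followed by "[" (i.e. no mypy error code is given).
BARE_IGNORE = re.compile(r"# type: ignore(?!\[)")

# Hypothetical source lines, used only to illustrate the pattern.
sample_lines = [
    "x = f()  # type: ignore",
    "x = f()  # type: ignore[attr-defined]",
    "x = f()  # a plain comment",
]

# Keep only the lines that the check would report.
flagged = [line for line in sample_lines if BARE_IGNORE.search(line)]
print(flagged)
```

Only the first sample line is reported, which is the behavior the ``pandas/_config/config.py`` change in this commit anticipates by adding ``[attr-defined]`` to its ignore comment.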

doc/source/ecosystem.rst

+7
@@ -80,6 +80,11 @@ ML pipeline.
8080

8181
Featuretools is a Python library for automated feature engineering built on top of pandas. It excels at transforming temporal and relational datasets into feature matrices for machine learning using reusable feature engineering "primitives". Users can contribute their own primitives in Python and share them with the rest of the community.
8282

83+
`Compose <https://github.com/FeatureLabs/compose>`__
84+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
85+
86+
Compose is a machine learning tool for labeling data and prediction engineering. It allows you to structure the labeling process by parameterizing prediction problems and transforming time-driven relational data into target values with cutoff times that can be used for supervised learning.
87+
8388
.. _ecosystem.visualization:
8489

8590
Visualization
@@ -445,6 +450,7 @@ Library Accessor Classes Description
445450
`pdvega`_ ``vgplot`` ``Series``, ``DataFrame`` Provides plotting functions from the Altair_ library.
446451
`pandas_path`_ ``path`` ``Index``, ``Series`` Provides `pathlib.Path`_ functions for Series.
447452
`pint-pandas`_ ``pint`` ``Series``, ``DataFrame`` Provides units support for numeric Series and DataFrames.
453+
`composeml`_ ``slice`` ``DataFrame`` Provides a generator for enhanced data slicing.
448454
=============== ========== ========================= ===============================================================
449455

450456
.. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest
@@ -453,3 +459,4 @@ Library Accessor Classes Description
453459
.. _pandas_path: https://github.com/drivendataorg/pandas-path/
454460
.. _pathlib.Path: https://docs.python.org/3/library/pathlib.html
455461
.. _pint-pandas: https://github.com/hgrecco/pint-pandas
462+
.. _composeml: https://github.com/FeatureLabs/compose

doc/source/user_guide/io.rst

+22-4
@@ -1064,6 +1064,23 @@ DD/MM/YYYY instead. For convenience, a ``dayfirst`` keyword is provided:
10641064
pd.read_csv('tmp.csv', parse_dates=[0])
10651065
pd.read_csv('tmp.csv', dayfirst=True, parse_dates=[0])
10661066
1067+
Writing CSVs to binary file objects
1068+
+++++++++++++++++++++++++++++++++++
1069+
1070+
.. versionadded:: 1.2.0
1071+
1072+
``df.to_csv(..., mode="w+b")`` allows writing a CSV to a file object
1073+
opened in binary mode. For this to work, it is necessary that ``mode``
1074+
contains a "b":
1075+
1076+
.. ipython:: python
1077+
1078+
import io
1079+
1080+
data = pd.DataFrame([0, 1, 2])
1081+
buffer = io.BytesIO()
1082+
data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")
1083+
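
As a hedged illustration of the section added above, the sketch below writes the same frame to an in-memory binary buffer and then decompresses the bytes to confirm that a gzip-compressed CSV was produced. It assumes pandas 1.2.0 or later is installed; the read-back step with ``gzip.decompress`` and the variable name ``text`` are additions for illustration and are not part of the commit.

```python
import gzip
import io

import pandas as pd

data = pd.DataFrame([0, 1, 2])
buffer = io.BytesIO()

# "b" in mode tells to_csv that the handle is binary; the CSV text is
# encoded as UTF-8 and gzip-compressed before it reaches the buffer.
data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")

# Illustrative check (assumption, not in the commit): decompress the raw
# bytes and decode them to recover the CSV text.
text = gzip.decompress(buffer.getvalue()).decode("utf-8")
print(text)
```

The buffer is supplied by the caller, so it remains open after ``to_csv`` returns and its contents can be inspected with ``getvalue()``.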
10671084
.. _io.float_precision:
10681085

10691086
Specifying method for floating-point conversion
@@ -3441,10 +3458,11 @@ for some advanced strategies
34413458

34423459
.. warning::
34433460

3444-
pandas requires ``PyTables`` >= 3.0.0.
3445-
There is a indexing bug in ``PyTables`` < 3.2 which may appear when querying stores using an index.
3446-
If you see a subset of results being returned, upgrade to ``PyTables`` >= 3.2.
3447-
Stores created previously will need to be rewritten using the updated version.
3461+
Pandas uses PyTables for reading and writing HDF5 files, which allows
3462+
serializing object-dtype data with pickle. Loading pickled data received from
3463+
untrusted sources can be unsafe.
3464+
3465+
See: https://docs.python.org/3/library/pickle.html for more.
34483466

34493467
.. ipython:: python
34503468
:suppress:

doc/source/whatsnew/v0.22.0.rst

+3-3
@@ -1,7 +1,7 @@
11
.. _whatsnew_0220:
22

3-
v0.22.0 (December 29, 2017)
4-
---------------------------
3+
Version 0.22.0 (December 29, 2017)
4+
----------------------------------
55

66
{{ header }}
77

@@ -96,7 +96,7 @@ returning ``1`` instead.
9696
These changes affect :meth:`DataFrame.sum` and :meth:`DataFrame.prod` as well.
9797
Finally, a few less obvious places in pandas are affected by this change.
9898

99-
Grouping by a categorical
99+
Grouping by a Categorical
100100
^^^^^^^^^^^^^^^^^^^^^^^^^
101101

102102
Grouping by a ``Categorical`` and summing now returns ``0`` instead of

doc/source/whatsnew/v0.23.0.rst

+11-11
@@ -86,8 +86,8 @@ Please note that the string `index` is not supported with the round trip format,
8686
.. _whatsnew_0230.enhancements.assign_dependent:
8787

8888

89-
``.assign()`` accepts dependent arguments
90-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
89+
Method ``.assign()`` accepts dependent arguments
90+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9191

9292
The :func:`DataFrame.assign` now accepts dependent keyword arguments for python version later than 3.6 (see also `PEP 468
9393
<https://www.python.org/dev/peps/pep-0468/>`_). Later keyword arguments may now refer to earlier ones if the argument is a callable. See the
@@ -244,7 +244,7 @@ documentation. If you build an extension array, publicize it on our
244244

245245
.. _whatsnew_0230.enhancements.categorical_grouping:
246246

247-
New ``observed`` keyword for excluding unobserved categories in ``groupby``
247+
New ``observed`` keyword for excluding unobserved categories in ``GroupBy``
248248
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
249249

250250
Grouping by a categorical includes the unobserved categories in the output.
@@ -360,8 +360,8 @@ Fill all consecutive outside values in both directions
360360
361361
.. _whatsnew_0210.enhancements.get_dummies_dtype:
362362

363-
``get_dummies`` now supports ``dtype`` argument
364-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
363+
Function ``get_dummies`` now supports ``dtype`` argument
364+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
365365

366366
The :func:`get_dummies` now accepts a ``dtype`` argument, which specifies a dtype for the new columns. The default remains uint8. (:issue:`18330`)
367367

@@ -388,8 +388,8 @@ See the :ref:`documentation here <timedeltas.mod_divmod>`. (:issue:`19365`)
388388
389389
.. _whatsnew_0230.enhancements.ran_inf:
390390

391-
``.rank()`` handles ``inf`` values when ``NaN`` are present
392-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
391+
Method ``.rank()`` handles ``inf`` values when ``NaN`` are present
392+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
393393

394394
In previous versions, ``.rank()`` would assign ``inf`` elements ``NaN`` as their ranks. Now ranks are calculated properly. (:issue:`6945`)
395395

@@ -587,7 +587,7 @@ If installed, we now require:
587587

588588
.. _whatsnew_0230.api_breaking.dict_insertion_order:
589589

590-
Instantiation from dicts preserves dict insertion order for python 3.6+
590+
Instantiation from dicts preserves dict insertion order for Python 3.6+
591591
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
592592

593593
Until Python 3.6, dicts in Python had no formally defined ordering. For Python
@@ -1365,8 +1365,8 @@ MultiIndex
13651365
- Bug in indexing where nested indexers having only numpy arrays are handled incorrectly (:issue:`19686`)
13661366

13671367

1368-
I/O
1369-
^^^
1368+
IO
1369+
^^
13701370

13711371
- :func:`read_html` now rewinds seekable IO objects after parse failure, before attempting to parse with a new parser. If a parser errors and the object is non-seekable, an informative error is raised suggesting the use of a different parser (:issue:`17975`)
13721372
- :meth:`DataFrame.to_html` now has an option to add an id to the leading `<table>` tag (:issue:`8496`)
@@ -1403,7 +1403,7 @@ Plotting
14031403
- :func:`DataFrame.plot` now supports multiple columns to the ``y`` argument (:issue:`19699`)
14041404

14051405

1406-
Groupby/resample/rolling
1406+
GroupBy/resample/rolling
14071407
^^^^^^^^^^^^^^^^^^^^^^^^
14081408

14091409
- Bug when grouping by a single column and aggregating with a class like ``list`` or ``tuple`` (:issue:`18079`)

doc/source/whatsnew/v0.24.0.rst

+6-6
@@ -277,8 +277,8 @@ For earlier versions this can be done using the following.
277277
278278
.. _whatsnew_0240.enhancements.read_html:
279279

280-
``read_html`` Enhancements
281-
^^^^^^^^^^^^^^^^^^^^^^^^^^
280+
Function ``read_html`` enhancements
281+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
282282

283283
:func:`read_html` previously ignored ``colspan`` and ``rowspan`` attributes.
284284
Now it understands them, treating them as sequences of cells with the same
@@ -1371,7 +1371,7 @@ the object's ``freq`` attribute (:issue:`21939`, :issue:`23878`).
13711371
13721372
.. _whatsnew_0240.deprecations.integer_tz:
13731373

1374-
Passing integer data and a timezone to datetimeindex
1374+
Passing integer data and a timezone to DatetimeIndex
13751375
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13761376

13771377
The behavior of :class:`DatetimeIndex` when passed integer data and
@@ -1769,8 +1769,8 @@ MultiIndex
17691769
- :class:`MultiIndex` has gained the :meth:`MultiIndex.from_frame`, it allows constructing a :class:`MultiIndex` object from a :class:`DataFrame` (:issue:`22420`)
17701770
- Fix ``TypeError`` in Python 3 when creating :class:`MultiIndex` in which some levels have mixed types, e.g. when some labels are tuples (:issue:`15457`)
17711771

1772-
I/O
1773-
^^^
1772+
IO
1773+
^^
17741774

17751775
- Bug in :func:`read_csv` in which a column specified with ``CategoricalDtype`` of boolean categories was not being correctly coerced from string values to booleans (:issue:`20498`)
17761776
- Bug in :func:`read_csv` in which unicode column names were not being properly recognized with Python 2.x (:issue:`13253`)
@@ -1827,7 +1827,7 @@ Plotting
18271827
- Bug in :func:`DataFrame.plot.bar` caused bars to use multiple colors instead of a single one (:issue:`20585`)
18281828
- Bug in validating color parameter caused extra color to be appended to the given color array. This happened to multiple plotting functions using matplotlib. (:issue:`20726`)
18291829

1830-
Groupby/resample/rolling
1830+
GroupBy/resample/rolling
18311831
^^^^^^^^^^^^^^^^^^^^^^^^
18321832

18331833
- Bug in :func:`pandas.core.window.Rolling.min` and :func:`pandas.core.window.Rolling.max` with ``closed='left'``, a datetime-like index and only one entry in the series leading to segfault (:issue:`24718`)

doc/source/whatsnew/v0.24.2.rst

-1
@@ -51,7 +51,6 @@ Bug fixes
5151

5252
- Bug where calling :meth:`Series.replace` on categorical data could return a ``Series`` with incorrect dimensions (:issue:`24971`)
5353
-
54-
-
5554

5655
**Reshaping**
5756

doc/source/whatsnew/v1.1.1.rst

+14-2
@@ -15,9 +15,11 @@ including other versions of pandas.
1515
Fixed regressions
1616
~~~~~~~~~~~~~~~~~
1717

18+
- Fixed regression where :meth:`DataFrame.to_numpy` would raise a ``RuntimeError`` for mixed dtypes when converting to ``str`` (:issue:`35455`)
1819
- Fixed regression where :func:`read_csv` would raise a ``ValueError`` when ``pandas.options.mode.use_inf_as_na`` was set to ``True`` (:issue:`35493`).
19-
-
20-
-
20+
- Fixed regression in :class:`pandas.core.groupby.RollingGroupby` where column selection was ignored (:issue:`35486`)
21+
- Fixed regression in :meth:`DataFrame.shift` with ``axis=1`` and heterogeneous dtypes (:issue:`35488`)
22+
- Fixed regression in ``.groupby(..).rolling(..)`` where a segfault would occur with ``center=True`` and an odd number of values (:issue:`35552`)
2123

2224
.. ---------------------------------------------------------------------------
2325
@@ -26,6 +28,7 @@ Fixed regressions
2628
Bug fixes
2729
~~~~~~~~~
2830

31+
- Bug in ``Styler`` whereby `cell_ids` argument had no effect due to other recent changes (:issue:`35588`).
2932

3033
Categorical
3134
^^^^^^^^^^^
@@ -38,6 +41,11 @@ Categorical
3841
-
3942
-
4043

44+
**Timedelta**
45+
46+
- Bug in :meth:`to_timedelta` fails when arg is a :class:`Series` with `Int64` dtype containing null values (:issue:`35574`)
47+
48+
4149
**Numeric**
4250

4351
-
@@ -49,6 +57,10 @@ Categorical
4957

5058
**Indexing**
5159

60+
- Bug in :meth:`Series.truncate` when trying to truncate a single-element series (:issue:`35544`)
61+
62+
**DataFrame**
63+
- Bug in :class:`DataFrame` constructor failing to raise ``ValueError`` in some cases when data and index have mismatched lengths (:issue:`33437`)
5264
-
5365

5466
.. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.2.0.rst

+32-8
@@ -13,12 +13,32 @@ including other versions of pandas.
1313
Enhancements
1414
~~~~~~~~~~~~
1515

16+
.. _whatsnew_120.binary_handle_to_csv:
17+
18+
Support for binary file handles in ``to_csv``
19+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20+
21+
:meth:`to_csv` supports file handles in binary mode (:issue:`19827` and :issue:`35058`)
22+
with ``encoding`` (:issue:`13068` and :issue:`23854`) and ``compression`` (:issue:`22555`).
23+
``mode`` has to contain a ``b`` for binary handles to be supported.
24+
25+
For example:
26+
27+
.. ipython:: python
28+
29+
import io
30+
31+
data = pd.DataFrame([0, 1, 2])
32+
buffer = io.BytesIO()
33+
data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")
34+
1635
.. _whatsnew_120.enhancements.other:
1736

1837
Other enhancements
1938
^^^^^^^^^^^^^^^^^^
2039

2140
- :meth:`Styler.set_table_styles` now allows the direct styling of rows and columns and can be chained (:issue:`35607`)
41+
- :class:`Index` with object dtype supports division and multiplication (:issue:`34160`)
2242
-
2343
-
2444

@@ -60,12 +80,13 @@ Categorical
6080

6181
Datetimelike
6282
^^^^^^^^^^^^
63-
-
83+
- Bug in :attr:`DatetimeArray.date` where a ``ValueError`` would be raised with a read-only backing array (:issue:`33530`)
84+
- Bug in ``NaT`` comparisons failing to raise ``TypeError`` on invalid inequality comparisons (:issue:`35046`)
6485
-
6586

6687
Timedelta
6788
^^^^^^^^^
68-
89+
- Bug in :class:`TimedeltaIndex`, :class:`Series`, and :class:`DataFrame` floor-division with ``timedelta64`` dtypes and ``NaT`` in the denominator (:issue:`35529`)
6990
-
7091
-
7192

@@ -109,19 +130,19 @@ Indexing
109130
Missing
110131
^^^^^^^
111132

112-
-
133+
- Bug in :meth:`SeriesGroupBy.transform` now correctly handles missing values for `dropna=False` (:issue:`35014`)
113134
-
114135

115136
MultiIndex
116137
^^^^^^^^^^
117138

118-
-
139+
- Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises ``TypeError`` with message `Expected label or tuple of labels` (:issue:`35301`)
119140
-
120141

121142
I/O
122143
^^^
123144

124-
-
145+
- Bug in :meth:`to_csv` caused a ``ValueError`` when it was called with a filename in combination with ``mode`` containing a ``b`` (:issue:`35058`)
125146
-
126147

127148
Plotting
@@ -133,14 +154,17 @@ Plotting
133154
Groupby/resample/rolling
134155
^^^^^^^^^^^^^^^^^^^^^^^^
135156

157+
- Bug in :meth:`DataFrameGroupBy.count` and :meth:`SeriesGroupBy.sum` returning ``NaN`` for missing categories when grouped on multiple ``Categoricals``. Now returning ``0`` (:issue:`35028`)
158+
- Bug in :meth:`DataFrameGroupBy.apply` that would sometimes throw an erroneous ``ValueError`` if the grouping axis had duplicate entries (:issue:`16646`)
136159
-
137160
-
138-
161+
- Bug in :meth:`DataFrameGroupBy.apply` where a non-nuisance grouping column would be dropped from the output columns if another groupby method was called before ``.apply()`` (:issue:`34656`)
139162

140163
Reshaping
141164
^^^^^^^^^
142165

143-
-
166+
- Bug in :meth:`DataFrame.pivot_table` with ``aggfunc='count'`` or ``aggfunc='sum'`` returning ``NaN`` for missing categories when pivoted on a ``Categorical``. Now returning ``0`` (:issue:`31422`)
167+
- Bug in :func:`union_indexes` where input index names are not preserved in some cases. Affects :func:`concat` and :class:`DataFrame` constructor (:issue:`13475`)
144168
-
145169

146170
Sparse
@@ -166,4 +190,4 @@ Other
166190
.. _whatsnew_120.contributors:
167191

168192
Contributors
169-
~~~~~~~~~~~~
193+
~~~~~~~~~~~~

environment.yml

+1
@@ -109,3 +109,4 @@ dependencies:
109109
- pip:
110110
- git+https://github.com/pandas-dev/pydata-sphinx-theme.git@master
111111
- git+https://github.com/numpy/numpydoc
112+
- pyflakes>=2.2.0

pandas/_config/config.py

+1-1
@@ -462,7 +462,7 @@ def register_option(
462462
for k in path:
463463
# NOTE: tokenize.Name is not a public constant
464464
# error: Module has no attribute "Name" [attr-defined]
465-
if not re.match("^" + tokenize.Name + "$", k): # type: ignore
465+
if not re.match("^" + tokenize.Name + "$", k): # type: ignore[attr-defined]
466466
raise ValueError(f"{k} is not a valid identifier")
467467
if keyword.iskeyword(k):
468468
raise ValueError(f"{k} is a python keyword")
