doc/source/whatsnew/v0.19.0.txt: +37 -50
@@ -847,7 +847,7 @@ Furthermore:
 """"""""""""""""""""""""""""""""""""""""
 
 ``PeriodIndex`` now has its own ``period`` dtype. The ``period`` dtype is a
-pandas extension dtype like ``category`` or :ref:`timezone aware dtype <timeseries.timezone_series>` (``datetime64[ns, tz]``). (:issue:`13941`).
+pandas extension dtype like ``category`` or the :ref:`timezone aware dtype <timeseries.timezone_series>` (``datetime64[ns, tz]``). (:issue:`13941`).
 As a consequence of this change, ``PeriodIndex`` no longer has an integer dtype:
 
 Previous Behavior:
@@ -900,7 +900,7 @@ These result in ``pd.NaT`` without providing ``freq`` option.
 pd.Period(None)
 
 
-To be compat with ``Period`` addition and subtraction, ``pd.NaT`` now supports addition and subtraction with ``int``. Previously it raises ``ValueError``.
+To be compatible with ``Period`` addition and subtraction, ``pd.NaT`` now supports addition and subtraction with ``int``. Previously it raised ``ValueError``.
 
 Previous Behavior:
 
@@ -920,8 +920,8 @@ New Behavior:
 ``PeriodIndex.values`` now returns array of ``Period`` object
 ``MultiIndex.from_arrays`` and ``MultiIndex.from_product`` will now preserve categorical dtype
 in ``MultiIndex`` levels. (:issue:`13743`, :issue:`13854`)
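As a rough illustration of the categorical-preservation note above, the sketch below builds a ``MultiIndex`` from a categorical array and inspects its first level. The variable names and sample values are illustrative and not taken from the diff, and the behavior shown assumes pandas 0.19.0 or later.

```python
import pandas as pd

# Illustrative data (not from the diff): a categorical with an explicit,
# non-alphabetical category order.
cat = pd.Categorical(["a", "b"], categories=list("bac"))
midx = pd.MultiIndex.from_arrays([cat, ["foo", "bar"]])

# The first level keeps the categorical dtype instead of falling back to object.
level0 = midx.levels[0]
values0 = midx.get_level_values(0)

print(type(level0).__name__)
print(list(values0))
```

The category order (``b``, ``a``, ``c``) is carried through to the level, which is the point of the change.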
@@ -1078,7 +1078,7 @@ Previous Behavior:
 In [5]: midx.get_level_values(0)
 Out[5]: Index(['a', 'b'], dtype='object')
 
-New Behavior:
+New Behavior: the single level is now a ``CategoricalIndex``:
 
 .. ipython:: python
 
@@ -1131,8 +1131,8 @@ New Behavior:
 ``read_csv`` will progressively enumerate chunks
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-When :func:`read_csv` is called with ``chunksize='n'`` and without specifying an index,
-each chunk used to have an independently generated index from `0`` to ``n-1``.
+When :func:`read_csv` is called with ``chunksize=n`` and without specifying an index,
+each chunk used to have an independently generated index from ``0`` to ``n-1``.
 They are now given instead a progressive index, starting from ``0`` for the first chunk,
 from ``n`` for the second, and so on, so that, when concatenated, they are identical to
 the result of calling :func:`read_csv` without the ``chunksize=`` argument.
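The progressive chunk index described above can be checked with a small sketch. The CSV text and variable names are made up for illustration, and the behavior assumes pandas 0.19.0 or later.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV with four data rows and no explicit index column.
csv_text = "A,B\n0,1\n2,3\n4,5\n6,7\n"

# Read the same data in chunks of two rows and in one go.
chunks = list(pd.read_csv(io.StringIO(csv_text), chunksize=2))
full = pd.read_csv(io.StringIO(csv_text))

# Each chunk continues the index where the previous one stopped,
# so concatenating them lines up with the single read.
combined = pd.concat(chunks)

print([list(chunk.index) for chunk in chunks])
print(list(combined.index) == list(full.index))
```

Before this change the second chunk would have restarted at ``0``, so the concatenated frame would have carried duplicate index labels.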
@@ -1167,13 +1167,12 @@ Sparse Changes
 
 These changes allow pandas to handle sparse data with more dtypes, and for work to make a smoother experience with data handling.
 
-
 ``int64`` and ``bool`` support enhancements
 """""""""""""""""""""""""""""""""""""""""""
 
-Sparse data structures now gained enhanced support of ``int64`` and ``bool`` ``dtype`` (:issue:`667`, :issue:`13849`)
+Sparse data structures have gained enhanced support for ``int64`` and ``bool`` ``dtype`` (:issue:`667`, :issue:`13849`).
 
-Previously, sparse data were ``float64`` dtype by default, even if all inputs were ``int`` or ``bool`` dtype. You had to specify ``dtype`` explicitly to create sparse data with ``int64`` dtype. Also, ``fill_value`` had to be specified explicitly becuase it's default was ``np.nan`` which doesn't appear in ``int64`` or ``bool`` data.
+Previously, sparse data were ``float64`` dtype by default, even if all inputs were of ``int`` or ``bool`` dtype. You had to specify ``dtype`` explicitly to create sparse data with ``int64`` dtype. Also, ``fill_value`` had to be specified explicitly because the default was ``np.nan``, which doesn't appear in ``int64`` or ``bool`` data.
 
 .. code-block:: ipython
 
@@ -1200,9 +1199,9 @@ Previously, sparse data were ``float64`` dtype by default, even if all inputs we
 IntIndex
 Indices: array([0, 1], dtype=int32)
 
-As of v0.19.0, sparse data keeps the input dtype, and assign more appropriate ``fill_value`` default (``0`` for ``int64`` dtype, ``False`` for ``bool`` dtype).
+As of v0.19.0, sparse data keeps the input dtype, and uses more appropriate ``fill_value`` defaults (``0`` for ``int64`` dtype, ``False`` for ``bool`` dtype).
 
-.. ipython:: python
+.. ipython:: python
 
 pd.SparseArray([1, 2, 0, 0], dtype=np.int64)
 pd.SparseArray([True, False, False, False])
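The dtype and ``fill_value`` defaults described above can be inspected directly. This is a minimal sketch that assumes a recent pandas, where the class is exposed as ``pd.arrays.SparseArray``; the diff itself uses the 0.19-era spelling ``pd.SparseArray``, which later releases no longer provide at the top level.

```python
import numpy as np
import pandas as pd

# Assumption: a recent pandas that exposes SparseArray under pd.arrays
# (the diff uses the older top-level name pd.SparseArray).
int_arr = pd.arrays.SparseArray([1, 2, 0, 0], dtype=np.int64)
bool_arr = pd.arrays.SparseArray([True, False, False, False])

# The input dtype is kept, and fill_value defaults to 0 / False
# rather than np.nan.
print(int_arr.dtype, int_arr.fill_value)
print(bool_arr.dtype, bool_arr.fill_value)
```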
@@ -1214,29 +1213,29 @@ Operators now preserve dtypes
 
 - Sparse data structure now can preserve ``dtype`` after arithmetic ops (:issue:`13848`)
 
-.. ipython:: python
+.. ipython:: python
 
-s = pd.SparseSeries([0, 2, 0, 1], fill_value=0, dtype=np.int64)
-s.dtype
+s = pd.SparseSeries([0, 2, 0, 1], fill_value=0, dtype=np.int64)
+s.dtype
 
-s + 1
+s + 1
 
 - Sparse data structure now support ``astype`` to convert internal ``dtype`` (:issue:`13900`)
 
-.. ipython:: python
+.. ipython:: python
 
-s = pd.SparseSeries([1., 0., 2., 0.], fill_value=0)
-s
-s.astype(np.int64)
+s = pd.SparseSeries([1., 0., 2., 0.], fill_value=0)
+s
+s.astype(np.int64)
 
-``astype`` fails if data contains values which cannot be converted to specified ``dtype``.
-Note that the limitation is applied to ``fill_value`` which default is ``np.nan``.
+``astype`` fails if data contains values which cannot be converted to specified ``dtype``.
+Note that the limitation is applied to ``fill_value`` which default is ``np.nan``.
 
-.. code-block:: ipython
+.. code-block:: ipython
 
-In [7]: pd.SparseSeries([1., np.nan, 2., np.nan], fill_value=np.nan).astype(np.int64)
-Out[7]:
-ValueError: unable to coerce current fill_value nan to int64 dtype
+In [7]: pd.SparseSeries([1., np.nan, 2., np.nan], fill_value=np.nan).astype(np.int64)
+Out[7]:
+ValueError: unable to coerce current fill_value nan to int64 dtype
 
 Other sparse fixes
 """"""""""""""""""
@@ -1358,6 +1357,7 @@ Deprecations
 
 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 - The ``SparsePanel`` class has been removed (:issue:`13778`)
 - The ``pd.sandbox`` module has been removed in favor of the external library ``pandas-qt`` (:issue:`13670`)
 - The ``pandas.io.data`` and ``pandas.io.wb`` modules are removed in favor of
@@ -1371,35 +1371,23 @@ Removal of prior version deprecations/changes
 - ``DataFrame.to_sql()`` has dropped the ``mysql`` option for the ``flavor`` parameter (:issue:`13611`)
 - ``Panel.shift()`` has dropped the ``lags`` parameter in favour of ``periods`` (:issue:`14041`)
 - ``pd.Index`` has dropped the ``diff`` method in favour of ``difference`` (:issue:`13669`)
-
 - ``pd.DataFrame`` has dropped the ``to_wide`` method in favour of ``to_panel`` (:issue:`14039`)
 - ``Series.to_csv`` has dropped the ``nanRep`` parameter in favor of ``na_rep`` (:issue:`13804`)
 - ``Series.xs``, ``DataFrame.xs``, ``Panel.xs``, ``Panel.major_xs``, and ``Panel.minor_xs`` have dropped the ``copy`` parameter (:issue:`13781`)
 - ``str.split`` has dropped the ``return_type`` parameter in favor of ``expand`` (:issue:`13701`)
-- Removal of the legacy time rules (offset aliases), deprecated since 0.17.0 (this has been alias since 0.8.0) (:issue:`13590`, :issue:`13868`)
-
-Previous Behavior:
-
-.. code-block:: ipython
-
-In [2]: pd.date_range('2016-07-01', freq='W@MON', periods=3)
-pandas/tseries/frequencies.py:465: FutureWarning: Freq "W@MON" is deprecated, use "W-MON" as alternative.
-Now legacy time rules raises ``ValueError``. For the list of currently supported offsets, see :ref:`here <timeseries.offset_aliases>`
-
+- Removal of the legacy time rules (offset aliases), deprecated since 0.17.0 (these have been aliases since 0.8.0) (:issue:`13590`, :issue:`13868`). Legacy time rules now raise ``ValueError``. For the list of currently supported offsets, see :ref:`here <timeseries.offset_aliases>`.
 - The default value for the ``return_type`` parameter for ``DataFrame.plot.box`` and ``DataFrame.boxplot`` changed from ``None`` to ``"axes"``. These methods will now return a matplotlib axes by default instead of a dictionary of artists. See :ref:`here <visualization.box.return>` (:issue:`6581`).
 - The ``tquery`` and ``uquery`` functions in the ``pandas.io.sql`` module are removed (:issue:`5950`).
 
+
 .. _whatsnew_0190.performance:
 
 Performance Improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Improved performance of sparse ``IntIndex.intersect`` (:issue:`13082`)
 - Improved performance of sparse arithmetic with ``BlockIndex`` when the number of blocks are large, though recommended to use ``IntIndex`` in such cases (:issue:`13082`)
-- increased performance of ``DataFrame.quantile()`` as it now operates per-block (:issue:`11623`)
-
+- Improved performance of ``DataFrame.quantile()`` as it now operates per-block (:issue:`11623`)
 - Improved performance of float64 hash table operations, fixing some very slow indexing and groupby operations in python 3 (:issue:`13166`, :issue:`13334`)
 - Improved performance of ``DataFrameGroupBy.transform`` (:issue:`12737`)
 - Improved performance of ``Index`` and ``Series`` ``.duplicated`` (:issue:`10235`)
@@ -1410,7 +1398,6 @@ Performance Improvements
 - Improved performance of ``factorize`` of datetime with timezone (:issue:`13750`)