doc/source/whatsnew/v0.17.0.txt
+135-101
@@ -38,10 +38,12 @@ Highlights include:
38
38
- The sorting API has been revamped to remove some long-time inconsistencies, see :ref:`here <whatsnew_0170.api_breaking.sorting>`
39
39
- Support for a ``datetime64[ns]`` with timezones as a first-class dtype, see :ref:`here <whatsnew_0170.tz>`
40
40
- The default for ``to_datetime`` will now be to ``raise`` when presented with unparseable formats,
41
-
previously this would return the original input, see :ref:`here <whatsnew_0170.api_breaking.to_datetime>`
41
+
previously this would return the original input. Also, date parse
42
+
functions now return consistent results. See :ref:`here <whatsnew_0170.api_breaking.to_datetime>`
42
43
- The default for ``dropna`` in ``HDFStore`` has changed to ``False``, to store by default all rows even
43
44
if they are all ``NaN``, see :ref:`here <whatsnew_0170.api_breaking.hdf_dropna>`
44
-
- Support for ``Series.dt.strftime`` to generate formatted strings for datetime-likes, see :ref:`here <whatsnew_0170.strftime>`
45
+
- Datetime accessor (``dt``) now supports ``Series.dt.strftime`` to generate formatted strings for datetime-likes, and ``Series.dt.total_seconds`` to generate each duration of the timedelta in seconds. See :ref:`here <whatsnew_0170.strftime>`
46
+
- ``Period`` and ``PeriodIndex`` can handle multiplied freq like ``3D``, which corresponds to a 3-day span. See :ref:`here <whatsnew_0170.periodfreq>`
45
47
- Development installed versions of pandas will now have ``PEP440`` compliant version strings (:issue:`9518`)
46
48
- Development support for benchmarking with the `Air Speed Velocity library <https://github.com/spacetelescope/asv/>`_ (:issue:`8316`)
47
49
- Support for reading SAS xport files, see :ref:`here <whatsnew_0170.enhancements.sas_xport>`
@@ -169,8 +171,11 @@ Each method signature only includes relevant arguments. Currently, these are lim
169
171
170
172
.. _whatsnew_0170.strftime:
171
173
172
-
Support strftime for Datetimelikes
173
-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
174
+
Additional methods for ``dt`` accessor
175
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
176
+
177
+
strftime
178
+
""""""""
174
179
175
180
We are now supporting a ``Series.dt.strftime`` method for datetime-likes to generate a formatted string (:issue:`10110`). Examples:
176
181
@@ -190,6 +195,18 @@ We are now supporting a ``Series.dt.strftime`` method for datetime-likes to gene
190
195
191
196
The string format is as the python standard library and details can be found `here <https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior>`_
192
197
198
+
total_seconds
199
+
"""""""""""""
200
+
201
+
``pd.Series`` of type ``timedelta64`` has a new method ``.dt.total_seconds()`` returning the duration of the timedelta in seconds (:issue:`10817`)
202
+
203
+
.. ipython:: python
204
+
205
+
# TimedeltaIndex
206
+
s = pd.Series(pd.timedelta_range('1 minutes', periods=4))
207
+
s
208
+
s.dt.total_seconds()
209
+
193
210
.. _whatsnew_0170.periodfreq:
194
211
195
212
Period Frequency Enhancement
@@ -240,7 +257,7 @@ See the :ref:`docs <io.sas>` for more details.
240
257
.. _whatsnew_0170.matheval:
241
258
242
259
Support for Math Functions in .eval()
243
-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
260
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
244
261
245
262
:meth:`~pandas.eval` now supports calling math functions (:issue:`4893`)
246
263
@@ -307,7 +324,6 @@ has been changed to make this keyword unnecessary - the change is shown below.
307
324
Other enhancements
308
325
^^^^^^^^^^^^^^^^^^
309
326
310
-
311
327
- ``merge`` now accepts the argument ``indicator`` which adds a Categorical-type column (by default called ``_merge``) to the output object that takes on the values (:issue:`8790`)
For more, see the :ref:`updated docs <merging.indicator>`
328
344
329
-
- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
330
-
- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
331
-
- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
332
-
- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
333
-
- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
334
-
335
-
.. ipython:: python
336
-
337
-
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
338
-
ser.interpolate(limit=1, limit_direction='both')
339
-
340
-
- Round DataFrame to variable number of decimal places (:issue:`10568`).
345
+
- ``pd.merge`` will now allow duplicate column names if they are not merged upon (:issue:`10639`).
341
346
342
-
.. ipython :: python
347
+
- ``pd.pivot`` will now allow passing index as ``None`` (:issue:`3962`).
- ``concat`` will now use existing Series names if provided (:issue:`10698`).
349
350
350
-
- ``pd.read_sql`` and ``to_sql`` can accept database URI as ``con`` parameter (:issue:`10214`)
351
-
- Enable ``pd.read_hdf`` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`)
352
-
- Enable writing Excel files in :ref:`memory <_io.excel_writing_buffer>` using StringIO/BytesIO (:issue:`7074`)
353
-
- Enable serialization of lists and dicts to strings in ``ExcelWriter`` (:issue:`8188`)
354
-
- Added functionality to use the ``base`` argument when resampling a ``TimeDeltaIndex`` (:issue:`10530`)
355
-
- ``DatetimeIndex`` can be instantiated using strings containing ``NaT`` (:issue:`7599`)
356
-
- The string parsing of ``to_datetime``, ``Timestamp`` and ``DatetimeIndex`` has been made consistent. (:issue:`7599`)
351
+
.. ipython:: python
357
352
358
-
Prior to v0.17.0, ``Timestamp`` and ``to_datetime`` could incorrectly parse a year-only datetime string using today's date, whereas ``DatetimeIndex``
359
-
uses the beginning of the year. ``Timestamp`` and ``to_datetime`` could also raise ``ValueError`` for some datetime strings that ``DatetimeIndex``
360
-
can parse, such as a quarterly string.
353
+
foo = pd.Series([1,2], name='foo')
354
+
bar = pd.Series([1,2])
355
+
baz = pd.Series([4,5])
361
356
362
-
Previous Behavior
357
+
Previous Behavior:
363
358
364
359
.. code-block:: python
365
360
366
-
In [1]: Timestamp('2012Q2')
367
-
Traceback
368
-
...
369
-
ValueError: Unable to parse 2012Q2
370
-
371
-
# Results in today's date.
372
-
In [2]: Timestamp('2014')
373
-
Out [2]: 2014-08-12 00:00:00
374
-
375
-
v0.17.0 can parse them as below. It works on ``DatetimeIndex`` also.
361
+
In [1]: pd.concat([foo, bar, baz], 1)
362
+
Out[1]:
363
+
0 1 2
364
+
0 1 1 4
365
+
1 2 2 5
376
366
377
-
New Behaviour
367
+
New Behavior:
378
368
379
369
.. ipython:: python
380
370
381
-
Timestamp('2012Q2')
382
-
Timestamp('2014')
383
-
DatetimeIndex(['2012Q2', '2014'])
384
-
385
-
.. note::
386
-
387
-
If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
388
-
389
-
.. ipython:: python
390
-
391
-
import pandas.tseries.offsets as offsets
392
-
Timestamp.now()
393
-
Timestamp.now() + offsets.DateOffset(years=1)
394
-
395
-
- ``to_datetime`` can now accept ``yearfirst`` keyword (:issue:`7599`)
396
-
397
-
- ``pandas.tseries.offsets`` larger than the ``Day`` offset can now be used with ``Series`` for addition/subtraction (:issue:`10699`). See the :ref:`Documentation <timeseries.offsetseries>` for more details.
398
-
399
-
- ``pd.Series`` of type ``timedelta64`` has new method ``.dt.total_seconds()`` returning the duration of the timedelta in seconds (:issue:`10817`)
371
+
pd.concat([foo, bar, baz], 1)
400
372
401
-
- ``pd.Timedelta.total_seconds()`` now returns Timedelta duration to ns precision (previously microsecond precision) (:issue:`10939`)
373
+
- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
402
374
403
-
- ``.as_blocks`` will now take a ``copy`` optional argument to return a copy of the data, default is to copy (no change in behavior from prior versions), (:issue:`9607`)
404
-
- ``regex`` argument to ``DataFrame.filter`` now handles numeric column names instead of raising ``ValueError`` (:issue:`10384`).
405
-
- ``pd.read_stata`` will now read Stata 118 type files. (:issue:`9882`)
375
+
- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
406
376
407
-
- ``pd.merge`` will now allow duplicate column names if they are not merged upon (:issue:`10639`).
377
+
.. ipython:: python
408
378
409
-
- ``pd.pivot`` will now allow passing index as ``None`` (:issue:`3962`).
379
+
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
380
+
ser.interpolate(limit=1, limit_direction='both')
410
381
411
-
- ``read_sql_table`` will now allow reading from views (:issue:`10750`).
382
+
- Round DataFrame to variable number of decimal places (:issue:`10568`).
412
383
413
-
- ``msgpack`` submodule has been updated to 0.4.6 with backward compatibility (:issue:`10581`)
384
+
.. ipython :: python
414
385
415
-
- ``DataFrame.to_dict`` now accepts the *index* option in ``orient`` keyword argument (:issue:`10844`).
- ``drop_duplicates`` and ``duplicated`` now accept ``keep`` keyword to target first, last, and all duplicates. ``take_last`` keyword is deprecated, see :ref:`deprecations <whatsnew_0170.deprecations>` (:issue:`6511`, :issue:`8505`)
418
393
@@ -444,37 +419,50 @@ Other enhancements
444
419
445
420
``tolerance`` is also exposed by the lower level ``Index.get_indexer`` and ``Index.get_loc`` methods.
446
421
447
-
- Support pickling of ``Period`` objects (:issue:`10439`)
422
+
- Added functionality to use the ``base`` argument when resampling a ``TimeDeltaIndex`` (:issue:`10530`)
448
423
449
-
- ``DataFrame.apply`` will return a Series of dicts if the passed function returns a dict and ``reduce=True`` (:issue:`8735`).
424
+
- ``DatetimeIndex`` can be instantiated using strings containing ``NaT`` (:issue:`7599`)
425
+
426
+
- ``to_datetime`` can now accept ``yearfirst`` keyword (:issue:`7599`)
427
+
428
+
- ``pandas.tseries.offsets`` larger than the ``Day`` offset can now be used with ``Series`` for addition/subtraction (:issue:`10699`). See the :ref:`Documentation <timeseries.offsetseries>` for more details.
429
+
430
+
- ``pd.Timedelta.total_seconds()`` now returns Timedelta duration to ns precision (previously microsecond precision) (:issue:`10939`)
450
431
451
432
- ``PeriodIndex`` now supports arithmetic with ``np.ndarray`` (:issue:`10638`)
452
433
453
-
- ``concat`` will now use existing Series names if provided (:issue:`10698`).
434
+
- Support pickling of ``Period`` objects (:issue:`10439`)
454
435
455
-
.. ipython:: python
436
+
- ``.as_blocks`` will now take an optional ``copy`` argument to return a copy of the data; the default is to copy (no change in behavior from prior versions) (:issue:`9607`)
456
437
457
-
foo = pd.Series([1,2], name='foo')
458
-
bar = pd.Series([1,2])
459
-
baz = pd.Series([4,5])
438
+
- ``regex`` argument to ``DataFrame.filter`` now handles numeric column names instead of raising ``ValueError`` (:issue:`10384`).
460
439
461
-
Previous Behavior:
440
+
- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
462
441
463
-
.. code-block:: python
442
+
- Enable writing Excel files in :ref:`memory <_io.excel_writing_buffer>` using StringIO/BytesIO (:issue:`7074`)
464
443
465
-
In [1]: pd.concat([foo, bar, baz], 1)
466
-
Out[1]:
467
-
0 1 2
468
-
0 1 1 4
469
-
1 2 2 5
444
+
- Enable serialization of lists and dicts to strings in ``ExcelWriter`` (:issue:`8188`)
470
445
471
-
New Behavior:
446
+
- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
472
447
473
-
.. ipython:: python
448
+
- ``pd.read_sql`` and ``to_sql`` can accept database URI as ``con`` parameter (:issue:`10214`)
474
449
475
-
pd.concat([foo, bar, baz], 1)
450
+
- ``read_sql_table`` will now allow reading from views (:issue:`10750`).
451
+
452
+
- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
453
+
454
+
- Enable ``pd.read_hdf`` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`)
455
+
456
+
- ``pd.read_stata`` will now read Stata 118 type files. (:issue:`9882`)
457
+
458
+
- ``msgpack`` submodule has been updated to 0.4.6 with backward compatibility (:issue:`10581`)
459
+
460
+
- ``DataFrame.to_dict`` now accepts the *index* option in ``orient`` keyword argument (:issue:`10844`).
461
+
462
+
- ``DataFrame.apply`` will return a Series of dicts if the passed function returns a dict and ``reduce=True`` (:issue:`8735`).
476
463
477
464
- Allow passing `kwargs` to the interpolation methods (:issue:`10378`).
465
+
478
466
- Improved error message when concatenating an empty iterable of dataframes (:issue:`9157`)
479
467
480
468
@@ -547,9 +535,13 @@ Previous Replacement
547
535
Changes to to_datetime and to_timedelta
548
536
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
549
537
550
-
The default for ``pd.to_datetime`` error handling has changed to ``errors='raise'``. In prior versions it was ``errors='ignore'``.
551
-
Furthermore, the ``coerce`` argument has been deprecated in favor of ``errors='coerce'``. This means that invalid parsing will raise rather than return the original
552
-
input as in previous versions. (:issue:`10636`)
538
+
Error handling
539
+
""""""""""""""
540
+
541
+
The default for ``pd.to_datetime`` error handling has changed to ``errors='raise'``.
542
+
In prior versions it was ``errors='ignore'``. Furthermore, the ``coerce`` argument
543
+
has been deprecated in favor of ``errors='coerce'``. This means that invalid parsing
544
+
will raise rather than return the original input as in previous versions. (:issue:`10636`)
553
545
554
546
Previous Behavior:
555
547
@@ -573,7 +565,7 @@ Of course you can coerce this as well.
To keep the previous behaviour, you can use ``errors='ignore'``:
568
+
To keep the previous behavior, you can use ``errors='ignore'``:
577
569
578
570
.. ipython:: python
579
571
@@ -582,6 +574,48 @@ To keep the previous behaviour, you can use ``errors='ignore'``:
582
574
Furthermore, ``pd.to_timedelta`` has gained a similar API, of ``errors='raise'|'ignore'|'coerce'``, and the ``coerce`` keyword
583
575
has been deprecated in favor of ``errors='coerce'``.
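
The ``errors`` keyword described above can be sketched for ``pd.to_timedelta`` as follows. This is a minimal illustration, not part of the original change; the sample strings and variable names are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical input: one valid duration and one unparseable string.
raw = ["1 days", "foo"]

# errors='coerce' converts unparseable entries to NaT instead of raising.
coerced = pd.to_timedelta(raw, errors="coerce")

# errors='raise' (the default) rejects input that cannot be parsed.
try:
    pd.to_timedelta(raw, errors="raise")
    raised = False
except ValueError:
    raised = True
```

Here ``coerced`` keeps the valid entry and holds ``NaT`` in place of the invalid one, while the ``errors='raise'`` call fails with a ``ValueError``.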
584
576
577
+
Consistent Parsing
578
+
""""""""""""""""""
579
+
580
+
The string parsing of ``to_datetime``, ``Timestamp`` and ``DatetimeIndex`` has
581
+
been made consistent. (:issue:`7599`)
582
+
583
+
Prior to v0.17.0, ``Timestamp`` and ``to_datetime`` could incorrectly parse a year-only datetime string using today's date, whereas ``DatetimeIndex``
584
+
uses the beginning of the year. ``Timestamp`` and ``to_datetime`` could also raise ``ValueError`` for some datetime strings that ``DatetimeIndex``
585
+
can parse, such as a quarterly string.
586
+
587
+
Previous Behavior:
588
+
589
+
.. code-block:: python
590
+
591
+
In [1]: Timestamp('2012Q2')
592
+
Traceback
593
+
...
594
+
ValueError: Unable to parse 2012Q2
595
+
596
+
# Results in today's date.
597
+
In [2]: Timestamp('2014')
598
+
Out [2]: 2014-08-12 00:00:00
599
+
600
+
v0.17.0 can parse them as below. It works on ``DatetimeIndex`` also.
601
+
602
+
New Behavior:
603
+
604
+
.. ipython:: python
605
+
606
+
Timestamp('2012Q2')
607
+
Timestamp('2014')
608
+
DatetimeIndex(['2012Q2', '2014'])
609
+
610
+
.. note::
611
+
612
+
If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
613
+
614
+
.. ipython:: python
615
+
616
+
import pandas.tseries.offsets as offsets
617
+
Timestamp.now()
618
+
Timestamp.now() + offsets.DateOffset(years=1)
585
619
586
620
.. _whatsnew_0170.api_breaking.convert_objects:
587
621
@@ -656,7 +690,7 @@ Operator equal on ``Index`` should behavior similarly to ``Series`` (:issue:`994
656
690
Starting in v0.17.0, comparing ``Index`` objects of different lengths will raise
657
691
a ``ValueError``. This is to be consistent with the behavior of ``Series``.
658
692
659
-
Previous behavior:
693
+
Previous Behavior:
660
694
661
695
.. code-block:: python
662
696
@@ -669,7 +703,7 @@ Previous behavior:
669
703
In [4]: pd.Index([1, 2, 3]) == pd.Index([1, 2])
670
704
Out[4]: False
671
705
672
-
New behavior:
706
+
New Behavior:
673
707
674
708
.. code-block:: python
675
709
@@ -706,14 +740,14 @@ Boolean comparisons of a ``Series`` vs ``None`` will now be equivalent to compar
706
740
s.iloc[1] = None
707
741
s
708
742
709
-
Previous behavior:
743
+
Previous Behavior:
710
744
711
745
.. code-block:: python
712
746
713
747
In [5]: s==None
714
748
TypeError: Could not compare <type 'NoneType'> type with Series
715
749
716
-
New behavior:
750
+
New Behavior:
717
751
718
752
.. ipython:: python
719
753
@@ -742,7 +776,7 @@ HDFStore dropna behavior
742
776
743
777
The default behavior for HDFStore write functions with ``format='table'`` is now to keep rows that are all missing. Previously, the behavior was to drop rows that were all missing save the index. The previous behavior can be replicated using the ``dropna=True`` option. (:issue:`9382`)