API: deprecate setting of .ordered directly (GH9347, GH9190)
add set_ordered method for setting ordered
default for Categorical is now to NOT order unless explicitly specified
whatsnew doc updates for categorical api changes
add ability to specify keywords to astype for creation defaults
fix issue with grouping with sort=True on an unordered Categorical
update categorical.rst docs
test unsortable when ordered=True
v0.16.0.txt / release notes updates
clean up check for ordering
allow groupby to work on an unordered categorical
doc/source/release.rst (+1)
@@ -59,6 +59,7 @@ Highlights include:
- ``Series.to_coo/from_coo`` methods to interact with ``scipy.sparse``, see :ref:`here <whatsnew_0160.enhancements.sparse>`
- Backwards incompatible change to ``Timedelta`` to conform the ``.seconds`` attribute with ``datetime.timedelta``, see :ref:`here <whatsnew_0160.api_breaking.timedelta>`
- Changes to the ``.loc`` slicing API to conform with the behavior of ``.ix`` see :ref:`here <whatsnew_0160.api_breaking.indexing>`
- Changes to the default for ordering in the ``Categorical`` constructor, see :ref:`here <whatsnew_0160.api_breaking.categorical>`

See the :ref:`v0.16.0 Whatsnew <whatsnew_0160>` overview or the issue tracker on GitHub for an extensive list
of all API changes, enhancements and bugs that have been fixed in 0.16.0.
doc/source/whatsnew/v0.16.0.txt (+145)
@@ -13,6 +13,7 @@ users upgrade to this version.
* ``Series.to_coo/from_coo`` methods to interact with ``scipy.sparse``, see :ref:`here <whatsnew_0160.enhancements.sparse>`
* Backwards incompatible change to ``Timedelta`` to conform the ``.seconds`` attribute with ``datetime.timedelta``, see :ref:`here <whatsnew_0160.api_breaking.timedelta>`
* Changes to the ``.loc`` slicing API to conform with the behavior of ``.ix`` see :ref:`here <whatsnew_0160.api_breaking.indexing>`
* Changes to the default for ordering in the ``Categorical`` constructor, see :ref:`here <whatsnew_0160.api_breaking.categorical>`

- Check the :ref:`API Changes <whatsnew_0160.api>` and :ref:`deprecations <whatsnew_0160.deprecations>` before updating

@@ -366,6 +367,150 @@ API Changes
- ``Series.describe`` for categorical data will now give counts and frequencies of 0, not ``NaN``, for unused categories (:issue:`9443`)


Categorical Changes
~~~~~~~~~~~~~~~~~~~

.. _whatsnew_0160.api_breaking.categorical:

In prior versions, ``Categoricals`` that had an unspecified ordering (meaning no ``ordered`` keyword was passed) defaulted to ``ordered`` Categoricals. Going forward, the ``ordered`` keyword in the ``Categorical`` constructor will default to ``False``. Ordering must now be explicit.

Furthermore, previously you *could* change the ``ordered`` attribute of a Categorical by just setting the attribute, e.g. ``cat.ordered=True``. This is now deprecated and you should use ``cat.as_ordered()`` or ``cat.as_unordered()``. These will by default return a **new** object and not modify the existing object. (:issue:`9347`, :issue:`9190`)
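
The non-mutating behavior described above can be sketched as follows. This is a minimal illustration rather than part of the release notes: it assumes a recent pandas imported under the conventional ``pd`` alias, and the variable names are made up for the example.

```python
import pandas as pd

# No ordering is specified, so the categorical is unordered by default.
s = pd.Series([0, 1, 2], dtype="category")

# as_ordered() returns a new Series; s itself is left untouched.
s_ordered = s.cat.as_ordered()

# as_unordered() is the inverse and also returns a new object.
s_unordered = s_ordered.cat.as_unordered()

print(s.cat.ordered)            # False: the original is unchanged
print(s_ordered.cat.ordered)    # True
print(s_unordered.cat.ordered)  # False
```

Because nothing is modified in place, the result has to be assigned back (``s = s.cat.as_ordered()``) for the change to stick.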

Previous Behavior

.. code-block:: python

   In [3]: s = Series([0,1,2], dtype='category')

   In [4]: s
   Out[4]:
   0    0
   1    1
   2    2
   dtype: category
   Categories (3, int64): [0 < 1 < 2]

   In [5]: s.cat.ordered
   Out[5]: True

   In [6]: s.cat.ordered = False

   In [7]: s
   Out[7]:
   0    0
   1    1
   2    2
   dtype: category
   Categories (3, int64): [0, 1, 2]

New Behavior

.. ipython:: python

   s = Series([0,1,2], dtype='category')
   s
   s.cat.ordered
   s = s.cat.as_ordered()
   s
   s.cat.ordered

   # you can set in the constructor of the Categorical
   s = Series(Categorical([0,1,2],ordered=True))
   s
   s.cat.ordered

For ease of creation of series of categorical data, we have added the ability to pass keywords when calling ``.astype()``. These are passed directly to the constructor.

.. ipython:: python

   s = Series(["a","b","c","a"]).astype('category',ordered=True)
   s
   s = Series(["a","b","c","a"]).astype('category',categories=list('abcdef'),ordered=False)
   s
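
The keyword form of ``.astype()`` shown above belongs to this release line. For readers on much later pandas versions, where (to our knowledge) those keywords are no longer accepted, a hedged sketch of the equivalent is to pass a ``CategoricalDtype`` carrying the same options. The ``pd`` alias and the variable names are assumptions made for illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

values = ["a", "b", "c", "a"]

# Ordered categorical; the categories are inferred from the data.
s_ordered = pd.Series(values).astype(CategoricalDtype(ordered=True))

# Unordered categorical with an explicit, wider set of categories.
wide_dtype = CategoricalDtype(categories=list("abcdef"), ordered=False)
s_wide = pd.Series(values).astype(wide_dtype)

print(s_ordered.cat.ordered)         # True
print(list(s_wide.cat.categories))   # ['a', 'b', 'c', 'd', 'e', 'f']
```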

.. warning::

   This simple API change may have surprising effects if a user is relying on the previous default behavior implicitly. In particular,
   sorting operations with a ``Categorical`` will now raise an error::

      TypeError: Categorical is not ordered for operation argsort
      you can use .as_ordered() to change the Categorical to an ordered one

   The solution is to make 'A' orderable, e.g. ``df['A'] = df['A'].cat.as_ordered()``
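
The fix named in the warning can be sketched as below. The column name ``'A'`` comes from the text above, while the frame contents and the ``pd`` alias are hypothetical. Which operations require an ordering has changed across pandas versions, so this sketch uses ``min()``, on the assumption that the installed version still rejects it for an unordered categorical.

```python
import pandas as pd

# Hypothetical frame with an unordered categorical column 'A'.
df = pd.DataFrame({"A": pd.Categorical(["b", "a", "c"])})

# An order-dependent reduction on an unordered categorical raises TypeError.
try:
    df["A"].min()
    raised = False
except TypeError:
    raised = True

# Make 'A' orderable, as the warning suggests, and retry.
df["A"] = df["A"].cat.as_ordered()
smallest = df["A"].min()

print(raised, smallest)
```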


Indexing Changes
~~~~~~~~~~~~~~~~

.. _whatsnew_0160.api_breaking.indexing:

The behavior of a small subset of edge cases for using ``.loc`` has changed (:issue:`8613`). Furthermore, we have improved the content of the error messages that are raised:

- slicing with ``.loc`` where the start and/or stop bound is not found in the index is now allowed; this previously would raise a ``KeyError``. This makes the behavior the same as ``.ix`` in this case. This change is only for slicing, not when indexing with a single label.

  .. ipython:: python

     df = DataFrame(np.random.randn(5,4),
                    columns=list('ABCD'),
                    index=date_range('20130101',periods=5))
     df
     s = Series(range(5),[-2,-1,1,2,3])
     s

  Previous Behavior

  .. code-block:: python

     In [4]: df.loc['2013-01-02':'2013-01-10']
     KeyError: 'stop bound [2013-01-10] is not in the [index]'

     In [6]: s.loc[-10:3]
     KeyError: 'start bound [-10] is not the [index]'

  New Behavior

  .. ipython:: python

     df.loc['2013-01-02':'2013-01-10']
     s.loc[-10:3]

- allow slicing with float-like values on an integer index for ``.ix``. Previously this was only enabled for ``.loc``:

  Previous Behavior

  .. code-block:: python

     In [8]: s.ix[-1.0:2]
     TypeError: the slice start value [-1.0] is not a proper indexer for this index type (Int64Index)

  New Behavior

  .. ipython:: python

     s.ix[-1.0:2]

- provide a useful exception for indexing with an invalid type for that index when using ``.loc``. For example, using ``.loc`` with an integer (or a float) on an index of type ``DatetimeIndex``, ``PeriodIndex``, or ``TimedeltaIndex`` now raises a ``TypeError``:

  Previous Behavior

  .. code-block:: python

     In [4]: df.loc[2:3]
     KeyError: 'start bound [2] is not the [index]'

  New Behavior

  .. code-block:: python

     In [4]: df.loc[2:3]
     TypeError: Cannot do slice indexing on <class 'pandas.tseries.index.DatetimeIndex'> with <type 'int'> keys
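
The first change in the list above, ``.loc`` slicing with bounds that are missing from the index, can be sketched as follows. The frame mirrors the one in the notes but uses fixed values instead of random ones so the result is reproducible. The ``pd``/``np`` aliases and the variable names are illustrative assumptions, and the example assumes a sorted index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4),
                  columns=list("ABCD"),
                  index=pd.date_range("20130101", periods=5))
s = pd.Series(range(5), index=[-2, -1, 1, 2, 3])

# Slice bounds that are missing from a sorted index are tolerated.
late_rows = df.loc["2013-01-02":"2013-01-10"]  # stop bound is past the last date
all_rows = s.loc[-10:3]                        # start bound is below the first label

# Indexing with a single missing label is still an error.
try:
    s.loc[-10]
    single_label_raised = False
except KeyError:
    single_label_raised = True

print(len(late_rows), len(all_rows), single_label_raised)  # 4 5 True
```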