@@ -51,7 +51,7 @@ The categorical data type is useful in the following cases:
variable to a categorical variable will save some memory, see :ref:`here <categorical.memory>`.
* The lexical order of a variable is not the same as the logical order ("one", "two", "three").
By converting to a categorical and specifying an order on the categories, sorting and
- min/max will use the logical order instead of the lexical order.
+ min/max will use the logical order instead of the lexical order, see :ref:`here <categorical.sort>`.
* As a signal to other Python libraries that this column should be treated as a categorical
variable (e.g. to use suitable statistical methods or plot types).
@@ -265,9 +265,11 @@ or simply set the categories to a predefined scale, use :func:`Categorical.set_c
intentionally or because it is misspelled or (under Python 3) due to a type difference (e.g.,
NumPy's S1 dtype and Python strings). This can result in surprising behaviour!
- Ordered or not...
+ Sorting and Order
-----------------

+ .. _categorical.sort:
+
If categorical data is ordered (``s.cat.ordered == True``), then the order of the categories has a
meaning and certain operations are possible. If the categorical is unordered, a `TypeError` is
raised.
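
As an illustration of the ordered/unordered distinction described above, here is a minimal sketch. It assumes a recent pandas imported as ``pd`` and uses ``Series.sort_values``; the sample values and variable names are made up for the example and are not part of this change.

```python
import pandas as pd

# Ordered categorical: the category order (not alphabetical order)
# drives sorting and min/max.
sizes = pd.Series(
    pd.Categorical(
        ["two", "one", "three", "one"],
        categories=["one", "two", "three"],
        ordered=True,
    )
)
sorted_sizes = sizes.sort_values().tolist()
smallest, largest = sizes.min(), sizes.max()

# Unordered categorical: min/max have no meaning and raise TypeError.
unordered = pd.Series(pd.Categorical(["two", "one", "three"]))
try:
    unordered.min()
    unordered_error = None
except TypeError as exc:
    unordered_error = exc
```

A plain string sort of the same values would place "three" before "two", which is the lexical order that the ordered categorical avoids.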
@@ -296,9 +298,14 @@ This is even true for strings and numeric data:
s
s.min(), s.max()

+
+ Reordering
+ ~~~~~~~~~~
+
Reordering the categories is possible via the :func:`Categorical.reorder_categories` and
the :func:`Categorical.set_categories` methods. For :func:`Categorical.reorder_categories`, all
- old categories must be included in the new categories and no new categories are allowed.
+ old categories must be included in the new categories and no new categories are allowed. This will
+ necessarily make the sort order the same as the category order.

.. ipython:: python
@@ -324,6 +331,24 @@ old categories must be included in the new categories and no new categories are
(e.g. ``Series.median()``, which would need to compute the mean between two values if the length
of an array is even) do not work and raise a `TypeError`.
+ Multi Column Sorting
+ ~~~~~~~~~~~~~~~~~~~~
+
+ A categorical dtyped column will participate in a multi-column sort in a similar manner to other columns.
+ The ordering of the categorical is determined by the ``categories`` of that column.
+
+ .. ipython:: python
+
+    dfs = DataFrame({'A': Categorical(list('bbeebbaa'), categories=['e', 'a', 'b']),
+                     'B': [1, 2, 1, 2, 2, 1, 2, 1]})
+    dfs.sort(['A', 'B'])
+
+ Reordering the ``categories`` changes a future sort.
+
+ .. ipython:: python
+
+    dfs['C'] = dfs['A'].cat.reorder_categories(['a', 'b', 'e'])
+    dfs.sort(['C', 'B'])
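
For readers on newer pandas releases, where ``DataFrame.sort`` has been replaced by ``DataFrame.sort_values``, the same multi-column sort can be sketched as follows. This is an illustrative rewrite under that assumption, not part of the change itself; the names ``by_a``, ``by_c``, ``order_a`` and ``order_c`` are hypothetical.

```python
import pandas as pd

dfs = pd.DataFrame({
    "A": pd.Categorical(list("bbeebbaa"), categories=["e", "a", "b"]),
    "B": [1, 2, 1, 2, 2, 1, 2, 1],
})

# Column A sorts by its category order (e, a, b); ties are broken by B.
by_a = dfs.sort_values(["A", "B"])
order_a = by_a["A"].tolist()

# Reordering the categories changes how a later sort arranges the rows.
dfs["C"] = dfs["A"].cat.reorder_categories(["a", "b", "e"])
by_c = dfs.sort_values(["C", "B"])
order_c = by_c["C"].tolist()
```

Here ``order_a`` starts with the "e" rows because "e" is the first category of column A, while ``order_c`` starts with the "a" rows because column C carries the reordered categories.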
Comparisons
-----------