@@ -47,7 +47,7 @@ the `categories` array.
The categorical data type is useful in the following cases:
* A string variable consisting of only a few different values. Converting such a string
- variable to a categorical variable will save some memory, see :ref:`here<categorical.memory>`.
+ variable to a categorical variable will save some memory, see :ref:`here <categorical.memory>`.
* The lexical order of a variable is not the same as the logical order ("one", "two", "three").
By converting to a categorical and specifying an order on the categories, sorting and
min/max will use the logical order instead of the lexical order.
@@ -611,10 +611,13 @@ available ("missing value") or `np.nan` is a valid category.
pd.isnull(s)
s.fillna("a")
+ Gotchas
+ -------
+
.. _categorical.rfactor:
Differences to R's `factor`
- ---------------------------
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following differences to R's factor functions can be observed:
@@ -629,15 +632,11 @@ The following differences to R's factor functions can be observed:
new categorical series will *not* remove unused categories but create a new categorical series
which is equal to the passed in one!
-
- Gotchas
- -------
-
- .. _categorical.memory:
-
Memory Usage
~~~~~~~~~~~~
+ .. _categorical.memory:
+
The memory usage of a ``Categorical`` is proportional to the length of the categories times the length of the data. In contrast,
the an ``object`` dtype is a fixed function of the length of the data.
@@ -738,7 +737,7 @@ basic type) and applying along columns will also convert to object.
df.apply(lambda row: type(row["cats"]), axis=1)
df.apply(lambda col: col.dtype, axis=0)
- No categorical index
+ No Categorical Index
~~~~~~~~~~~~~~~~~~~~
There is currently no index of type ``category``, so setting the index to categorical column will
@@ -760,7 +759,7 @@ ordering of the categories:
https://github.com/pydata/pandas/issues/7629)
- Side effects
+ Side Effects
~~~~~~~~~~~~
Constructing a `Series` from a `Categorical` will not copy the input `Categorical`. This
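The two use cases listed in the first hunk (memory savings for a string variable with few distinct values, and sorting by logical rather than lexical order) can be sketched as follows. This is a minimal illustration, not part of the patch. It assumes a recent pandas release in which `pd.CategoricalDtype` is available; the release this diff targets predates that class. The sample data and the variable names are hypothetical; only the labels "one", "two", "three" come from the documented text.

```python
import pandas as pd

# Hypothetical sample data: three distinct labels repeated many times.
values = ["three", "one", "two", "one"] * 1000

# Plain Python strings stored in an object-dtype Series.
as_object = pd.Series(values, dtype=object)

# Assumption: pd.CategoricalDtype exists (recent pandas releases only).
# The categories are listed in their logical order and marked as ordered.
logical_order = pd.CategoricalDtype(
    categories=["one", "two", "three"], ordered=True
)
as_category = as_object.astype(logical_order)

# Use case 1: with few distinct values, the categorical stores small
# integer codes plus one copy of each label instead of repeated strings.
object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

# Use case 2: strings sort lexically, while the ordered categorical
# sorts (and computes min/max) by the declared category order.
lexical_unique = as_object.sort_values().unique().tolist()
logical_unique = as_category.sort_values().unique().tolist()

print(category_bytes < object_bytes)
print(lexical_unique)
print(logical_unique)
print(as_category.min(), as_category.max())
```

With this data the categorical Series should report a smaller deep memory footprint than the object Series, and the sorted unique values should come out as one, three, two for the strings but one, two, three for the categorical.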