WIP: categoricals as an internal CategoricalBlock GH5313 #7217

Merged
merged 2 commits into from
Jul 14, 2014
2 changes: 2 additions & 0 deletions .gitignore
@@ -88,3 +88,5 @@ doc/source/vbench
doc/source/vbench.rst
doc/source/index.rst
doc/build/html/index.html
# Windows specific leftover:
doc/tmp.sv
56 changes: 55 additions & 1 deletion doc/source/api.rst
@@ -429,7 +429,7 @@ Time series-related
Series.tz_localize

String handling
~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~
``Series.str`` can be used to access the values of the series as
strings and apply several methods to it. Due to implementation
details the methods show up here as methods of the
@@ -468,6 +468,60 @@ details the methods show up here as methods of the
StringMethods.upper
StringMethods.get_dummies

.. _api.categorical:

Categorical
~~~~~~~~~~~

.. currentmodule:: pandas.core.categorical

If the Series is of dtype ``category``, ``Series.cat`` can be used to access the underlying
``Categorical``. This data type is similar to the numpy array that would otherwise underlie the Series
and has the following methods and properties (all available as
``Series.cat.<method_or_property>``).


.. autosummary::
:toctree: generated/

Categorical
Categorical.from_codes
Categorical.levels
Categorical.ordered
Categorical.reorder_levels
Categorical.remove_unused_levels
Categorical.min
Categorical.max
Categorical.mode
Categorical.describe
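
As a rough illustration of the accessor described above, the sketch below builds a small ``category`` Series and reads a few properties through ``Series.cat``. It assumes a released pandas version, where the levels named in this PR are exposed as ``categories``; the sample values are made up.

```python
import pandas as pd

# Hypothetical sample data; dtype="category" stores the values as a Categorical.
s = pd.Series(["a", "b", "a"], dtype="category")

# In released pandas the "levels" from this PR are called "categories".
levels = list(s.cat.categories)   # distinct values
is_ordered = s.cat.ordered        # whether the categories carry an order
codes = s.cat.codes.tolist()      # integer position of each value in the categories
```

The integer codes are what the Categorical stores internally; the distinct values are kept once, separately.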

``np.asarray(categorical)`` works by implementing the array interface. Be aware that this converts
the Categorical back to a numpy array, so the levels and ordering information is not preserved!

.. autosummary::
:toctree: generated/

Categorical.__array__
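
The conversion described above can be sketched as follows. This is a minimal example with made-up integer data, written against a released pandas version (where levels are called ``categories``), not the exact code from this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical ordered categorical with an explicit, non-default category order.
cat = pd.Categorical([3, 1, 2, 3], categories=[3, 2, 1], ordered=True)

# np.asarray goes through Categorical.__array__ and returns a plain ndarray.
arr = np.asarray(cat)

# The plain array holds only the values; category and ordering metadata are gone.
has_metadata = hasattr(arr, "categories")
```

If the metadata matters downstream, it seems safer to pass the Categorical (or the Series holding it) around and convert only at the last step.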

For compatibility with ``pandas.Series`` and ``numpy`` arrays, the following (non-API) methods
are also introduced.

.. autosummary::
:toctree: generated/

Categorical.from_array
Categorical.get_values
Categorical.copy
Categorical.dtype
Categorical.ndim
Categorical.sort
Categorical.equals
Categorical.unique
Categorical.order
Categorical.argsort
Categorical.fillna


Plotting
~~~~~~~~
.. currentmodule:: pandas
8 changes: 7 additions & 1 deletion doc/source/basics.rst
@@ -1574,7 +1574,8 @@ dtypes:
'float64': np.arange(4.0, 7.0),
'bool1': [True, False, True],
'bool2': [False, True, False],
'dates': pd.date_range('now', periods=3).values})
'dates': pd.date_range('now', periods=3).values,
'category': pd.Categorical(list("ABC"))})
df['tdeltas'] = df.dates.diff()
df['uint64'] = np.arange(3, 6).astype('u8')
df['other_dates'] = pd.date_range('20130101', periods=3).values
@@ -1630,6 +1631,11 @@ All numpy dtypes are subclasses of ``numpy.generic``:

subdtypes(np.generic)

.. note::

Pandas also defines an additional ``category`` dtype, which is not integrated into the normal
numpy hierarchy and won't show up with the above function.
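
A small sketch of what this note means in practice, assuming a released pandas version and using hypothetical column names: the ``category`` dtype is a pandas-level object rather than a ``numpy.dtype``, but it can still be selected by name.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one ordinary float column and one categorical column.
df = pd.DataFrame({"num": [1.0, 2.0, 3.0],
                   "cat": pd.Categorical(list("ABC"))})

cat_dtype = df["cat"].dtype

# The category dtype is not a numpy dtype, so numpy's hierarchy does not list it.
is_numpy_dtype = isinstance(cat_dtype, np.dtype)

# It can still be addressed by its string name.
cat_columns = df.select_dtypes(include=["category"]).columns.tolist()
```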

.. note::

The ``include`` and ``exclude`` parameters must be non-string sequences.