Commit b870dee

pilkibun authored and jorisvandenbossche committed
DOC: tweak paragraph regarding cut and IntervalIndex (#27132)
1 parent f58a1fe commit b870dee

1 file changed, +8 -3 lines changed

doc/source/user_guide/advanced.rst

@@ -965,21 +965,26 @@ If you select a label *contained* within an interval, this will also select the
    df.loc[2.5]
    df.loc[[2.5, 3.5]]
 
-``Interval`` and ``IntervalIndex`` are used by ``cut`` and ``qcut``:
+:func:`cut` and :func:`qcut` both return a ``Categorical`` object, and the bins they
+create are stored as an ``IntervalIndex`` in its ``.categories`` attribute.
 
 .. ipython:: python
 
    c = pd.cut(range(4), bins=2)
    c
    c.categories
 
-Furthermore, ``IntervalIndex`` allows one to bin *other* data with these same
-bins, with ``NaN`` representing a missing value similar to other dtypes.
+:func:`cut` also accepts an ``IntervalIndex`` for its ``bins`` argument, which enables
+a useful pandas idiom. First, we call :func:`cut` with some data and ``bins`` set to a
+fixed number, to generate the bins. Then, we pass the values of ``.categories`` as the
+``bins`` argument in subsequent calls to :func:`cut`, supplying new data which will be
+binned into the same bins.
 
 .. ipython:: python
 
    pd.cut([0, 3, 5, 1], bins=c.categories)
 
+Any value which falls outside all bins will be assigned a ``NaN`` value.
 
 Generating ranges of intervals
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
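
For readers who want to try the idiom described in the revised paragraph outside the docs build, here is a minimal standalone sketch. It assumes a reasonably recent pandas; the variable names (``reference``, ``shared_bins``, ``rebinned``, ``outside``) are illustrative and are not part of the commit.

```python
import pandas as pd

# Step 1: bin some reference data into a fixed number of bins.
# The result is a Categorical whose categories are an IntervalIndex.
reference = pd.cut(range(4), bins=2)
shared_bins = reference.categories

# Step 2: reuse those exact bins for new data.
rebinned = pd.cut([0, 3, 5, 1], bins=shared_bins)

# Values outside every bin (here, 5) are missing in the result.
outside = pd.isna(rebinned)

print(shared_bins)
print(rebinned)
print(outside)
```

Because the second call receives the ``IntervalIndex`` itself rather than a bin count, the bin edges are not recomputed from the new data, which is why the value 5 lands in no bin.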
