Commit 65a9976

committed
DOC: v0.14.0 and indexing doc updates for mi slicing
DOC: release notes and issues for mi_slicing
1 parent de84842 commit 65a9976

File tree

3 files changed

+202
-26
lines changed

doc/source/indexing.rst

+115-26
@@ -426,14 +426,14 @@ python/numpy allow slicing past the end of an array without an associated error.
426426
values. A single indexer that is out-of-bounds and drops the dimensions of the object will still raise
427427
``IndexError`` (:issue:`6296`). This could result in an empty axis (e.g. an empty DataFrame being returned)
428428

429-
.. ipython:: python
429+
.. ipython:: python
430430
431-
df = DataFrame(np.random.randn(5,2),columns=list('AB'))
432-
df
433-
df.iloc[[4,5,6]]
434-
df.iloc[4:6]
435-
df.iloc[:,2:3]
436-
df.iloc[:,1:3]
431+
dfl = DataFrame(np.random.randn(5,2),columns=list('AB'))
432+
dfl
433+
dfl.iloc[[4,5,6]]
434+
dfl.iloc[4:6]
435+
dfl.iloc[:,2:3]
436+
dfl.iloc[:,1:3]
437437
438438
.. _indexing.basics.partial_setting:
439439

@@ -1684,47 +1684,122 @@ of tuples:
16841684
Advanced indexing with hierarchical index
16851685
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
16861686

1687-
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.ix`` is a
1687+
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc/.ix`` is a
16881688
bit challenging, but we've made every effort to do so. For example, the
16891689
following works as you would expect:
16901690

16911691
.. ipython:: python
16921692
16931693
df = df.T
16941694
df
1695-
df.ix['bar']
1696-
df.ix['bar', 'two']
1695+
df.loc['bar']
1696+
df.loc['bar', 'two']
16971697
1698-
"Partial" slicing also works quite nicely for the topmost level:
1698+
"Partial" slicing also works quite nicely.
16991699

17001700
.. ipython:: python
17011701
1702-
df.ix['baz':'foo']
1702+
df.loc['baz':'foo']
17031703
1704-
But lower levels cannot be sliced in this way, because the MultiIndex uses
1705-
its multiple index dimensions to slice along one dimension of your object:
1704+
You can slice with a 'range' of values, by providing a slice of tuples.
17061705

17071706
.. ipython:: python
17081707
1709-
df.ix[('baz', 'two'):('qux', 'one')]
1710-
df.ix[('baz', 'two'):'foo']
1708+
df.loc[('baz', 'two'):('qux', 'one')]
1709+
df.loc[('baz', 'two'):'foo']
17111710
17121711
Passing a list of labels or tuples works similarly to reindexing:
17131712

17141713
.. ipython:: python
17151714
17161715
df.ix[[('bar', 'two'), ('qux', 'one')]]
17171716
1718-
The following does not work, and it's not clear if it should or not:
1717+
.. _indexing.mi_slicers:
17191718

1720-
::
1719+
Multiindexing using slicers
1720+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1721+
1722+
.. versionadded:: 0.14.0
1723+
1724+
In 0.14.0 we added a new way to slice multi-indexed objects.
1725+
You can slice a multi-index by providing multiple indexers.
1726+
1727+
You can provide any of the selectors as if you are indexing by label, see :ref:`Selection by Label <indexing.label>`,
1728+
including slices, lists of labels, labels, and boolean indexers.
1729+
1730+
You can use ``slice(None)`` to select all the contents of *that* level. You do not need to specify all the
1731+
*deeper* levels; they will be implied as ``slice(None)``.
1732+
1733+
As usual, **both sides** of the slicers are included as this is label indexing.
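
As a rough illustration of the rules above (inclusive label slices, ``slice(None)`` for a whole level, and implied deeper levels), here is a minimal sketch. The frame, its labels, and the variable names are made up for this example, and it assumes a recent pandas release where ``sort_index`` is available.

```python
import numpy as np
import pandas as pd

# Hypothetical three-level index; the labels are illustrative only.
index = pd.MultiIndex.from_product([['A0', 'A1', 'A2', 'A3'],
                                    ['B0', 'B1'],
                                    ['C0', 'C1']])
frame = pd.DataFrame({'val': np.arange(len(index))}, index=index).sort_index()

# Label slices include both endpoints, so A1, A2 and A3 are all selected.
# Only two of the three levels are given; the third is implied as slice(None).
inclusive = frame.loc[(slice('A1', 'A3'), slice(None)), :]

# slice(None) selects everything on the first level; 'B1' filters the second.
all_b1 = frame.loc[(slice(None), 'B1'), :]
```

The trailing ``:`` for the columns follows the recommendation in the warning that comes next in this section.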
1734+
1735+
.. warning::
1736+
1737+
You should specify all axes in the ``.loc`` specifier, meaning the indexer for the **index** and
1738+
for the **columns**. There are some ambiguous cases where the passed indexer could be misinterpreted
1739+
as indexing *both* axes, rather than into, say, the MultiIndex for the rows.
1740+
1741+
You should do this:
1742+
1743+
.. code-block:: python
1744+
1745+
df.loc[(slice('A1','A3'),.....),:]
1746+
1747+
rather than this:
1748+
1749+
.. code-block:: python
1750+
1751+
df.loc[(slice('A1','A3'),.....)]
1752+
1753+
.. warning::
1754+
1755+
You will need to make sure that the selection axes are fully lexsorted!
1756+
1757+
.. ipython:: python
1758+
1759+
def mklbl(prefix,n):
1760+
return ["%s%s" % (prefix,i) for i in range(n)]
1761+
1762+
miindex = MultiIndex.from_product([mklbl('A',4),
1763+
mklbl('B',2),
1764+
mklbl('C',4),
1765+
mklbl('D',2)])
1766+
micolumns = MultiIndex.from_tuples([('a','foo'),('a','bar'),
1767+
('b','foo'),('b','bah')],
1768+
names=['lvl0', 'lvl1'])
1769+
dfmi = DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
1770+
index=miindex,
1771+
columns=micolumns).sortlevel().sortlevel(axis=1)
1772+
dfmi
1773+
1774+
.. ipython:: python
1775+
1776+
dfmi.loc[(slice('A1','A3'),slice(None), ['C1','C3']),:]
1777+
dfmi.loc[(slice(None),slice(None), ['C1','C3']),:]
17211778
1722-
>>> df.ix[['bar', 'qux']]
1779+
It is possible to perform quite complicated selections using this method on multiple
1780+
axes at the same time.
17231781
1724-
The code for implementing ``.ix`` makes every attempt to "do the right thing"
1725-
but as you use it you may uncover corner cases or unintuitive behavior. If you
1726-
do find something like this, do not hesitate to report the issue or ask on the
1727-
mailing list.
1782+
.. ipython:: python
1783+
1784+
dfmi.loc['A1',(slice(None),'foo')]
1785+
dfmi.loc[(slice(None),slice(None), ['C1','C3']),(slice(None),'foo')]
1786+
dfmi.loc[(dfmi[('a','foo')]>200,slice(None), ['C1','C3']),(slice(None),'foo')]
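
To make the two-axis selection easier to follow, here is a smaller sketch with hypothetical labels and values: one tuple of indexers for the rows and one for the columns. It assumes a recent pandas release (``sort_index`` rather than ``sortlevel``).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a MultiIndex on both axes.
index = pd.MultiIndex.from_product([['A0', 'A1'], ['C0', 'C1']])
columns = pd.MultiIndex.from_tuples([('a', 'bar'), ('a', 'foo'), ('b', 'foo')],
                                    names=['lvl0', 'lvl1'])
frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=index, columns=columns)
frame = frame.sort_index().sort_index(axis=1)

# Rows: every first-level label, but only 'C1' on the second level.
# Columns: every lvl0 label, but only 'foo' on lvl1.
both = frame.loc[(slice(None), ['C1']), (slice(None), 'foo')]
```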
1787+
1788+
Furthermore, you can *set* the values using these methods.
1789+
1790+
.. ipython:: python
1791+
1792+
df2 = dfmi.copy()
1793+
df2.loc[(slice(None),slice(None), ['C1','C3']),:] = -10
1794+
df2
1795+
1796+
You can use a right-hand side of an alignable object as well.
1797+
1798+
.. ipython:: python
1799+
1800+
df2 = dfmi.copy()
1801+
df2.loc[(slice(None),slice(None), ['C1','C3']),:] = df2*1000
1802+
df2
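
The two assignment styles above can be compared on a small frame. This is only a sketch with made-up data, assuming a recent pandas release. The point is that a scalar is broadcast over the selection, while an alignable right-hand side is matched to the selected rows by label.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows (A0,C0), (A0,C1), (A1,C0), (A1,C1).
index = pd.MultiIndex.from_product([['A0', 'A1'], ['C0', 'C1']])
frame = pd.DataFrame(np.arange(8).reshape(4, 2),
                     index=index, columns=['x', 'y']).sort_index()

# Scalar right-hand side: every selected cell becomes -10.
scalar_set = frame.copy()
scalar_set.loc[(slice(None), ['C1']), :] = -10

# Alignable right-hand side: only the selected 'C1' rows pick up
# their own values multiplied by 1000; other rows are left untouched.
aligned_set = frame.copy()
aligned_set.loc[(slice(None), ['C1']), :] = aligned_set * 1000
```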
17281803
17291804
.. _indexing.xs:
17301805
@@ -1738,6 +1813,11 @@ selecting data at a particular level of a MultiIndex easier.
17381813
17391814
df.xs('one', level='second')
17401815
1816+
.. ipython:: python
1817+
1818+
# using the slicers (new in 0.14.0)
1819+
df.loc[(slice(None),'one'),:]
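
A small sketch of how the two spellings relate, using hypothetical data and assuming a recent pandas release: both pick the same rows, but ``xs`` drops the selected level by default while the slicer form keeps it.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level frame with named levels.
index = pd.MultiIndex.from_product([['bar', 'baz'], ['one', 'two']],
                                   names=['first', 'second'])
frame = pd.DataFrame({'val': np.arange(4)}, index=index)

# Cross-section: the 'second' level is dropped from the result.
via_xs = frame.xs('one', level='second')

# Slicer form: same rows, but both index levels are retained.
via_loc = frame.loc[(slice(None), 'one'), :]
```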
1820+
17411821
You can also select on the columns with :meth:`~pandas.MultiIndex.xs`, by
17421822
providing the axis argument
17431823
@@ -1746,29 +1826,38 @@ providing the axis argument
17461826
df = df.T
17471827
df.xs('one', level='second', axis=1)
17481828
1829+
.. ipython:: python
1830+
1831+
# using the slicers (new in 0.14.0)
1832+
df.loc[:,(slice(None),'one')]
1833+
17491834
:meth:`~pandas.MultiIndex.xs` also allows selection with multiple keys
17501835
17511836
.. ipython:: python
17521837
17531838
df.xs(('one', 'bar'), level=('second', 'first'), axis=1)
17541839
1840+
.. ipython:: python
1841+
1842+
# using the slicers (new in 0.14.0)
1843+
df.loc[:,('bar','one')]
17551844
17561845
.. versionadded:: 0.13.0
17571846
17581847
You can pass ``drop_level=False`` to :meth:`~pandas.MultiIndex.xs` to retain
17591848
the level that was selected
17601849
1761-
.. ipython::
1850+
.. ipython:: python
17621851
17631852
df.xs('one', level='second', axis=1, drop_level=False)
17641853
17651854
versus the result with ``drop_level=True`` (the default value)
17661855
1767-
.. ipython::
1856+
.. ipython:: python
17681857
17691858
df.xs('one', level='second', axis=1, drop_level=True)
17701859
1771-
.. ipython::
1860+
.. ipython:: python
17721861
:suppress:
17731862
17741863
df = df.T

doc/source/release.rst

+1
@@ -74,6 +74,7 @@ Improvements to existing features
7474
the func (:issue:`6289`)
7575
- ``plot(legend='reverse')`` will now reverse the order of legend labels for most plot kinds.
7676
(:issue:`6014`)
77+
- Allow multi-index slicers (:issue:`6134`, :issue:`4036`, :issue:`3057`, :issue:`2598`, :issue:`5641`)
7778

7879
.. _release.bug_fixes-0.14.0:
7980

doc/source/v0.14.0.txt

+86
@@ -29,6 +29,92 @@ API changes
2929
df.iloc[:,2:3]
3030
df.iloc[:,1:3]
3131

32+
MultiIndexing Using Slicers
33+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
34+
35+
In 0.14.0 we added a new way to slice multi-indexed objects.
36+
You can slice a multi-index by providing multiple indexers.
37+
38+
You can provide any of the selectors as if you are indexing by label, see :ref:`Selection by Label <indexing.label>`,
39+
including slices, lists of labels, labels, and boolean indexers.
40+
41+
You can use ``slice(None)`` to select all the contents of *that* level. You do not need to specify all the
42+
*deeper* levels; they will be implied as ``slice(None)``.
43+
44+
As usual, **both sides** of the slicers are included as this is label indexing.
45+
46+
See :ref:`the docs<indexing.mi_slicers>`
47+
See also issues (:issue:`6134`, :issue:`4036`, :issue:`3057`, :issue:`2598`, :issue:`5641`)
48+
49+
.. warning::
50+
51+
You should specify all axes in the ``.loc`` specifier, meaning the indexer for the **index** and
52+
for the **columns**. There are some ambiguous cases where the passed indexer could be misinterpreted
53+
as indexing *both* axes, rather than into, say, the MultiIndex for the rows.
54+
55+
You should do this:
56+
57+
.. code-block:: python
58+
59+
df.loc[(slice('A1','A3'),.....),:]
60+
61+
rather than this:
62+
63+
.. code-block:: python
64+
65+
df.loc[(slice('A1','A3'),.....)]
66+
67+
.. warning::
68+
69+
You will need to make sure that the selection axes are fully lexsorted!
70+
71+
.. ipython:: python
72+
73+
def mklbl(prefix,n):
74+
return ["%s%s" % (prefix,i) for i in range(n)]
75+
76+
index = MultiIndex.from_product([mklbl('A',4),
77+
mklbl('B',2),
78+
mklbl('C',4),
79+
mklbl('D',2)])
80+
columns = MultiIndex.from_tuples([('a','foo'),('a','bar'),
81+
('b','foo'),('b','bah')],
82+
names=['lvl0', 'lvl1'])
83+
df = DataFrame(np.arange(len(index)*len(columns)).reshape((len(index),len(columns))),
84+
index=index,
85+
columns=columns).sortlevel().sortlevel(axis=1)
86+
df
87+
88+
.. ipython:: python
89+
90+
df.loc[(slice('A1','A3'),slice(None), ['C1','C3']),:]
91+
df.loc[(slice(None),slice(None), ['C1','C3']),:]
92+
93+
It is possible to perform quite complicated selections using this method on multiple
94+
axes at the same time.
95+
96+
.. ipython:: python
97+
98+
df.loc['A1',(slice(None),'foo')]
99+
df.loc[(slice(None),slice(None), ['C1','C3']),(slice(None),'foo')]
100+
df.loc[(df[('a','foo')]>200,slice(None), ['C1','C3']),(slice(None),'foo')]
101+
102+
Furthermore, you can *set* the values using these methods.
103+
104+
.. ipython:: python
105+
106+
df2 = df.copy()
107+
df2.loc[(slice(None),slice(None), ['C1','C3']),:] = -10
108+
df2
109+
110+
You can use a right-hand side of an alignable object as well.
111+
112+
.. ipython:: python
113+
114+
df2 = df.copy()
115+
df2.loc[(slice(None),slice(None), ['C1','C3']),:] = df2*1000
116+
df2
117+
32118
Prior Version Deprecations/Changes
33119
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34120
