@@ -296,10 +296,6 @@ As usual, **both sides** of the slicers are included as this is label indexing.
 
    df.loc[(slice('A1', 'A3'), .....)]
 
-.. warning::
-
-   You will need to make sure that the selection axes are fully lexsorted!
-
 .. ipython:: python
 
    def mklbl(prefix, n):
@@ -477,31 +473,24 @@ allowing you to permute the hierarchical index levels in one step:
 
    df[:5].reorder_levels([1, 0], axis=0)
 
-The need for sortedness with :class:`~pandas.MultiIndex`
---------------------------------------------------------
+Sorting a :class:`~pandas.MultiIndex`
+-------------------------------------
 
-**Caveat emptor**: the present implementation of ``MultiIndex`` requires that
-the labels be sorted for some of the slicing / indexing routines to work
-correctly. You can think about breaking the axis into unique groups, where at
-the hierarchical level of interest, each distinct group shares a label, but no
-two have the same label. However, the ``MultiIndex`` does not enforce this:
-**you are responsible for ensuring that things are properly sorted**. There is
-an important new method ``sort_index`` to sort an axis within a ``MultiIndex``
-so that its labels are grouped and sorted by the original ordering of the
-associated factor at that level. Note that this does not necessarily mean the
-labels will be sorted lexicographically!
+For MultiIndex-ed objects to be indexed & sliced effectively, they need
+to be sorted. As with any index, you can use ``sort_index``.
 
 .. ipython:: python
 
    import random; random.shuffle(tuples)
    s = pd.Series(np.random.randn(8), index=pd.MultiIndex.from_tuples(tuples))
    s
+   s.sort_index()
    s.sort_index(level=0)
    s.sort_index(level=1)
 
 .. _advanced.sortlevel_byname:
 
-Note, you may also pass a level name to ``sort_index`` if the MultiIndex levels
+You may also pass a level name to ``sort_index`` if the MultiIndex levels
 are named.
 
 .. ipython:: python
@@ -510,46 +499,48 @@ are named.
    s.sort_index(level='L1')
    s.sort_index(level='L2')
 
-Some indexing will work even if the data are not sorted, but will be rather
-inefficient and will also return a copy of the data rather than a view:
-
-.. ipython:: python
-
-   s['qux']
-   s.sort_index(level=1)['qux']
-
 On higher dimensional objects, you can sort any of the other axes by level if
 they have a MultiIndex:
 
 .. ipython:: python
 
    df.T.sort_index(level=1, axis=1)
 
-The ``MultiIndex`` object has code to **explicitly check the sort depth**. Thus,
-if you try to index at a depth at which the index is not sorted, it will raise
-an exception. Here is a concrete example to illustrate this:
+Indexing will work even if the data are not sorted, but will be rather
+inefficient (and show a ``PerformanceWarning``). It will also
+return a copy of the data rather than a view:
 
 .. ipython:: python
 
-   tuples = [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
-   idx = pd.MultiIndex.from_tuples(tuples)
-   idx.lexsort_depth
+   dfm = pd.DataFrame({'jim': [0, 0, 1, 1],
+                       'joe': ['x', 'x', 'z', 'y'],
+                       'jolie': np.random.rand(4)})
+   dfm = dfm.set_index(['jim', 'joe'])
+   dfm
+
+.. code-block:: ipython
+
+   In [4]: dfm.loc[(1, 'z')]
+   PerformanceWarning: indexing past lexsort depth may impact performance.
 
-   reordered = idx[[1, 0, 3, 2]]
-   reordered.lexsort_depth
+   Out[4]:
+              jolie
+   jim joe
+   1   z    0.64094
 
-   s = pd.Series(np.random.randn(4), index=reordered)
-   s.ix['a':'a']
+The ``is_lexsorted()`` method on an ``Index`` shows whether the index is sorted, and the ``lexsort_depth`` property returns the sort depth:
 
-However:
+.. ipython:: python
 
-::
+   dfm.index.is_lexsorted()
+   dfm.index.lexsort_depth
 
-   >>> s.ix[('a', 'b'):('b', 'a')]
-   Traceback (most recent call last)
-   ...
-   KeyError: Key length (3) was greater than MultiIndex lexsort depth (2)
+.. ipython:: python
 
+   dfm = dfm.sort_index()
+   dfm
+   dfm.index.is_lexsorted()
+   dfm.index.lexsort_depth
 
 Take Methods
 ------------
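For anyone wanting to try the behavior described in the added text outside the docs build, here is a minimal standalone sketch. It is not part of the diff. The series `s` and its tuples are hypothetical data chosen for illustration; `dfm` mirrors the frame added above. The sortedness check uses `is_monotonic_increasing` as a stand-in for `is_lexsorted()`, on the assumption that the latter may not be available in every pandas release.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only for illustration.
index = pd.MultiIndex.from_tuples(
    [('qux', 'one'), ('bar', 'two'), ('foo', 'one'), ('bar', 'one')],
    names=['L1', 'L2'])
s = pd.Series(np.arange(4), index=index)

# Sort by a named level; the remaining level breaks ties because
# sort_remaining defaults to True.
by_l1 = s.sort_index(level='L1')
by_l2 = s.sort_index(level='L2')

# Same frame as in the added docs, deliberately built unsorted.
dfm = pd.DataFrame({'jim': [0, 0, 1, 1],
                    'joe': ['x', 'x', 'z', 'y'],
                    'jolie': np.random.rand(4)}).set_index(['jim', 'joe'])

# Stand-in sortedness check (assumption: used instead of is_lexsorted()).
sorted_before = dfm.index.is_monotonic_increasing
dfm_sorted = dfm.sort_index()
sorted_after = dfm_sorted.index.is_monotonic_increasing

print(list(by_l1.index))
print(list(by_l2.index))
print(sorted_before, sorted_after)
```

Sorting once with `sort_index()` and reusing the sorted object is the pattern the new text recommends, since lookups on the unsorted index may emit a `PerformanceWarning`.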