Commit 786fed7

DOC: gotcha docs re: #656
1 parent c1372bc commit 786fed7

3 files changed, +59 -3 lines changed

TODO.rst

+3-2
@@ -1,7 +1,8 @@
 DOCS 0.7.0
 ----------
-- no sort in groupby
-- concat with dict
+- ??? no sort in groupby
+- DONE concat with dict
+- Gotchas re: integer indexing
 
 DONE
 ----

doc/source/conf.py

+1-1
@@ -61,7 +61,7 @@
 
 # General information about the project.
 project = u'pandas'
-copyright = u'2008-2011, AQR and Wes McKinney'
+copyright = u'2008-2011, the pandas development team'
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the

doc/source/gotchas.rst

+55
@@ -80,3 +80,58 @@ specific dates. To enable this, we made the design design to make label-based sl
 This is most definitely a "practicality beats purity" sort of thing, but it is
 something to watch out for if you expect label-based slicing to behave exactly
 in the way that standard Python integer slicing works.
+
+Miscellaneous indexing gotchas
+------------------------------
+
+Reindex versus ix gotchas
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many users will find themselves using the ``ix`` indexing capabilities as a
+concise means of selecting data from a pandas object:
+
+.. ipython:: python
+
+   df = DataFrame(randn(6, 4), columns=['one', 'two', 'three', 'four'],
+                  index=list('abcdef'))
+   df
+   df.ix[['b', 'c', 'e']]
+
+This is, of course, completely equivalent *in this case* to using the
+``reindex`` method:
+
+.. ipython:: python
+
+   df.reindex(['b', 'c', 'e'])
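
The equivalence claimed here can be checked directly. The sketch below is not part of the commit. It assumes a recent pandas release, where ``ix`` has since been removed and label-based selection is written with ``loc``, and it assumes NumPy's ``default_rng`` in place of the bare ``randn`` used in the docs.

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator stands in for the docs' bare ``randn``.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 4)),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

# Label-based selection; ``loc`` is the modern spelling of this use of ``ix``.
by_loc = df.loc[['b', 'c', 'e']]

# The same labels passed to ``reindex``.
by_reindex = df.reindex(['b', 'c', 'e'])

# Same rows, same order, same values when every label exists in the index.
print(by_loc.equals(by_reindex))
```

The two calls agree only because every requested label is present in the index.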
+
+Some might conclude that ``ix`` and ``reindex`` are 100% equivalent based on
+this. This is indeed true **except in the case of integer indexing**. For
+example, the above operation could alternately have been expressed as:
+
+.. ipython:: python
+
+   df.ix[[1, 2, 4]]
+
+If you pass ``[1, 2, 4]`` to ``reindex`` you will get another thing entirely:
+
+.. ipython:: python
+
+   df.reindex([1, 2, 4])
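
A minimal sketch of the divergence, again assuming a recent pandas release without ``ix``. The positional lookup that ``ix`` performed on this string-labelled frame is written explicitly as ``iloc`` here, which is an assumption about the closest modern equivalent rather than what the commit itself uses.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((6, 4)),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

# Positional selection: rows at positions 1, 2 and 4, i.e. labels b, c, e.
by_position = df.iloc[[1, 2, 4]]

# Label selection: none of the labels 1, 2, 4 exist in the index, so
# ``reindex`` creates three new rows filled with NaN.
by_label = df.reindex([1, 2, 4])

print(by_position.index.tolist())
print(by_label.index.tolist())
print(by_label.isna().all().all())
```

``reindex`` does not raise for missing labels; it silently returns rows of missing values, which is why the mistake can go unnoticed.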
+
+So it's important to remember that ``reindex`` is **strict label indexing
+only**. This can lead to some potentially surprising results in pathological
+cases where an index contains, say, both integers and strings:
+
+.. ipython:: python
+
+   s = Series([1, 2, 3], index=['a', 0, 1])
+   s
+   s.ix[[0, 1]]
+   s.reindex([0, 1])
+
+Because the index in this case does not contain solely integers, ``ix`` falls
+back on integer indexing. By contrast, ``reindex`` only looks for the values
+passed in the index, thus finding the integers ``0`` and ``1``. While it would
+be possible to insert some logic to check whether a passed sequence is all
+contained in the index, that logic would exact a very high cost in large data
+sets.
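
The mixed-index case can be reproduced in a pandas release without ``ix`` by writing the positional fallback explicitly as ``iloc``. The helper ``all_labels_present`` below is hypothetical and not part of pandas. It only illustrates the kind of containment check the paragraph mentions, and makes no claim about how such a check would have been implemented or what it would cost.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['a', 0, 1])

# Positional lookup (what ``ix`` fell back to here): the first two rows,
# whose labels are 'a' and 0.
by_position = s.iloc[[0, 1]]

# Strict label lookup: the rows labelled 0 and 1.
by_label = s.reindex([0, 1])


def all_labels_present(index, keys):
    """Hypothetical helper: True if every key is a label in ``index``."""
    return all(key in index for key in keys)


print(by_position.tolist(), by_position.index.tolist())
print(by_label.tolist(), by_label.index.tolist())
print(all_labels_present(s.index, [0, 1]))
print(all_labels_present(s.index, [0, 2]))
```

The same integers select different rows depending on whether they are read as positions or as labels, so being explicit about which one is meant avoids the ambiguity.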
