Commit e06d7a8

DOC: docs to add detail on gotchas for true value testing
1 parent e47e981 commit e06d7a8

4 files changed

+119
-5
lines changed

doc/source/10min.rst

+17-1
@@ -269,7 +269,6 @@ A ``where`` operation for getting.
269269
270270
df[df > 0]
271271
272-
273272
Setting
274273
~~~~~~~
275274

@@ -708,3 +707,20 @@ Reading from an excel file
708707
:suppress:
709708
710709
os.remove('foo.xlsx')
710+
711+
Gotchas
712+
-------
713+
714+
If you are trying an operation and you see an exception like:
715+
716+
.. code-block:: python
717+
718+
>>> if pd.Series([False, True, False]):
719+
...     print("I was true")
720+
Traceback
721+
...
722+
ValueError: The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().
723+
724+
See :ref:`Comparisons<basics.compare>` for an explanation and what to do.
725+
726+
See :ref:`Gotchas<gotchas>` as well.

doc/source/basics.rst

+48-2
@@ -8,7 +8,7 @@
88
from pandas import *
99
randn = np.random.randn
1010
np.set_printoptions(precision=4, suppress=True)
11-
from pandas.compat import lrange
11+
from pandas.compat import lrange
1212
1313
==============================
1414
Essential Basic Functionality
@@ -198,16 +198,62 @@ replace NaN with some other value using ``fillna`` if you wish).
198198
199199
Flexible Comparisons
200200
~~~~~~~~~~~~~~~~~~~~
201+
202+
.. _basics.compare:
203+
201204
Starting in v0.8, pandas introduced binary comparison methods eq, ne, lt, gt,
202205
le, and ge to Series and DataFrame whose behavior is analogous to the binary
203206
arithmetic operations described above:
204207

205208
.. ipython:: python
206209
207210
df.gt(df2)
208-
209211
df2.ne(df)
210212
213+
These operations produce a pandas object the same type as the left-hand-side input
214+
that is of dtype ``bool``. These ``boolean`` objects can be used in indexing operations,
215+
see :ref:`here<indexing.boolean>`.
216+
217+
Furthermore, you can apply the reductions ``any()`` and ``all()`` to provide a
218+
way to summarize these results.
219+
220+
.. ipython:: python
221+
222+
(df>0).all()
223+
(df>0).any()
224+
225+
Finally, you can test whether a pandas object is empty via the ``empty`` property.
226+
227+
.. ipython:: python
228+
229+
df.empty
230+
DataFrame(columns=list('ABC')).empty
231+
232+
.. warning::
233+
234+
You might be tempted to do the following:
235+
236+
.. code-block:: python
237+
238+
>>> if df:
239+
...
240+
241+
Or
242+
243+
.. code-block:: python
244+
245+
>>> df and df2
246+
247+
Both will raise, because you are trying to reduce multiple values to a single ``bool``.
248+
249+
.. code-block:: python
250+
251+
ValueError: The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().
252+
253+
254+
See :ref:`gotchas<gotchas.truth>` for a more detailed discussion.
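
As a minimal sketch of the alternatives named in the warning above, the snippet below builds a small, made-up ``DataFrame`` (the column names and values are assumptions chosen for illustration) and shows the explicit reductions next to the failing truth test. It assumes pandas is importable as ``pd``; the exact wording of the error message may differ between pandas versions, so only the exception type is checked.

```python
import pandas as pd

# Hypothetical example data, not taken from the documentation above.
df = pd.DataFrame({"A": [1, -2, 3], "B": [4, 5, 6]})

# Asking for the truth value of the whole frame is ambiguous and raises.
try:
    bool(df)
    raised = False
except ValueError:
    raised = True

# Explicit alternatives: reduce the boolean frame, or test for emptiness.
positive = df > 0
all_by_column = positive.all()               # one bool per column
any_by_column = positive.any()               # one bool per column
everything_positive = positive.all().all()   # a single scalar answer
is_empty = df.empty
empty_frame = pd.DataFrame(columns=list("ABC")).empty
```

Chaining ``.all().all()`` is one way to collapse a ``DataFrame`` to a single answer that can safely be used in an ``if`` statement.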
255+
256+
211257
Combining overlapping data sets
212258
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
213259

doc/source/gotchas.rst

+53-1
@@ -15,6 +15,58 @@
1515
Caveats and Gotchas
1616
*******************
1717

18+
Using If/Truth Statements with Pandas
19+
-------------------------------------
20+
21+
.. _gotchas.truth:
22+
23+
pandas follows the numpy convention of raising an error when you try to convert a pandas object such as a ``Series`` or ``DataFrame`` to a ``bool``.
24+
This happens in an ``if`` statement or when using the boolean operations ``and``, ``or``, or ``not``. It is not clear
25+
what the result of
26+
27+
.. code-block:: python
28+
29+
>>> if pd.Series([False, True, False]):
30+
...
31+
32+
should be. Should it be ``True`` because it's not zero-length? ``False`` because there are ``False`` values?
33+
It is unclear, so instead, pandas raises a ``ValueError``:
34+
35+
.. code-block:: python
36+
37+
>>> if pd.Series([False, True, False]):
38+
...     print("I was true")
39+
Traceback
40+
...
41+
ValueError: The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().
42+
43+
44+
If you see that, you need to explicitly choose what you want to do with it (e.g., use ``any()``, ``all()`` or ``empty``).
45+
Or you might want to test whether the pandas object is ``None``:
46+
47+
.. code-block:: python
48+
49+
>>> if pd.Series([False, True, False]) is not None:
50+
...     print("I was not None")
51+
I was not None
52+
53+
Bitwise boolean
54+
~~~~~~~~~~~~~~~
55+
56+
Comparison operators like ``==`` and ``!=`` return a boolean ``Series``,
57+
which is almost always what you want anyway.
58+
59+
.. code-block:: python
60+
61+
>>> s = pd.Series(range(5))
62+
>>> s == 4
63+
0 False
64+
1 False
65+
2 False
66+
3 False
67+
4 True
68+
dtype: bool
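
Because ``and``, ``or`` and ``not`` ask for the truth value of a whole object, boolean ``Series`` are usually combined element-wise with ``&``, ``|`` and ``~`` instead. The sketch below illustrates this under the assumption that pandas is importable as ``pd``; the variable names are made up for the example.

```python
import pandas as pd

s = pd.Series(range(5))

# Each comparison yields a boolean Series of the same length as s.
is_small = s < 2
is_four = s == 4

# Element-wise combinations; the result is again a boolean Series.
either = is_small | is_four
both = is_small & is_four
neither = ~either

# The keyword operator needs the truth value of is_small, which raises.
try:
    is_small or is_four
    raised = False
except ValueError:
    raised = True
```

When writing these inline, wrap each comparison in parentheses, for example ``(s < 2) | (s == 4)``, since ``|`` and ``&`` bind more tightly than the comparison operators.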
69+
1870
``NaN``, Integer ``NA`` values and ``NA`` type promotions
1971
---------------------------------------------------------
2072

@@ -428,7 +480,7 @@ parse HTML tables in the top-level pandas io function ``read_html``.
428480
lxml will work correctly:
429481

430482
.. code-block:: sh
431-
483+
432484
# remove the included version
433485
conda remove lxml
434486

pandas/core/generic.py

+1-1
@@ -531,7 +531,7 @@ def empty(self):
531531
return not all(len(self._get_axis(a)) > 0 for a in self._AXIS_ORDERS)
532532

533533
def __nonzero__(self):
534-
raise ValueError("The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()")
534+
raise ValueError("The truth value of an array is ambiguous. Use a.empty, a.any() or a.all().")
535535

536536
__bool__ = __nonzero__
537537
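
To illustrate the mechanism this change relies on, here is a minimal sketch using a hypothetical ``AmbiguousContainer`` class. It is not pandas code; it only mimics the pattern of raising from ``__bool__`` (``__nonzero__`` on Python 2) so that ``if obj:`` fails loudly and the caller has to pick ``empty``, ``any()`` or ``all()``.

```python
class AmbiguousContainer:
    """Hypothetical stand-in for a pandas object (illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def empty(self):
        # True when there is nothing to test at all.
        return len(self.values) == 0

    def any(self):
        return any(self.values)

    def all(self):
        return all(self.values)

    def __bool__(self):
        # Refuse to guess which of the reductions the caller meant.
        raise ValueError(
            "The truth value of an array is ambiguous. "
            "Use a.empty, a.any() or a.all()."
        )

    # Python 2 looks up __nonzero__ instead of __bool__.
    __nonzero__ = __bool__


box = AmbiguousContainer([False, True, False])

try:
    if box:
        pass
    message = None
except ValueError as exc:
    message = str(exc)
```

After this runs, ``message`` holds the error text, while ``box.any()``, ``box.all()`` and ``box.empty`` each give an unambiguous answer.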
