Commit ce61b3f

rkern authored and jreback committed
ENH: Fine-grained errstate handling
closes #13109
closes #13135

The precise strategy to be taken here is open for discussion. I tried to be reasonably fine-grained rather than slap a generic decorator over everything because it's easier to go that direction than the reverse.

The `errstate()` blocks in the tests were added *after* fixing all of the library code. Unfortunately, these are less fine-grained than I would like because some of the tests have many lines of the form `assert_array_equal(pandas_expression_to_test, expected_raw_numpy_expression)` where `expected_raw_numpy_expression` is what is triggering the warning. It was tedious to try to rewrite all of that to wrap just `expected_raw_numpy_expression`.

I think I got everything exercised by the test suite except for parts of the test suite that are skipped on my machine due to dependencies. We'll see how things go in the CI.

I haven't added any new tests yet. Could do if requested.

Author: Robert Kern <[email protected]>
Author: Robert Kern <[email protected]>

Closes #13145 from rkern/fix/errstate and squashes the following commits:

ef9c001 [Robert Kern] BUG: whoops, wrong function.
7fd2e86 [Robert Kern] ENH: More whatsnew documentation.
44805db [Robert Kern] ENH: Rearrange expression to avoid generating a warning that would need to be silenced.
1fe1bc2 [Robert Kern] pep8
bf1f662 [Robert Kern] BUG: New fixes after master rebase.
e7adc03 [Robert Kern] BUG: wrong function.
a59cfa7 [Robert Kern] ENH: Avoiding the bounds error is better than silencing the warning.
0e1ea81 [Robert Kern] BUG: A few more stragglers.
863ac93 [Robert Kern] TST: Add a new test to ensure that boolean comparisons are errstate-protected.
6932851 [Robert Kern] TST: Basic check that the global errstate remains unchanged.
c9df7b3 [Robert Kern] BUG: removed debugging print
3b12f08 [Robert Kern] ENH: Silence numpy warnings from certain expressions computed during tests.
eca512c [Robert Kern] BUG: Handle NaT explicitly.
6fbc9ce [Robert Kern] BUG: First pass at fine-grained errstate.
1 parent 51b20de commit ce61b3f

35 files changed, +449 -314 lines changed

doc/source/whatsnew/v0.19.0.txt

+13

@@ -7,6 +7,10 @@ This is a major release from 0.18.1 and includes a small number of API changes,
 enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
 users upgrade to this version.

+.. warning::
+
+   pandas >= 0.19.0 will no longer silence numpy ufunc warnings upon import, see :ref:`here <whatsnew_0190.errstate>`. (:issue:`13109`, :issue:`13145`)
+
 Highlights include:

 - :func:`merge_asof` for asof-style time-series joining, see :ref:`here <whatsnew_0190.enhancements.asof_merge>`
@@ -357,6 +361,15 @@ Google BigQuery Enhancements
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 - The :func:`pandas.io.gbq.read_gbq` method has gained the ``dialect`` argument to allow users to specify whether to use BigQuery's legacy SQL or BigQuery's standard SQL. See the :ref:`docs <io.bigquery_reader>` for more details (:issue:`13615`).

+.. _whatsnew_0190.errstate:
+
+Fine-grained numpy errstate
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Previous versions of pandas would permanently silence numpy's ufunc error handling when ``pandas`` was imported (:issue:`13109`). Pandas did this in order to silence the warnings that would arise from using numpy ufuncs on missing data, which are usually represented as NaNs. Unfortunately, this silenced legitimate warnings arising in non-pandas code in the application. Starting with 0.19.0, pandas will use the ``numpy.errstate`` context manager to silence these warnings in a more fine-grained manner only around where these operations are actually used in the pandas codebase.
+
+After upgrading pandas, you may see "new" ``RuntimeWarnings`` being issued from your code. These are likely legitimate, and the underlying cause likely existed in the code when using previous versions of pandas that simply silenced the warning. Use `numpy.errstate <http://docs.scipy.org/doc/numpy/reference/generated/numpy.errstate.html>`__ around the source of the ``RuntimeWarning`` to control how these conditions are handled.
+
 .. _whatsnew_0190.enhancements.other:

 Other enhancements
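
As a rough illustration of the advice in the whatsnew entry above, the sketch below wraps a single NumPy expression in `numpy.errstate`. The helper name `safe_ratio` and the sample arrays are made up for this example and are not part of the commit.

```python
import warnings

import numpy as np


def safe_ratio(numer, denom):
    """Divide two arrays, silencing divide/invalid warnings locally.

    Hypothetical helper for illustration only; not part of pandas.
    """
    with np.errstate(divide='ignore', invalid='ignore'):
        return numer / denom


numer = np.array([1.0, 0.0, 2.0])
denom = np.array([0.0, 0.0, 4.0])

with warnings.catch_warnings():
    # Escalate RuntimeWarning to an exception to show that none escapes
    # from the errstate-protected division.
    warnings.simplefilter('error', RuntimeWarning)
    ratio = safe_ratio(numer, denom)

print(ratio)
```

Only the division inside `safe_ratio` is affected, so warnings from unrelated code in the same application still surface.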

pandas/compat/numpy/__init__.py

-2

@@ -5,8 +5,6 @@
 from distutils.version import LooseVersion
 from pandas.compat import string_types, string_and_binary_types

-# turn off all numpy warnings
-np.seterr(all='ignore')

 # numpy versioning
 _np_version = np.version.short_version
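
The removed `np.seterr(all='ignore')` call changed NumPy's error handling for the whole process, whereas `np.errstate` restores the previous settings on exit. A minimal sketch of that difference, using only `np.geterr`, `np.seterr` and `np.errstate` (the variable names are illustrative):

```python
import numpy as np

before = np.geterr()

# Scoped: the settings change only inside the with-block.
with np.errstate(all='ignore'):
    inside = np.geterr()

after_scoped = np.geterr()

# Global: np.seterr persists until someone explicitly restores it.
old = np.seterr(all='ignore')
after_global = np.geterr()

# Restore the saved settings so this example leaves no side effects.
np.seterr(**old)
restored = np.geterr()

print(before == after_scoped, after_global, before == restored)
```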

pandas/computation/align.py

+1 -1

@@ -95,7 +95,7 @@ def _align_core(terms):
                 term_axis_size = len(ti.axes[axis])
                 reindexer_size = len(reindexer)

-                ordm = np.log10(abs(reindexer_size - term_axis_size))
+                ordm = np.log10(max(1, abs(reindexer_size - term_axis_size)))
                 if ordm >= 1 and reindexer_size >= 10000:
                     warnings.warn('Alignment difference on axis {0} is larger '
                                   'than an order of magnitude on term {1!r}, '
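
This change avoids the warning rather than silencing it: clamping the size difference to at least 1 means `np.log10` never receives 0. The sketch below isolates that expression in a hypothetical function, `order_of_magnitude`, which does not exist in pandas.

```python
import warnings

import numpy as np


def order_of_magnitude(reindexer_size, term_axis_size):
    # Clamp to 1 so log10 never sees 0; log10(1) is 0.
    return np.log10(max(1, abs(reindexer_size - term_axis_size)))


with warnings.catch_warnings():
    # Any RuntimeWarning would raise here, so reaching the prints shows
    # the clamped expression is warning-free even for equal sizes.
    warnings.simplefilter('error', RuntimeWarning)
    same = order_of_magnitude(10000, 10000)
    far = order_of_magnitude(10000, 100)

print(same, far)
```

The subsequent `ordm >= 1` check is unaffected, since a result of 0 fails that test just as the unclamped result for equal sizes would.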

pandas/computation/expressions.py

+2 -1

@@ -59,7 +59,8 @@ def _evaluate_standard(op, op_str, a, b, raise_on_error=True, **eval_kwargs):
     """ standard evaluation """
     if _TEST_MODE:
         _store_test_result(False)
-    return op(a, b)
+    with np.errstate(all='ignore'):
+        return op(a, b)


 def _can_use_numexpr(op, op_str, a, b, dtype_check):

pandas/computation/ops.py

+2 -1

@@ -523,7 +523,8 @@ def __init__(self, func, args):

     def __call__(self, env):
         operands = [op(env) for op in self.operands]
-        return self.func.func(*operands)
+        with np.errstate(all='ignore'):
+            return self.func.func(*operands)

     def __unicode__(self):
         operands = map(str, self.operands)

pandas/computation/tests/test_eval.py

+4 -2

@@ -1613,7 +1613,8 @@ def test_unary_functions(self):
         for fn in self.unary_fns:
             expr = "{0}(a)".format(fn)
             got = self.eval(expr)
-            expect = getattr(np, fn)(a)
+            with np.errstate(all='ignore'):
+                expect = getattr(np, fn)(a)
             tm.assert_series_equal(got, expect, check_names=False)

     def test_binary_functions(self):
@@ -1624,7 +1625,8 @@ def test_binary_functions(self):
         for fn in self.binary_fns:
             expr = "{0}(a, b)".format(fn)
             got = self.eval(expr)
-            expect = getattr(np, fn)(a, b)
+            with np.errstate(all='ignore'):
+                expect = getattr(np, fn)(a, b)
             tm.assert_almost_equal(got, expect, check_names=False)

     def test_df_use_case(self):

pandas/core/frame.py

+6 -3

@@ -3810,7 +3810,8 @@ def update(self, other, join='left', overwrite=True, filter_func=None,
             this = self[col].values
             that = other[col].values
             if filter_func is not None:
-                mask = ~filter_func(this) | isnull(that)
+                with np.errstate(all='ignore'):
+                    mask = ~filter_func(this) | isnull(that)
             else:
                 if raise_conflict:
                     mask_this = notnull(that)
@@ -4105,7 +4106,8 @@ def f(x):
             return self._apply_empty_result(func, axis, reduce, *args, **kwds)

         if isinstance(f, np.ufunc):
-            results = f(self.values)
+            with np.errstate(all='ignore'):
+                results = f(self.values)
             return self._constructor(data=results, index=self.index,
                                      columns=self.columns, copy=False)
         else:
@@ -4931,7 +4933,8 @@ def f(x):
                                             "type %s not implemented." %
                                             filter_type)
                     raise_with_traceback(e)
-                result = f(data.values)
+                with np.errstate(all='ignore'):
+                    result = f(data.values)
                 labels = data._get_agg_axis(axis)
         else:
             if numeric_only:

pandas/core/groupby.py

+9 -3

@@ -678,7 +678,8 @@ def apply(self, func, *args, **kwargs):

                 @wraps(func)
                 def f(g):
-                    return func(g, *args, **kwargs)
+                    with np.errstate(all='ignore'):
+                        return func(g, *args, **kwargs)
             else:
                 raise ValueError('func must be a callable if args or '
                                  'kwargs are supplied')
@@ -4126,7 +4127,10 @@ def loop(labels, shape):
         out = stride * labels[0].astype('i8', subok=False, copy=False)

         for i in range(1, nlev):
-            stride //= shape[i]
+            if shape[i] == 0:
+                stride = 0
+            else:
+                stride //= shape[i]
             out += labels[i] * stride

         if xnull:  # exclude nulls
@@ -4365,7 +4369,9 @@ def _get_group_index_sorter(group_index, ngroups):
     count = len(group_index)
     alpha = 0.0  # taking complexities literally; there may be
     beta = 1.0  # some room for fine-tuning these parameters
-    if alpha + beta * ngroups < count * np.log(count):
+    do_groupsort = (count > 0 and ((alpha + beta * ngroups) <
+                                   (count * np.log(count))))
+    if do_groupsort:
         sorter, _ = _algos.groupsort_indexer(_ensure_int64(group_index),
                                              ngroups)
         return _ensure_platform_int(sorter)
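
The `_get_group_index_sorter` change relies on short-circuit evaluation: when `count` is 0, `np.log(count)` is never evaluated, so no warning is emitted. The sketch below pulls that condition into a standalone function, `should_groupsort`, which is a hypothetical name used only for this illustration.

```python
import warnings

import numpy as np


def should_groupsort(count, ngroups, alpha=0.0, beta=1.0):
    # `count > 0` is checked first, so np.log(0) is never computed.
    return bool(count > 0 and
                (alpha + beta * ngroups) < (count * np.log(count)))


with warnings.catch_warnings():
    # Escalate warnings to errors to show the guarded form is silent.
    warnings.simplefilter('error', RuntimeWarning)
    empty = should_groupsort(0, 0)
    many_rows = should_groupsort(1000, 10)
    many_groups = should_groupsort(10, 1000)

print(empty, many_rows, many_groups)
```

For an empty input the unguarded comparison would be made against `0 * log(0)`, which is NaN and therefore also compares false, so the guard appears to preserve the outcome while removing the warning.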

pandas/core/internals.py

+4 -2

@@ -348,7 +348,8 @@ def apply(self, func, mgr=None, **kwargs):
         """ apply the function to my values; return a block if we are not
         one
         """
-        result = func(self.values, **kwargs)
+        with np.errstate(all='ignore'):
+            result = func(self.values, **kwargs)
         if not isinstance(result, Block):
             result = self.make_block(values=_block_shape(result,
                                                          ndim=self.ndim))
@@ -1156,7 +1157,8 @@ def handle_error():

         # get the result
         try:
-            result = get_result(other)
+            with np.errstate(all='ignore'):
+                result = get_result(other)

         # if we have an invalid shape/broadcast error
         # GH4576, so raise instead of allowing to pass through

pandas/core/nanops.py

+15 -9

@@ -45,7 +45,8 @@ def _f(*args, **kwargs):
                                 'this dtype'.format(
                                     f.__name__.replace('nan', '')))
             try:
-                return f(*args, **kwargs)
+                with np.errstate(invalid='ignore'):
+                    return f(*args, **kwargs)
             except ValueError as e:
                 # we want to transform an object array
                 # ValueError message to the more typical TypeError
@@ -513,7 +514,8 @@ def nanskew(values, axis=None, skipna=True):
     m2 = _zero_out_fperr(m2)
     m3 = _zero_out_fperr(m3)

-    result = (count * (count - 1) ** 0.5 / (count - 2)) * (m3 / m2 ** 1.5)
+    with np.errstate(invalid='ignore', divide='ignore'):
+        result = (count * (count - 1) ** 0.5 / (count - 2)) * (m3 / m2 ** 1.5)

     dtype = values.dtype
     if is_float_dtype(dtype):
@@ -562,10 +564,11 @@ def nankurt(values, axis=None, skipna=True):
     m2 = adjusted2.sum(axis, dtype=np.float64)
     m4 = adjusted4.sum(axis, dtype=np.float64)

-    adj = 3 * (count - 1) ** 2 / ((count - 2) * (count - 3))
-    numer = count * (count + 1) * (count - 1) * m4
-    denom = (count - 2) * (count - 3) * m2**2
-    result = numer / denom - adj
+    with np.errstate(invalid='ignore', divide='ignore'):
+        adj = 3 * (count - 1) ** 2 / ((count - 2) * (count - 3))
+        numer = count * (count + 1) * (count - 1) * m4
+        denom = (count - 2) * (count - 3) * m2**2
+        result = numer / denom - adj

     # floating point error
     numer = _zero_out_fperr(numer)
@@ -579,7 +582,8 @@ def nankurt(values, axis=None, skipna=True):
         if denom == 0:
             return 0

-    result = numer / denom - adj
+    with np.errstate(invalid='ignore', divide='ignore'):
+        result = numer / denom - adj

     dtype = values.dtype
     if is_float_dtype(dtype):
@@ -658,7 +662,8 @@ def _maybe_null_out(result, axis, mask):

 def _zero_out_fperr(arg):
     if isinstance(arg, np.ndarray):
-        return np.where(np.abs(arg) < 1e-14, 0, arg)
+        with np.errstate(invalid='ignore'):
+            return np.where(np.abs(arg) < 1e-14, 0, arg)
     else:
         return arg.dtype.type(0) if np.abs(arg) < 1e-14 else arg

@@ -760,7 +765,8 @@ def f(x, y):
         ymask = isnull(y)
         mask = xmask | ymask

-        result = op(x, y)
+        with np.errstate(all='ignore'):
+            result = op(x, y)

         if mask.any():
             if is_bool_dtype(result):
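
Several of the `nanops` hunks above pass specific categories (`invalid`, `divide`) instead of `all`. The sketch below suggests what that narrower form does, reusing the `adj` expression from `nankurt` with made-up `count` values; the categories that are not named keep whatever setting was active before the block.

```python
import warnings

import numpy as np

# Illustrative counts; 2 and 3 make the denominator zero.
count = np.array([2.0, 3.0, 10.0])

before = np.geterr()

with warnings.catch_warnings():
    # Escalate warnings to errors to show that none is emitted.
    warnings.simplefilter('error', RuntimeWarning)
    with np.errstate(invalid='ignore', divide='ignore'):
        inside = np.geterr()
        adj = 3 * (count - 1) ** 2 / ((count - 2) * (count - 3))

# Only the named categories were changed inside the block.
print(inside['divide'], inside['invalid'])
print(inside['over'] == before['over'], inside['under'] == before['under'])
print(adj)
```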

pandas/core/ops.py

+12 -6

@@ -636,7 +636,8 @@ def na_op(x, y):

     def safe_na_op(lvalues, rvalues):
         try:
-            return na_op(lvalues, rvalues)
+            with np.errstate(all='ignore'):
+                return na_op(lvalues, rvalues)
         except Exception:
             if isinstance(rvalues, ABCSeries):
                 if is_object_dtype(rvalues):
@@ -743,7 +744,8 @@ def na_op(x, y):
                 x = x.view('i8')

             try:
-                result = getattr(x, name)(y)
+                with np.errstate(all='ignore'):
+                    result = getattr(x, name)(y)
                 if result is NotImplemented:
                     raise TypeError("invalid type comparison")
             except AttributeError:
@@ -796,13 +798,15 @@ def wrapper(self, other, axis=None):
             # which would then not take categories ordering into account
             # we can go directly to op, as the na_op would just test again and
             # dispatch to it.
-            res = op(self.values, other)
+            with np.errstate(all='ignore'):
+                res = op(self.values, other)
         else:
             values = self.get_values()
             if isinstance(other, (list, np.ndarray)):
                 other = np.asarray(other)

-            res = na_op(values, other)
+            with np.errstate(all='ignore'):
+                res = na_op(values, other)
             if isscalar(res):
                 raise TypeError('Could not compare %s type with Series' %
                                 type(other))
@@ -1096,13 +1100,15 @@ def na_op(x, y):
                 xrav = xrav[mask]
                 yrav = yrav[mask]
                 if np.prod(xrav.shape) and np.prod(yrav.shape):
-                    result[mask] = op(xrav, yrav)
+                    with np.errstate(all='ignore'):
+                        result[mask] = op(xrav, yrav)
             elif hasattr(x, 'size'):
                 result = np.empty(x.size, dtype=x.dtype)
                 mask = notnull(xrav)
                 xrav = xrav[mask]
                 if np.prod(xrav.shape):
-                    result[mask] = op(xrav, y)
+                    with np.errstate(all='ignore'):
+                        result[mask] = op(xrav, y)
             else:
                 raise TypeError("cannot perform operation {op} between "
                                 "objects of type {x} and {y}".format(

pandas/core/panel.py

+17 -12

@@ -713,7 +713,8 @@ def _combine(self, other, func, axis=0):
                                       (str(type(other)), str(type(self))))

     def _combine_const(self, other, func):
-        new_values = func(self.values, other)
+        with np.errstate(all='ignore'):
+            new_values = func(self.values, other)
         d = self._construct_axes_dict()
         return self._constructor(new_values, **d)

@@ -723,14 +724,15 @@ def _combine_frame(self, other, func, axis=0):

         other = other.reindex(index=index, columns=columns)

-        if axis == 0:
-            new_values = func(self.values, other.values)
-        elif axis == 1:
-            new_values = func(self.values.swapaxes(0, 1), other.values.T)
-            new_values = new_values.swapaxes(0, 1)
-        elif axis == 2:
-            new_values = func(self.values.swapaxes(0, 2), other.values)
-            new_values = new_values.swapaxes(0, 2)
+        with np.errstate(all='ignore'):
+            if axis == 0:
+                new_values = func(self.values, other.values)
+            elif axis == 1:
+                new_values = func(self.values.swapaxes(0, 1), other.values.T)
+                new_values = new_values.swapaxes(0, 1)
+            elif axis == 2:
+                new_values = func(self.values.swapaxes(0, 2), other.values)
+                new_values = new_values.swapaxes(0, 2)

         return self._constructor(new_values, self.items, self.major_axis,
                                  self.minor_axis)
@@ -744,7 +746,8 @@ def _combine_panel(self, other, func):
         this = self.reindex(items=items, major=major, minor=minor)
         other = other.reindex(items=items, major=major, minor=minor)

-        result_values = func(this.values, other.values)
+        with np.errstate(all='ignore'):
+            result_values = func(this.values, other.values)

         return self._constructor(result_values, items, major, minor)

@@ -1011,7 +1014,8 @@ def apply(self, func, axis='major', **kwargs):
         # try ufunc like
         if isinstance(f, np.ufunc):
             try:
-                result = np.apply_along_axis(func, axis, self.values)
+                with np.errstate(all='ignore'):
+                    result = np.apply_along_axis(func, axis, self.values)
                 return self._wrap_result(result, axis=axis)
             except (AttributeError):
                 pass
@@ -1113,7 +1117,8 @@ def _reduce(self, op, name, axis=0, skipna=True, numeric_only=None,
         axis_number = self._get_axis_number(axis_name)
         f = lambda x: op(x, axis=axis_number, skipna=skipna, **kwds)

-        result = f(self.values)
+        with np.errstate(all='ignore'):
+            result = f(self.values)

         axes = self._get_plane_axes(axis_name)
         if result.ndim == 2 and axis_name != self._info_axis_name:
