failing test_moments after centered moving window introduced #2490
Comments
Do you have the failing output handy? I'll take a look, thanks. On Tue, Dec 11, 2012 at 10:16 AM, y-p [email protected] wrote:
|
======================================================================
FAIL: test_rolling_count (pandas.stats.tests.test_moments.TestMoments)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 46, in test_rolling_count
fill_value=0)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 320, in _check_moment_func
fill_value=fill_value)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 380, in _check_ndarray
assert_almost_equal(result[1], expected[10])
File "/home/user1/src/pandas/pandas/util/testing.py", line 124, in assert_almost_equal
1, a / b, decimal=5, err_msg=err_msg(a, b), verbose=False)
File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 468, in assert_almost_equal
raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 5 decimals expected 13.00000 but got 1.00000
======================================================================
FAIL: test_rolling_mean (pandas.stats.tests.test_moments.TestMoments)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 49, in test_rolling_mean
self._check_moment_func(mom.rolling_mean, np.mean)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 320, in _check_moment_func
fill_value=fill_value)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 387, in _check_ndarray
self.assert_(np.isnan(result[14]))
AssertionError: False is not true
======================================================================
FAIL: test_rolling_min (pandas.stats.tests.test_moments.TestMoments)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 173, in test_rolling_min
self._check_moment_func(mom.rolling_min, np.min)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 320, in _check_moment_func
fill_value=fill_value)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 387, in _check_ndarray
self.assert_(np.isnan(result[14]))
AssertionError: False is not true
======================================================================
FAIL: test_rolling_std (pandas.stats.tests.test_moments.TestMoments)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 235, in test_rolling_std
lambda x: np.std(x, ddof=1))
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 320, in _check_moment_func
fill_value=fill_value)
File "/home/user1/src/pandas/pandas/stats/tests/test_moments.py", line 387, in _check_ndarray
self.assert_(np.isnan(result[14]))
AssertionError: False is not true
---------------------------------------------------------------------- |
I have no idea and I can't reproduce (it doesn't seem to have anything to do with the |
can either of you post output of updated ci/print_versions.py from here |
|
INSTALLED VERSIONS
Python: 2.7.3.final.0
have stats |
yeah, nothing there.
tried in a virtualenv, and also a fresh clone - still failing. |
When you say "sporadic", how often is it occurring? Are you running the multi-processing test suite or regular? |
here's a pdb break on the failed comparison, reached from test_rolling_median->_check_moment_func->_check_ndarray:
> /home/user1/src/pandas/pandas/stats/tests/test_moments.py(387)_check_ndarray()
-> self.assert_(np.isnan(result[14]))
(Pdb) l
382 self.assert_(np.isnan(result[-9:]).all())
383 else:
384 self.assert_((result[-9:] == 0).all())
385 if has_min_periods:
386 self.assert_(np.isnan(expected[23]))
387 -> self.assert_(np.isnan(result[14]))
388 self.assert_(np.isnan(expected[-5]))
389 self.assert_(np.isnan(result[-14]))
390
391 def _check_structures(self, func, static_comp,
392 has_min_periods=True, has_time_rule=True,
(Pdb) result
array([ nan, nan, nan, nan, 0.37457507,
0.30156993, 0.22856479, 0.30628624, 0.22856479, nan,
nan, nan, nan, 0.37457507, 0.30156993,
0.22856479, 0.30628624, 0.22856479, nan, nan,
nan, nan, 0.37457507, 0.30156993, 0.22856479,
0.30628624, 0.22856479, nan, nan, nan,
nan, 0.37457507, 0.30156993, 0.22856479, 0.30628624,
0.22856479, nan, nan, nan, nan,
nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan])
(Pdb) expected
array([ nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan,
nan, nan, nan, nan, 0.06942868,
0.05451945, 0.06942868, 0.14899674, 0.22856479, 0.30156993,
0.14899674, 0.30156993, 0.30156993, 0.30156993, 0.14899674,
0.30156993, 0.30156993, 0.37929138, 0.37929138, 0.37929138,
0.37457507, 0.30156993, 0.22856479, 0.30628624, 0.22856479,
nan, nan, nan, nan, nan]) |
Try replacing _center_window in moments.py with this:
|
no effect. |
The result looks completely mangled. Can you pdb into _center_window and see what's going on? |
rs at entry to center_window in test_rolling_apply
rs returned from center_window in test_rolling_apply
|
is this the proper behaviour for shift?
mkdf(10,1,r_idx_type='dt')
Out[10]:
C0 C_l0_g0
R0
2000-01-03 R0C0
2000-01-04 R1C0
2000-01-05 R2C0
2000-01-06 R3C0
2000-01-07 R4C0
2000-01-10 R5C0
2000-01-11 R6C0
2000-01-12 R7C0
2000-01-13 R8C0
2000-01-14 R9C0
a=mkdf(10,1,r_idx_type='dt')
a.shift()
Out[12]:
C0 C_l0_g0
R0
2000-01-03 NaN
2000-01-04 R0C0
2000-01-05 R1C0
2000-01-06 R2C0
2000-01-07 R3C0
2000-01-10 R4C0
2000-01-11 R5C0
2000-01-12 R6C0
2000-01-13 R7C0
2000-01-14 R8C0 |
yes, but _center_window implements shift for ndarrays separately. take a look at the following: rs[tuple(rs_indexer)] And also what each *_indexer is |
i think that's ok
|
nosetests bug? |
λ nosetests --version |
same. |
Ok, try replacing _center_window with the following:
|
no effect with
def _center_window(rs, window, axis):
    offset = int((window - 1) / 2.)
    if isinstance(rs, (Series, DataFrame, Panel)):
        rs = rs.shift(-offset, axis=axis)
    elif rs.ndim == 1:
        rs[:-offset] = rs[offset:]
        rs[-offset:] = np.nan
    else:
        rs_indexer = [slice(None)] * rs.ndim
        rs_indexer[axis] = slice(None, -offset)
        lead_indexer = [slice(None)] * rs.ndim
        lead_indexer[axis] = slice(offset, None)
        na_indexer = [slice(None)] * rs.ndim
        na_indexer[axis] = slice(-offset, None)
        rs[rs_indexer] = rs[lead_indexer]
        rs[na_indexer] = np.nan
    return rs
|
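For reference, a minimal sketch of what the 1-D branch above is meant to produce, written so that it never assigns a slice of an array onto an overlapping slice of the same array. The helper name `center_1d` is hypothetical and not part of pandas, and the guard for a zero offset is an addition that the snippet above does not have.

```python
import numpy as np

def center_1d(rs, window):
    """Shift a trailing-window result left by the centering offset.

    Hypothetical helper for illustration: writes into a fresh array,
    so source and destination never overlap.
    """
    offset = int((window - 1) / 2.)
    if offset == 0:
        # Nothing to shift; rs[:-0] would be an empty slice, so bail out early.
        return rs.astype(float)
    out = np.empty_like(rs, dtype=float)
    out[:-offset] = rs[offset:]   # values move towards the start
    out[-offset:] = np.nan        # the tail has no centered value
    return out

trailing = np.array([np.nan, np.nan, 1.0, 2.0, 3.0])
centered = center_1d(trailing, window=3)
print(centered)
```

With `window=3` the offset is 1, so each value moves one position towards the start and the last position becomes NaN.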
unless you've sniffed something, let's put it away for a day or two and see |
Yeah, I even installed numpy 1.6.2 just in case but could not repro On Dec 11, 2012, at 4:28 PM, y-p [email protected] wrote:
|
have you tried diagnosing again? Hate to waste more time on this, but it really bothers me that you're running into it and we can't repro. Our Windows binaries are built using a Jenkins CI box that has 2.6-3.2, both 32/64-bit, on Windows, and they all passed there... We've also tested on 64-bit OSX and Ubuntu here between my setup and Wes's. |
Yaroslav just e-mailed me having run into this also. maybe that will provide some insight |
No I haven't. |
FWIW -- now also got the "Arrays are not almost equal to 5 decimals expected 13.00000 but got 1.00000" too on amd64 wheezy. |
I could debug this pretty quickly if I can get access to a box where it fails |
yeap... let me update/check on sparc box (meanwhile you could email me your public ssh key and desired login name as the continuation of the "private" thread we already had ;) ) |
fair enough. |
weird, I just started getting consistent |
peek at /var/log/dpkg.log if you need to discover what got updated any On Thu, 13 Dec 2012, y-p wrote:
Yaroslav O. Halchenko |
nothing jumps out - no libc change that I can see. maybe |
The error has to be in |
*edit: submit button slipped, edited.*
I noticed strange phenomena such as that adding print statements altered the probability
I then modified
def _center_window(rs, window, axis):
    # print( rs,window,axis)
    offset = int((window - 1) / 2.)
    if isinstance(rs, (Series, DataFrame, Panel)):
        rs = rs.shift(-offset, axis=axis)
    else:
        rs_indexer = [slice(None) for x in range(rs.ndim)] # waldo1
        print( axis,offset)
        rs_indexer[axis] = slice(None, -offset)
        lead_indexer = [slice(None) for x in range(rs.ndim)]
        lead_indexer[axis] = slice(offset, None)
        na_indexer = [slice(None) for x in range(rs.ndim)]
        na_indexer[axis] = slice(-offset, None)
        x = rs[tuple(lead_indexer)]
        print( "x",x)
        print( "idx",rs_indexer)
        print("rs-before", rs)
        y=list(rs_indexer)
        print( "y",y)
        rs[y] = x #waldo2
        print("rs-after", rs)
        rs[tuple(na_indexer)] = np.nan
    return rs
inserted a div by zero into test_rolling_count so that the test always fails
Look at the code line marked "waldo2". I suspected that the previously used line at "waldo1", which was
was actually creating multiple refs to the same object and therefore changed it to a list comprehension so it looks like the line at "waldo2", originally: rs[tuple(rs_indexer)] = rs[tuple(lead_indexer)] is not deterministic even though its inputs are, or that the inputs only appear deterministic but there is |
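A quick side check on the "multiple refs" suspicion, as a sketch with illustrative variable names only. List multiplication does repeat one slice object, but item assignment rebinds a single list slot and slice objects are immutable, so the multiplied list and the list comprehension should end up equivalent once one axis is replaced.

```python
ndim, axis, offset = 3, 1, 2

# Built by list multiplication: every slot initially refers to one slice object.
by_mult = [slice(None)] * ndim
shared_before = by_mult[0] is by_mult[2]

# Assigning to one slot rebinds that slot only; the other slots are untouched.
by_mult[axis] = slice(None, -offset)

# Built by list comprehension, as in the modified function.
by_comp = [slice(None) for _ in range(ndim)]
by_comp[axis] = slice(None, -offset)

print(shared_before)
print(by_mult == by_comp)
```

If the two lists compare equal, the indexer construction is unlikely to be the source of the nondeterminism, which points the search at the assignment itself.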
pandas issue #2490 numpy/numpy#324 manifests only on certain distros, probably libc dependent.
debian wheezy, libc 2.13-37 is affected
ubuntu precise 2.15-0ubuntu10.2 is not
These are not necessarily the earliest/latest package revisions affected/not affected respectively.
Upgrading to 1.7.0b2 fixes the problem, this jibes with me not experiencing the issue with
I bisected things down on numpy to this commit,
The PR fixes a bug having to do with overlapping source and destination memcpy, which
The fact that it's a problem with memcpy also explains why debian wheezy shows the problem
Doesn't look like the fix was backported to 1.6.2. |
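To make the overlap concrete, here is a hedged sketch of a possible defensive workaround on the calling side: copy the source slice before writing it back into the same buffer, so the result does not depend on how the underlying copy handles overlapping memory. The function name `shift_left_inplace` is hypothetical; this is not the numpy fix referenced above, and it assumes `offset >= 1` and a float array.

```python
import numpy as np

def shift_left_inplace(rs, offset):
    """Shift rs left by offset in place, NaN-filling the tail.

    Hypothetical helper: the explicit .copy() materializes the source
    before the write, so source and destination no longer overlap.
    """
    lead = rs[offset:].copy()
    rs[:-offset] = lead
    rs[-offset:] = np.nan
    return rs

a = np.arange(10, dtype=float)
# The destination a[:-2] and the source a[2:] are views on the same buffer.
overlap = np.shares_memory(a[:-2], a[2:])
shifted = shift_left_inplace(a, 2)
print(overlap)
print(shifted)
```

The extra copy costs one temporary array of nearly the input size, which seems a reasonable trade for behaviour that does not vary with the installed libc.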
@y-p ---- so what should be done on the numpy side? |
cc @yarikoptic so he's aware that we're looking at an upstream numpy bug in 1.6.x |
@certik , backport to maintenance/1.6.x? numpy/numpy@0920bed cherry-picks |
@y-p, would you mind sending a PR for the 1.6.x? |
@y-p thanks**1e6 |
closing this bad boy if that works. thanks all |
I'm getting sporadic failures in
FAIL: test_rolling_std (pandas.stats.tests.test_moments.TestMoments)
FAIL: test_rolling_sum (pandas.stats.tests.test_moments.TestMoments)
FAIL: test_rolling_kurt (pandas.stats.tests.test_moments.TestMoments)
FAIL: test_rolling_max (pandas.stats.tests.test_moments.TestMoments)
and related
git bisected to: 915d261
I've upgraded to statsmodels '0.5.0.dev-c9062e4', did a clean/develop
but am still getting failures.
any ideas? @changhiskhan