Failing interpolation test #5174

cpcloud · 2013-10-10T16:49:41Z

$ nosetests pandas/tests/test_generic.py:TestSeries.test_interp_quad
F
======================================================================
FAIL: test_interp_quad (pandas.tests.test_generic.TestSeries)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/phillip/Documents/code/py/pandas/pandas/tests/test_generic.py", line 339, in test_interp_quad
    assert_series_equal(result, expected)
  File "/home/phillip/Documents/code/py/pandas/pandas/util/testing.py", line 452, in assert_series_equal
    assert_attr_equal('dtype', left, right)
  File "/home/phillip/Documents/code/py/pandas/pandas/util/testing.py", line 369, in assert_attr_equal
    assert_equal(left_attr,right_attr,"attr is not equal [{0}]" .format(attr))
  File "/home/phillip/Documents/code/py/pandas/pandas/util/testing.py", line 354, in assert_equal
    assert a == b, "%s: %r != %r" % (msg.format(a,b), a, b)
AssertionError: attr is not equal [dtype]: dtype('int64') != dtype('float64')

----------------------------------------------------------------------
Ran 1 test in 0.041s

FAILED (failures=1)

cpcloud · 2013-10-10T16:49:51Z

cc @TomAugspurger

TomAugspurger · 2013-10-10T17:10:45Z

Does this make any sense? A float block with array of items with int dtype?

ipdb> self
SingleBlockManager
Items: Int64Index([1, 2, 3, 4], dtype=int64)
FloatBlock: 4 dtype: float64

I'm in core.internals.apply

jreback · 2013-10-10T17:11:20Z

items are the 'index'...so that is right

jreback · 2013-10-10T17:13:54Z

@cpcloud where do you see this failing? I can't repro on 64 or 32-bit

jreback · 2013-10-10T17:14:22Z

spoke too soon!

jreback · 2013-10-10T17:16:38Z

@TomAugspurger that test just needs the expected to be int64; otherwise it looks fine. As an FYI, we may need
some tests that don't infer dtypes (e.g. set downcast=False to turn off inference of the results).

TomAugspurger · 2013-10-10T17:22:42Z

So are you saying to change expected to expected = Series([1, 4, 9, 16], index=[1, 2, 3, 4]) (int type)? Because that fails for me. I'm trying to figure out why the result [1., 4., 9., 16.] doesn't get downcast for me right now.

jreback · 2013-10-10T17:22:54Z

@TomAugspurger you may also want to put some more creative logic in there for inference. Since we know that only float/int will be coming in, you could always infer the ints, so that you get ints where possible and the floats stay floats.

jreback · 2013-10-10T17:28:36Z

@TomAugspurger the result IS downcast (to int64), it's the expected that is float64

TomAugspurger · 2013-10-10T17:33:16Z

> @TomAugspurger the result IS downcast (to int64), its the expected that is float64

Not for me:

In [5]: result = sq.interpolate(method='quadratic')
In [6]: result
Out[6]: 
1     1
2     4
3     9
4    16
dtype: float64

Can you clear this up for me? I think this is where things aren't going the same way. b is the float block with the nan interpolated.

ipdb> !b
FloatBlock: 4 dtype: float64
ipdb> !b.values
array([  1.,   4.,   9.,  16.])
ipdb> !b.downcast(downcast)[0].values  # should be ints?
array([  1.,   4.,   9.,  16.])
ipdb> downcast
'infer'

That's in pandas/core/internals.py(337)_maybe_downcast()

I'll dig a bit deeper.

TomAugspurger · 2013-10-10T17:44:48Z

umm... in /pandas/core/common.py(1064)_possibly_downcast_to_dtype():

ipdb> result
array([  1.,   4.,   9.,  16.])
ipdb> result.astype(dtype)
array([ 1,  4,  8, 16])
ipdb> dtype
dtype('int64')
ipdb>

but back in the interpreter:

In [10]: a.astype(np.int64)
Out[10]: array([ 1,  4,  9, 16])

In [11]: a = np.array([1., 4., 9., 16.])

In [12]: a.astype(np.int64)
Out[12]: array([ 1,  4,  9, 16])

jreback · 2013-10-10T17:45:44Z

This is a precision issue

array([  1.,   4.,   9.,  16.])
(Pdb) p result[0]
1.0
(Pdb) p result[1]
4.0
(Pdb) p result[2]
9.0000000000000036
(Pdb) p result[3]
16.0

thus this array is NOT equal to array([1,4,9,16])

so it should not be downcast (though you can make a case that it is close 'enough' to be)....

(Pdb) result == new_result
array([ True,  True, False,  True], dtype=bool)
(Pdb) result.round(8) == new_result
array([ True,  True,  True,  True], dtype=bool)

should we round when trying to downcast to int?

jreback · 2013-10-10T17:49:53Z

I think I should just do allclose with the default tolerances (1e-5,1e-8).....
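A minimal sketch of what such a check could look like, using the values from the pdb session above. The helper name maybe_downcast_to_int is made up for illustration; this is not the pandas implementation, only the idea of casting and then accepting the cast when np.allclose agrees at its default tolerances.

```python
import numpy as np

def maybe_downcast_to_int(values):
    # Hypothetical helper for illustration, not the pandas code path.
    candidate = values.astype(np.int64)
    # Keep the integer version only if it matches the floats within
    # np.allclose's default tolerances (rtol=1e-5, atol=1e-8).
    if np.allclose(candidate, values):
        return candidate
    return values

result = np.array([1.0, 4.0, 9.0000000000000036, 16.0])

# Exact comparison rejects the cast because of the third element...
print((result == result.astype(np.int64)).all())
# ...while the allclose-based check accepts it.
print(maybe_downcast_to_int(result).dtype)
```

With exact equality the third element blocks the downcast; with the tolerance check the array comes back as int64, and an array such as [1.5, 2.0] is still left as float64.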

TomAugspurger · 2013-10-10T18:00:17Z

Fair enough. And users can override that with s.interpolate(…, infer=False) right? Where would the necessary changes need to be made?

TomAugspurger · 2013-10-10T18:01:24Z

Or were you saying allclose for the test?

jreback · 2013-10-10T18:05:18Z

yes...they can specify infer=False to turn off downcasting; I am going to put up a PR to basically use allclose to figure out if the values are downcastable, so going to change your test.

jreback · 2013-10-10T18:59:00Z

see #5177 I think that should do it

jreback · 2013-10-10T19:54:54Z

@TomAugspurger see if you think you need tests with infer=False (you may not)....

yarikoptic · 2013-10-28T15:32:54Z

Just hit this on v0.12.0-993-gda89834:

======================================================================
FAIL: test_interp_quad (pandas.tests.test_generic.TestSeries)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/pkg-exppsy/pandas/pandas/tests/test_generic.py", line 483, in test_interp_quad
    assert_series_equal(result, expected)
  File "/home/yoh/deb/gits/pkg-exppsy/pandas/pandas/util/testing.py", line 416, in assert_series_equal
    assert_attr_equal('dtype', left, right)
  File "/home/yoh/deb/gits/pkg-exppsy/pandas/pandas/util/testing.py", line 399, in assert_attr_equal
    assert_equal(left_attr,right_attr,"attr is not equal [{0}]" .format(attr))
  File "/home/yoh/deb/gits/pkg-exppsy/pandas/pandas/util/testing.py", line 382, in assert_equal
    assert a == b, "%s: %r != %r" % (msg.format(a,b), a, b)
AssertionError: attr is not equal [dtype]: dtype('float64') != dtype('int64')

jreback · 2013-10-28T15:35:05Z

@yarikoptic can you show ci/print_versions?

yarikoptic · 2013-10-28T15:40:51Z

$> ci/print_versions.py 

INSTALLED VERSIONS
------------------
Python: 2.7.5.final.0
OS: Linux 3.9-1-amd64 #1 SMP Debian 3.9.8-1 x86_64
byteorder: little
LC_ALL: None
LANG: en_US

pandas: 0.12.0.dev-09e62f5
Cython: 0.19.1
Numpy: 1.7.1
Scipy: 0.12.0
statsmodels: 0.6.0.dev-d11bf99
    patsy: 0.1.0+dev
scikits.timeseries: Not installed
dateutil: 1.5
pytz: 2012c
bottleneck: Not installed
PyTables: 2.4.0
    numexpr: 2.0.1
matplotlib: 1.3.1
openpyxl: 1.6.1
xlrd: 0.9.2
xlwt: 0.7.4
xlsxwriter: Not installed
sqlalchemy: 0.8.2
lxml: 3.2.0
bs4: 4.2.1
html5lib: 0.95-dev
bigquery: Not installed
apiclient: 1.2

jtratner · 2013-10-28T15:54:58Z

I've seen this consistently on OSX for the last week and a half as well.

jreback · 2013-10-28T15:55:16Z

what kind of machine is this/linux kernel?

jreback · 2013-10-28T15:58:00Z

@jtratner can you see what this does?

In [11]: np.allclose(np.array([9.0005]),np.array([9.]))
Out[11]: False

In [12]: np.allclose(np.array([9.00005]),np.array([9.]))
Out[12]: True

maybe need to put an argument there
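For reference, np.allclose documents its elementwise test as |a - b| <= atol + rtol * |b|, with defaults rtol=1e-5 and atol=1e-8, so near 9 the allowed gap is roughly 9e-5. A small sketch of that arithmetic (the helper name within_default_tol is made up for illustration):

```python
import numpy as np

def within_default_tol(a, b, rtol=1e-5, atol=1e-8):
    # Same elementwise rule np.allclose documents: |a - b| <= atol + rtol * |b|
    return abs(a - b) <= atol + rtol * abs(b)

# Near 9 the allowed gap is 1e-8 + 1e-5 * 9, i.e. roughly 9e-5.
print(within_default_tol(9.0005, 9.0))   # gap of about 5e-4
print(within_default_tol(9.00005, 9.0))  # gap of about 5e-5

# The hand-rolled rule should agree with np.allclose itself.
print(np.allclose(np.array([9.0005]), np.array([9.0])))
print(np.allclose(np.array([9.00005]), np.array([9.0])))
```

If the two calls disagree across machines, passing rtol and atol explicitly would be the "argument" to try.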

jtratner · 2013-10-28T16:03:23Z

yes, won't have access until tonight.

TomAugspurger · 2013-10-28T16:15:36Z

@jreback

In [1]: np.allclose(np.array([9.0005]),np.array([9.]))
Out[1]: False

In [2]: np.allclose(np.array([9.00005]),np.array([9.]))
Out[2]: True

But I haven't sorted out my failing scipy tests due to precision errors, so I'm not sure how reliable my results are.

jreback · 2013-10-28T16:52:37Z

@TomAugspurger are you showing this failure as well?

TomAugspurger · 2013-10-28T17:20:11Z

Yep. The way I had it written originally (with expected as a float) passed on my system.

Should I change expected to a float and set infer=False for this test?

jreback · 2013-10-28T17:30:52Z

can you step through and see why it's not coercing? (it does on my system and on travis)

put a break in com._possibly_downcast_to_dtype; it SHOULD coerce to int64 in the existing test. lmk where it returns

TomAugspurger · 2013-10-28T18:00:36Z

I think that's what I posted above: when it tried to downcast the result, the 9 got flipped to an 8.

Let me know if you were asking for something different.

jreback · 2013-10-28T18:04:16Z

@TomAugspurger ahh...I c ....that is very odd....why would numpy flip the 9 float to an 8...(and only on mac)...

I guess let's just change the test, e.g. infer=False and compare vs float....can you do a quick PR for that?

jreback · 2013-10-28T19:53:25Z

@yarikoptic the fix @TomAugspurger put in should fix this problem....pls let us know and of course any other issues

jtratner · 2013-10-28T21:03:08Z

are you sure it's not a broader downcasting problem? and if it's a numpy
issue we should give them a heads-up. In particular, should test with numpy
1.6 and also Python 3 + numpy 1.7.

jreback · 2013-10-28T21:10:50Z

see toms example from above

I think it's a numpy bug (maybe only on Mac/Linux that's similar)

it's the astype which fails on precision (I think)

jtratner · 2013-10-28T21:21:20Z

I saw that, just wasn't clear whether the repr sometimes rounds values or
something...

jtratner · 2013-10-28T21:22:25Z

Also, the error itself seems confusing, given that it's not reproducible in
the interpreter...

jreback · 2013-10-28T21:37:22Z

I think it's 8.99999995 or something that is getting astyped to 8. Maybe I should round to like 5 decimal places first, then astype.

jtratner · 2013-10-28T21:44:35Z

So astype just floors it?

jreback · 2013-10-28T21:48:40Z

prob

I will put up something that you can test with

TomAugspurger · 2013-10-28T22:02:53Z

I wasn't sure what to title this issue. You're right that it's probably a numpy
issue. But for our part it's more just a reminder to remove the fix I just put in
on the interpolation test. I'll try to test it tomorrow.

-Tom

jtratner · 2013-10-28T22:08:01Z

@jreback yep, you're right about the issue, so rounding would resolve it.

In [25]: arr = np.array([8.5, 8.6, 8.7, 8.8, 8.9999999999995])

In [26]: arr
Out[26]: array([ 8.5,  8.6,  8.7,  8.8,  9. ])

In [27]: arr.astype(int)
Out[27]: array([8, 8, 8, 8, 8])

jreback · 2013-10-28T22:10:03Z

yep, it's an easy fix (I think I had it at one point), but I took it out as I assumed astype rounded rather than floored
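A minimal sketch of the round-then-cast idea, assuming the third value is slightly below 9 as suspected above (the exact value on the affected machines was not shown in this thread). Note that astype on floats drops the fractional part, which for positive numbers looks the same as flooring.

```python
import numpy as np

# Assumed input: the interpolated 9 comes back a hair under 9.
arr = np.array([1.0, 4.0, 8.9999999999995, 16.0])

truncated = arr.astype(np.int64)          # drops the fractional part: 8.99... -> 8
rounded = np.round(arr).astype(np.int64)  # round to nearest integer first, then cast

print(truncated)
print(rounded)

# Only the rounded cast survives a closeness check against the original floats.
print(np.allclose(truncated, arr))
print(np.allclose(rounded, arr))
```

This would also explain the float64 result reported on some machines: the truncated cast fails the closeness check, so the values are left as floats.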

jreback mentioned this issue Oct 10, 2013

API: make allclose comparison on dtype downcasting (GH5174) #5177

Merged

jreback closed this as completed in #5177 Oct 10, 2013

jreback reopened this Oct 28, 2013

TomAugspurger mentioned this issue Oct 28, 2013

TST: interpolate precision inference #5362

Merged

jreback closed this as completed in #5362 Oct 28, 2013

TomAugspurger mentioned this issue Oct 28, 2013

BUG/TST: Precision on downcast #5363

Closed

jreback mentioned this issue Oct 28, 2013

BUG: downcasting is now more robust (related GH5174) #5368

Merged
