Resampling produces all NaN values for a particular dataset #9915

Closed
brownan opened this issue Apr 16, 2015 · 3 comments
Labels
Bug Resample resample method

brownan commented Apr 16, 2015

I ran into a particular dataset that causes Series.resample() to produce incorrect results. Each datapoint is almost, but not exactly, a minute apart. (This is trimmed down from a larger dataset, but it still reproduces the problem.)

>>> import pandas; from pandas import Timestamp
>>> data1 = pandas.Series(
... {
...  Timestamp('2015-03-31 21:48:52.672000'): 2,
...  Timestamp('2015-03-31 21:49:52.739000'): 1,
...  Timestamp('2015-03-31 21:50:52.806000'): 2}
... )
>>> data1
2015-03-31 21:48:52.672000    2
2015-03-31 21:49:52.739000    1
2015-03-31 21:50:52.806000    2
dtype: int64
>>> data1.resample("10S")
2015-03-31 21:48:50   NaN
2015-03-31 21:49:00   NaN
2015-03-31 21:49:10   NaN
2015-03-31 21:49:20   NaN
2015-03-31 21:49:30   NaN
2015-03-31 21:49:40   NaN
2015-03-31 21:49:50   NaN
2015-03-31 21:50:00   NaN
2015-03-31 21:50:10   NaN
2015-03-31 21:50:20   NaN
2015-03-31 21:50:30   NaN
2015-03-31 21:50:40   NaN
2015-03-31 21:50:50   NaN
Freq: 10S, dtype: float64

Expected output:

>>> data1.resample("10S")
2015-03-31 21:48:50   2
2015-03-31 21:49:00   NaN
2015-03-31 21:49:10   NaN
2015-03-31 21:49:20   NaN
2015-03-31 21:49:30   NaN
2015-03-31 21:49:40   NaN
2015-03-31 21:49:50   1
2015-03-31 21:50:00   NaN
2015-03-31 21:50:10   NaN
2015-03-31 21:50:20   NaN
2015-03-31 21:50:30   NaN
2015-03-31 21:50:40   NaN
2015-03-31 21:50:50   2
Freq: 10S, dtype: float64

Changing any one of the timestamps, or taking one of the two datapoints out, makes resample work correctly. Something about this particular dataset causes resample to fail.
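To make the expected output concrete, here is a minimal sketch of how each timestamp maps to its 10-second bin. It assumes a pandas version that provides `DatetimeIndex.floor` and accepts the lowercase `"10s"` alias; the variable names `stamps` and `bins` are illustrative only and do not come from the report.

```python
import pandas as pd

# The three timestamps from the report, each roughly one minute apart.
stamps = pd.to_datetime([
    "2015-03-31 21:48:52.672",
    "2015-03-31 21:49:52.739",
    "2015-03-31 21:50:52.806",
])

# Flooring to 10 seconds gives the left edge of the bin each point falls in,
# which matches the default left-closed, left-labelled bins for this frequency.
bins = stamps.floor("10s")

for ts, left_edge in zip(stamps, bins):
    print(ts, "->", left_edge)
```

Each point lands in a different bin (21:48:50, 21:49:50 and 21:50:50), so exactly three of the thirteen bins should be non-empty.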

I have more examples in this IPython notebook: http://nbviewer.ipython.org/gist/brownan/1000ba324ad7df917e32

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.5.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-123.20.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.0
nose: 1.3.4
Cython: None
numpy: 1.9.2
scipy: None
statsmodels: None
IPython: 3.0.0
sphinx: 1.3.1
patsy: None
dateutil: 2.4.1
pytz: 2015.2
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

edit: I can also reproduce this with the latest version on master: 0.16.0-157-g161f38d

TomAugspurger (Contributor) commented

Looks like we forgot to pass along a kwarg, or our inference for downsampling is buggy:

In [19]: data1.resample('10S', how='mean')
Out[19]: 
2015-03-31 21:48:50     2
2015-03-31 21:49:00   NaN
2015-03-31 21:49:10   NaN
2015-03-31 21:49:20   NaN
2015-03-31 21:49:30   NaN
2015-03-31 21:49:40   NaN
2015-03-31 21:49:50     1
2015-03-31 21:50:00   NaN
2015-03-31 21:50:10   NaN
2015-03-31 21:50:20   NaN
2015-03-31 21:50:30   NaN
2015-03-31 21:50:40   NaN
2015-03-31 21:50:50     2
Freq: 10S, dtype: float64

Passing how='mean' explicitly works, but that should be the default here.
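For readers on later pandas releases, where the `how` keyword was deprecated in favor of calling an aggregation method on the resampler, a rough equivalent of the explicit-mean call might look like the sketch below. It rebuilds the series from the report; the names `resampled` and `non_empty` are illustrative, and the lowercase `"10s"` alias is an assumption about the installed version.

```python
import pandas as pd

# Rebuild the three-point series from the original report.
data1 = pd.Series(
    [2, 1, 2],
    index=pd.to_datetime([
        "2015-03-31 21:48:52.672",
        "2015-03-31 21:49:52.739",
        "2015-03-31 21:50:52.806",
    ]),
)

# State the aggregation explicitly instead of relying on an inferred default.
resampled = data1.resample("10s").mean()

# Keep only the bins that actually contain a datapoint.
non_empty = resampled.dropna()
print(non_empty)
```

Naming the aggregation explicitly sidesteps any inference about whether the operation is an upsample or a downsample.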

TomAugspurger added Bug Resample resample method labels Apr 17, 2015
TomAugspurger added this to the 0.16.1 milestone Apr 17, 2015
jreback modified the milestones: 0.16.1, Next Major Release Apr 28, 2015

gliptak (Contributor) commented Apr 16, 2016

This shows the expected result when run with pandas 0.18.0+114.g6c692ae.dirty

jorisvandenbossche (Member) commented

@gliptak Do you want to do a PR to explicitly add a test for this to confirm that it is fixed and to close this issue?
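A regression test along these lines could look roughly like the following. This is only a sketch: the function name `test_resample_near_minute_spacing` is hypothetical, it uses plain assertions rather than the helpers of the pandas test suite, and it is not the test that was actually added to the project.

```python
import pandas as pd


def test_resample_near_minute_spacing():
    # Datapoints that are almost, but not exactly, one minute apart.
    data = pd.Series(
        [2, 1, 2],
        index=pd.to_datetime([
            "2015-03-31 21:48:52.672",
            "2015-03-31 21:49:52.739",
            "2015-03-31 21:50:52.806",
        ]),
    )

    result = data.resample("10s").mean()

    # Thirteen 10-second bins span 21:48:50 through 21:50:50 inclusive.
    assert len(result) == 13
    assert result.index[0] == pd.Timestamp("2015-03-31 21:48:50")
    assert result.index[-1] == pd.Timestamp("2015-03-31 21:50:50")

    # Only the first, seventh and last bins contain a datapoint.
    filled = result.notna().to_numpy().nonzero()[0].tolist()
    assert filled == [0, 6, 12]
    assert result.dropna().tolist() == [2.0, 1.0, 2.0]


test_resample_near_minute_spacing()
```

The assertions pin down both the bin range and the positions of the non-NaN values, which is what the all-NaN output in the report got wrong.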

jreback modified the milestones: 0.18.1, Next Major Release Apr 17, 2016