Skip to content

BUG: (linear) interpolation after resampling #18189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Make42 opened this issue Nov 9, 2017 · 4 comments
Closed

BUG: (linear) interpolation after resampling #18189

Make42 opened this issue Nov 9, 2017 · 4 comments
Labels
Bug Resample resample method

Comments

@Make42
Copy link

Make42 commented Nov 9, 2017

When running

    import pandas as pd
    index = pd.date_range('1/1/2000', periods=9, freq='0.9S')
    series = pd.Series(range(9), index=index)
    
    >>> series
    2000-01-01 00:00:00.000    0
    2000-01-01 00:00:00.900    1
    2000-01-01 00:00:01.800    2
    2000-01-01 00:00:02.700    3
    2000-01-01 00:00:03.600    4
    2000-01-01 00:00:04.500    5
    2000-01-01 00:00:05.400    6
    2000-01-01 00:00:06.300    7
    2000-01-01 00:00:07.200    8
    Freq: 900L, dtype: int64

I get

    >>> series.resample(rule='0.5S').head(100)
    2000-01-01 00:00:00.000    0.0
    2000-01-01 00:00:00.500    1.0
    2000-01-01 00:00:01.000    NaN
    2000-01-01 00:00:01.500    2.0
    2000-01-01 00:00:02.000    NaN
    2000-01-01 00:00:02.500    3.0
    2000-01-01 00:00:03.000    NaN
    2000-01-01 00:00:03.500    4.0
    2000-01-01 00:00:04.000    NaN
    2000-01-01 00:00:04.500    5.0
    2000-01-01 00:00:05.000    6.0
    2000-01-01 00:00:05.500    NaN
    2000-01-01 00:00:06.000    7.0
    2000-01-01 00:00:06.500    NaN
    2000-01-01 00:00:07.000    8.0
    Freq: 500L, dtype: float64

However I do not expect to get

    >>> series.resample(rule='0.5S').interpolate(method='linear')
    2000-01-01 00:00:00.000    0.000000
    2000-01-01 00:00:00.500    0.555556
    2000-01-01 00:00:01.000    1.111111
    2000-01-01 00:00:01.500    1.666667
    2000-01-01 00:00:02.000    2.222222
    2000-01-01 00:00:02.500    2.777778
    2000-01-01 00:00:03.000    3.333333
    2000-01-01 00:00:03.500    3.888889
    2000-01-01 00:00:04.000    4.444444
    2000-01-01 00:00:04.500    5.000000
    2000-01-01 00:00:05.000    5.000000
    2000-01-01 00:00:05.500    5.000000
    2000-01-01 00:00:06.000    5.000000
    2000-01-01 00:00:06.500    5.000000
    2000-01-01 00:00:07.000    5.000000
    Freq: 500L, dtype: float64

instead I expect the last value to be still 8.0 and still 7.0 for the timestamp with 6.5 seconds.

I posted this on https://stackoverflow.com/q/46728152/4533188 and I was told this might be a bug - I got the same impression.

>>> pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-129-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.2.2.post20170724
Cython: 0.26
numpy: 1.13.3
scipy: 0.19.1
xarray: None
IPython: None
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.9.8
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
@AbdealiLoKo
Copy link
Contributor

With 0.21.0 I get the right result:

In [8]: series.resample(rule='0.5S').head(20).interpolate(method='linear')
Out[8]: 
2000-01-01 00:00:00.000    0.0
2000-01-01 00:00:00.500    1.0
2000-01-01 00:00:01.000    1.5
2000-01-01 00:00:01.500    2.0
2000-01-01 00:00:02.000    2.5
2000-01-01 00:00:02.500    3.0
2000-01-01 00:00:03.000    3.5
2000-01-01 00:00:03.500    4.0
2000-01-01 00:00:04.000    4.5
2000-01-01 00:00:04.500    5.0
2000-01-01 00:00:05.000    6.0
2000-01-01 00:00:05.500    6.5
2000-01-01 00:00:06.000    7.0
2000-01-01 00:00:06.500    7.5
2000-01-01 00:00:07.000    8.0
Freq: 500L, dtype: float64

@TomAugspurger
Copy link
Contributor

I think the idea is to avoid the need for .head.

IIRC, there's a similar issue about this, though I may be wrong. @Make42 could you search around for one?

@Make42
Copy link
Author

Make42 commented Nov 10, 2017

@TomAugspurger: I went through all issues with the search term "interpolate". I am not sure which one you mean. Could it be #16381 or #14297 ?

@gfyoung gfyoung added the Resample resample method label Nov 10, 2017
@mroeschke mroeschke added the Bug label May 11, 2020
@mroeschke
Copy link
Member

I think #14297 is a duplicate issue as this one. Going to close this issue in favor of that one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Resample resample method
Projects
None yet
Development

No branches or pull requests

5 participants