Skip to content

rolling with datetimeIndex is not selecting anything when the row value is NaN itself #17049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
staftermath opened this issue Jul 21, 2017 · 2 comments
Labels
Bug Datetime Datetime data dtype Duplicate Report Duplicate issue or pull request Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@staftermath
Copy link
Contributor

staftermath commented Jul 21, 2017

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
import datetime

df = pd.DataFrame({'a': [1, np.nan, 5],
                   'datetime':pd.to_datetime(['20170403', '20170404', '20170405'])})
df.set_index('datetime', inplace=True)
df.rolling('2d', 1).apply(np.nanmin)

Problem description

the above generates

              a
datetime       
2017-04-03  1.0
2017-04-04  NaN
2017-04-05  5.0

when use datetimeIndex, if a row has NaN value it is not fed to input of apply when creating output for that row. To illustrate this, I use:

def foo(x):
    print(x)
    return np.nanmin(x)
print(df.rolling('2d', 1).apply(foo))

and got printed value:

[ 1.]
[ nan   5.]

the second row is skipped.
This issue doesn't not exist when integer index is used:

df = pd.DataFrame({'a': [1, np.nan, 5]})
def foo(x):
    print(x)
    return np.nanmin(x)
print(df.rolling(2, 1).apply(foo))

print the following:

[ 1.]
[  1.  nan]
[ nan   5.]

and yields correct results

Expected Output

              a
datetime       
2017-04-03  1.0
2017-04-04  1.0
2017-04-05  5.0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 2.8.5
pip: 8.1.2
setuptools: 25.1.6
Cython: 0.23.4
numpy: 1.13.1
scipy: 0.17.0
xarray: None
IPython: 5.1.0
sphinx: 1.4.1
patsy: 0.4.0
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gfyoung gfyoung added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jul 21, 2017
@gfyoung
Copy link
Member

gfyoung commented Jul 21, 2017

@staftermath : That does look a little strange. Investigation and PR to patch are welcome!

@jreback jreback added Duplicate Report Duplicate issue or pull request Reshaping Concat, Merge/Join, Stack/Unstack, Explode Datetime Datetime data dtype labels Jul 21, 2017
@jreback jreback added this to the No action milestone Jul 21, 2017
@jreback
Copy link
Contributor

jreback commented Jul 21, 2017

duplicate of this: #15305

fixes are welcome.

@jreback jreback closed this as completed Jul 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Duplicate Report Duplicate issue or pull request Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

No branches or pull requests

3 participants