rolling with `datetimeIndex` is not selecting anything when the row value is NaN itself #17049

staftermath · 2017-07-21T21:18:31Z

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Three consecutive days; the middle value is missing
df = pd.DataFrame({'a': [1, np.nan, 5],
                   'datetime': pd.to_datetime(['20170403', '20170404', '20170405'])})
df.set_index('datetime', inplace=True)
# 2-day time-based window with min_periods=1
df.rolling('2d', 1).apply(np.nanmin)

Problem description

The above generates:

              a
datetime       
2017-04-03  1.0
2017-04-04  NaN
2017-04-05  5.0

When a `DatetimeIndex` is used, a row whose own value is NaN is not passed to the function given to `apply` when the output for that row is computed. To illustrate this, I used:

def foo(x):
    print(x)
    return np.nanmin(x)
print(df.rolling('2d', 1).apply(foo))

and got the following printed values:

[ 1.]
[ nan   5.]

The window for the second row is skipped.
This issue does not exist when an integer index is used:

df = pd.DataFrame({'a': [1, np.nan, 5]})
def foo(x):
    print(x)
    return np.nanmin(x)
print(df.rolling(2, 1).apply(foo))

This prints the following:

[ 1.]
[  1.  nan]
[ nan   5.]

and yields the correct results.

Expected Output

              a
datetime       
2017-04-03  1.0
2017-04-04  1.0
2017-04-05  5.0
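
For reference, the expected column above can be reproduced without `rolling` by selecting each time window by hand. This is only a sketch: the helper name `manual_window_nanmin` is made up for illustration, and it assumes the window is right-closed, i.e. (t - 2 days, t], which is my understanding of the default for offset-based windows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, np.nan, 5]},
    index=pd.to_datetime(["20170403", "20170404", "20170405"]),
)

def manual_window_nanmin(series, offset):
    """Hypothetical helper: NaN-aware min over (t - offset, t] for each t."""
    out = []
    for ts in series.index:
        # Assumed right-closed window: strictly after ts - offset, up to and including ts
        mask = (series.index > ts - offset) & (series.index <= ts)
        values = series[mask].to_numpy()
        # Return NaN for an all-NaN window instead of letting np.nanmin warn
        out.append(np.nan if np.isnan(values).all() else np.nanmin(values))
    return pd.Series(out, index=series.index, name=series.name)

expected = manual_window_nanmin(df["a"], pd.Timedelta(days=2))
print(expected)
```

The window for 2017-04-04 contains both the 1.0 from the previous day and the NaN of the row itself, so a NaN-aware minimum over it is 1.0, which is the value I would expect `rolling('2d', 1).apply(np.nanmin)` to produce.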

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 2.8.5
pip: 8.1.2
setuptools: 25.1.6
Cython: 0.23.4
numpy: 1.13.1
scipy: 0.17.0
xarray: None
IPython: 5.1.0
sphinx: 1.4.1
patsy: 0.4.0
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

gfyoung · 2017-07-21T21:35:07Z

@staftermath : That does look a little strange. Investigation and PR to patch are welcome!

jreback · 2017-07-21T23:41:37Z

duplicate of this: #15305

fixes are welcome.

gfyoung added the Bug and Missing-data labels Jul 21, 2017

jreback added the Duplicate Report, Reshaping, and Datetime labels Jul 21, 2017

jreback added this to the No action milestone Jul 21, 2017

jreback closed this as completed Jul 21, 2017
