Rolling max and min on datetime column incorrect when NaN in window #22931
Comments
Hmm, I tried both Python 2 and Python 3 on macOS and was not able to reproduce this - @chris-b1 any chance this is a Windows-specific issue?
This may have been solved by #21853?
In response to this I created a fresh virtualenv with just the latest version of pandas and its requirements, and I was still having this issue. I'll try on a mint to see if I can replicate the issue or identify it as Windows-only.
What do you mean by latest version? Master or 0.23.4? That fix is in 0.24, which isn't released yet.
I should have mentioned that I updated the output of pd.show_versions() in the original post. I mean version 0.23.4.
If you're referring to #21853, I'm unable to verify whether it fixes the issue.
I've replicated this issue on Lubuntu. Here's the file and show_versions. It has the same output as OP:
rolling_test.py
import pandas as pd
import numpy as np
df = pd.DataFrame({'B': [0, 1, np.nan, 3, 4], 'C': [4, 3, np.nan, 1, 0],
                   'Time': [pd.Timestamp('20130101 09:00:00'),
                            pd.Timestamp('20130101 09:00:01'),
                            pd.Timestamp('20130101 09:00:02'),
                            pd.Timestamp('20130101 09:00:03'),
                            pd.Timestamp('20130101 09:00:04')]})
print('df:')
print(df)
print('max:')
print(df.rolling('4s', on='Time').max())
print('min:')
print(df.rolling('4s', on='Time').min())
pd.show_versions()
INSTALLED VERSIONS
commit: None
pandas: 0.23.4
Can you try on master? Comments above suggest this may have already been solved, so it would be good to confirm whether or not that is the case.
Issue is resolved on master (pandas: 0.24.0.dev0+671.g08ecba8da).
Code Sample, a copy-pastable example if possible
Problem description
When running a rolling max or min window on a datetime column, a NaN value seems to prevent the max or min function from considering values that follow it, even if those values are within the window.
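To make the expected behaviour concrete, here is a minimal pure-Python sketch of a NaN-skipping rolling max/min over a 4-second window. It is not pandas' implementation. The helper name `rolling_nan_skipping` is hypothetical, and the sketch assumes a right-closed window `(t - window, t]` and a result of NaN only when the window holds no valid values. The data mirrors columns B and C from the reproduction script in this thread.

```python
import math
from datetime import datetime, timedelta


def rolling_nan_skipping(times, values, window, func):
    """Apply func over the window (t - window, t], ignoring NaN values.

    Hypothetical reference helper, not part of pandas.
    """
    out = []
    for t in times:
        # Collect every non-NaN value whose timestamp falls inside the window.
        in_window = [
            v for s, v in zip(times, values)
            if t - window < s <= t and not math.isnan(v)
        ]
        # An empty window yields NaN; otherwise reduce with max or min.
        out.append(func(in_window) if in_window else math.nan)
    return out


start = datetime(2013, 1, 1, 9, 0, 0)
times = [start + timedelta(seconds=i) for i in range(5)]
b = [0, 1, math.nan, 3, 4]
c = [4, 3, math.nan, 1, 0]

expected_max_b = rolling_nan_skipping(times, b, timedelta(seconds=4), max)
expected_min_c = rolling_nan_skipping(times, c, timedelta(seconds=4), min)
print(expected_max_b)  # [0, 1, 1, 3, 4]
print(expected_min_c)  # [4, 3, 3, 1, 0]
```

Under these assumptions, the values after the NaN row (3 and 4 in B, 1 and 0 in C) still take part in the result, which is the behaviour this report expects from `df.rolling('4s', on='Time')`.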
dataframe
max() Output (column B)
Expected max() Output (column B)
min() Output (column C)
Expected min() Output (column C)
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 40.4.3
Cython: None
numpy: 1.15.2
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None