.rolling().std() only returns NaN in Python3.7 #21786
Comments
Hmm, I can't reproduce with Python 3.7 and pandas 0.23.2.
In [1]: paste
import pandas as pd
d = {"col": [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7]}
df = pd.DataFrame(data=d)
std = df["col"].std()
df["mean5"] = df["col"].rolling(5).mean()
df["std5"] = df["col"].rolling(5).std()
print(std)
print(df[["mean5", "std5"]])
## -- End pasted text --
130.20855066648528
mean5 std5
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 98.0 121.704560
5 168.4 150.069317
6 176.2 141.384936
7 130.6 155.368272
8 95.6 146.558180
9 103.4 141.411810
10 103.6 141.853093
11 99.8 143.491115
12 145.4 141.249071
13 138.4 147.515423
14 131.2 154.068816
In [2]: pd.__version__
Out[2]: '0.23.2' |
Weird, what do we do now? Could you share your |
Perhaps someone else will be able to reproduce. A couple questions
|
I use pip. When I try to install 0.23.1 I get the error |
Probably something with your path... @chris-b1 any chance this is related to the C++ build stuff? Can you try on 3.7 with the wheels on PyPI? |
Can you link me to a pandas version 0.23.1 for python 3.7? Here I only find versions up to 3.6. |
0.23.1 didn’t support python 3.7. There aren’t any 3.7 wheels for it.
|
I can not try with 0.23.1 then because the error only happens with python 3.7 for me |
I reproduce on Windows 10 with 3.7 and the 0.23.1 wheel on PyPI, or building from source of PyPI (--no-binary pandas) or off of tagged release from cython.
At first review I'm stumped - the rolling var calc seems to only use c-math / pointers, nothing from libc[++].
https://github.com/pandas-dev/pandas/blob/8bee97a81764c5211d719d61a62424b1edfacd80/pandas/_libs/window.pyx#L746
|
Hmm, this is unfortunate.
Do we have any ideas on how to proceed?
Could someone on windows try this with the conda / conda-forge packages?
|
import pandas as pd
d = {"col": [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7]}
df = pd.DataFrame(data=d)
f = lambda x: sum(x, 1)
funcs = [
"count()",
"sum()",
"mean()",
"median()",
"var()",
"std()",
"min()",
"max()",
"corr(df['col'])",
"cov()",
"skew()",
"kurt()",
"apply(f)",
"agg({'col':'sum'})",
"agg({'col':'std'})",
"quantile(0.3)",
"ndim",
"is_datetimelike",
"is_freq_type",
]
for func in funcs:
df[func] = eval(f"df['col'].rolling(5).{func}")
print(df)
#OUTPUT
col count() sum() mean() median() var() std() min() max() corr(df['col']) cov() skew() kurt() apply(f) agg({'col':'sum'}) agg({'col':'std'}) quantile(0.3) ndim is_datetimelike is_freq_type
0 1 1.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 False False
1 23 2.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 False False
2 231 3.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 False False
3 231 4.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1 False False
4 4 5.0 490.0 98.0 23.0 NaN NaN 1.0 231.0 NaN 14812.0 0.58711909 -3.30501520 491.0 490.0 NaN 7.8 1 False False
5 353 5.0 842.0 168.4 231.0 NaN NaN 4.0 353.0 NaN 22520.8 -0.09073220 -2.16044852 843.0 842.0 NaN 64.6 1 False False
6 62 5.0 881.0 176.2 231.0 NaN NaN 4.0 353.0 NaN 19989.7 -0.10909425 -1.60438496 882.0 881.0 NaN 95.8 1 False False
7 3 5.0 653.0 130.6 62.0 NaN NaN 3.0 353.0 NaN 24139.3 0.84243331 -1.36672178 654.0 653.0 NaN 15.6 1 False False
8 56 5.0 478.0 95.6 56.0 NaN NaN 3.0 353.0 NaN 21479.3 2.03720693 4.29341424 479.0 478.0 NaN 14.4 1 False False
9 43 5.0 517.0 103.4 56.0 NaN NaN 3.0 353.0 NaN 19997.3 2.08348019 4.51654869 518.0 517.0 NaN 45.6 1 False False
10 354 5.0 518.0 103.6 56.0 NaN NaN 3.0 354.0 NaN 20122.3 2.08443877 4.51937764 519.0 518.0 NaN 45.6 1 False False
11 43 5.0 499.0 99.8 43.0 NaN NaN 3.0 354.0 NaN 20589.7 2.12508425 4.64265427 500.0 499.0 NaN 43.0 1 False False
12 231 5.0 727.0 145.4 56.0 NaN NaN 43.0 354.0 NaN 19951.3 1.01164898 -0.99425147 728.0 727.0 NaN 45.6 1 False False
13 21 5.0 692.0 138.4 43.0 NaN NaN 21.0 354.0 NaN 21760.8 0.96847264 -1.16346708 693.0 692.0 NaN 43.0 1 False False
14 7 5.0 656.0 131.2 43.0 NaN NaN 7.0 354.0 NaN 23737.2 0.92438493 -1.32408409 657.0 656.0 NaN 25.4 1 False False
std = df["col"].std()
var = df["col"].var()
corr = df["col"].corr(df["col"])
print(std, var, corr)
#OUTPUT (work as intended without the rolling window)
130.20855066648528 16954.266666666666 1.0 |
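The pattern in the table above (var, std and corr come back as NaN while the other methods work) can also be checked without eval. The following is only a sketch: the helper name all_nan_rolling_methods and the list of methods passed to it are illustrative choices, not something from the thread.

```python
import pandas as pd


def all_nan_rolling_methods(series, window, methods):
    """Return the names of rolling methods whose result is NaN everywhere.

    Hypothetical helper for illustration; it only uses public
    Series.rolling() methods that take no required arguments.
    """
    roll = series.rolling(window)
    broken = []
    for name in methods:
        result = getattr(roll, name)()
        # A healthy build has real values once the window is full,
        # so an all-NaN result points at the problem described here.
        if result.isna().all():
            broken.append(name)
    return broken


col = pd.Series(
    [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7], name="col"
)
broken = all_nan_rolling_methods(
    col, 5, ["sum", "mean", "median", "var", "std", "min", "max", "skew", "kurt"]
)
print(broken)
```

On an affected build the printed list would be expected to contain "var" and "std"; on an unaffected build it should be empty.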
Thanks @tweakimp - those both also call the same cython routine ( I've got it partially figured out - seems this branch is being (incorrectly) optimized out on the latest MSVC, works again if I throw a printf in there. Still trying to riddle out if it's a compiler bug, or some flag set wrong.
pandas/pandas/_libs/window.pyx Line 659 in 8bee97a
|
hmm, turning off the
(pandas-dev37) λ git diff setup.py
diff --git a/setup.py b/setup.py
index 8018d71b7..5535dd23f 100755
--- a/setup.py
+++ b/setup.py
@@ -455,7 +455,8 @@ def pxd(name):
if is_platform_windows():
- extra_compile_args = []
+ extra_compile_args = ['/GL-'] |
Workaround: .apply(lambda x: pd.np.std(x))
import pandas as pd
d = {"col": [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7]}
df = pd.DataFrame(data=d)
df["std5"] = df["col"].rolling(5).apply(lambda x: pd.np.std(x))
print(df["std5"])
# OUTPUT (as expected)
0 NaN
1 NaN
2 NaN
3 NaN
4 108.855868
5 134.226078
6 126.458531
7 138.965607
8 131.085621
9 126.482568
10 126.877264
11 128.342355
12 126.337010
13 131.941805
14 137.803338 |
Yep, do note that |
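The workaround output above (108.855868 at index 4) differs from the rolling std shown earlier in the thread (121.704560) because np.std defaults to ddof=0, whereas pandas' std uses ddof=1. A variant of the workaround that matches the pandas result might look like the sketch below. Two things here are choices of this example, not of the thread: it imports numpy directly instead of going through the pd.np alias (which was deprecated and removed in later pandas releases), and it passes raw=True so the function receives plain ndarrays.

```python
import numpy as np
import pandas as pd

d = {"col": [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7]}
df = pd.DataFrame(data=d)

# ddof=1 gives the sample standard deviation, which is what
# Series.std() and rolling().std() compute by default.
df["std5"] = df["col"].rolling(5).apply(lambda x: np.std(x, ddof=1), raw=True)
print(df["std5"])
```

For the first full window [1, 23, 231, 231, 4] this gives sqrt(14812), about 121.7046, which agrees with the std5 column from the unaffected run.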
can anyone suggest a similar workaround for rolling correlation between two columns of a dataframe? |
Will downgrading to msvc 2015 help my case? |
@psicktrick - if you are building from source, yes using MSVC 2015 will solve this issue, or you can also build from master, which has a fix applied. As far as I know, not an easy workaround for corr, but should be a release of 0.23.4 soon (#22128) |
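One possible stopgap for rolling correlation, offered only as a sketch and not something proposed in the thread: since rolling().mean() is reported to work on the affected builds, the Pearson correlation can be assembled from rolling means of x, y, x*y, x*x and y*y. Whether this sidesteps the bug on an affected build was not tested here. The function name rolling_corr_via_means and the second column "b" are hypothetical. The formula is numerically less stable than a dedicated routine, so treat it as a temporary measure.

```python
import numpy as np
import pandas as pd


def rolling_corr_via_means(x, y, window):
    """Rolling Pearson correlation built only from rolling means.

    Hypothetical helper: uses corr = cov / sqrt(var_x * var_y) with
    population moments; the ddof factors cancel in the ratio.
    """
    x = x.astype("float64")
    y = y.astype("float64")
    mean_x = x.rolling(window).mean()
    mean_y = y.rolling(window).mean()
    cov = (x * y).rolling(window).mean() - mean_x * mean_y
    var_x = (x * x).rolling(window).mean() - mean_x * mean_x
    var_y = (y * y).rolling(window).mean() - mean_y * mean_y
    # Constant windows (zero variance) produce NaN or inf here.
    return cov / np.sqrt(var_x * var_y)


df = pd.DataFrame(
    {
        "a": [1, 23, 231, 231, 4, 353, 62, 3, 56, 43, 354, 43, 231, 21, 7],
        # Made-up second column, for illustration only.
        "b": [5, 3, 8, 12, 7, 1, 9, 4, 6, 11, 2, 10, 15, 13, 14],
    }
)
df["corr5"] = rolling_corr_via_means(df["a"], df["b"], 5)
print(df["corr5"])
```

Once a fixed pandas release is installed, df["a"].rolling(5).corr(df["b"]) is the simpler choice.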
Problem description
.std() and .rolling().mean() work as intended, but .rolling().std() only returns NaN.
I just upgraded from Python 3.6.5 where the same code did work perfectly.
I am now on Python 3.7, pandas 0.23.2
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.23.2
pytest: None
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.14.5
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
None