BUG: DataFrame.rolling(axis=1) operations drop/ignore float16 and float32 columns #41779

benchittle · 2021-06-02T07:23:25Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import pandas as pd
import numpy as np

# Generate a 4x6 DataFrame
df = pd.DataFrame(np.arange(24).reshape(4, 6), columns=list("abcdef"))
# Make each column a different data type
df = df.astype({"a":"float16", "b":"float32", "c":"float64", "d":"int8", "e":"int16", "f":"int32"})

print(df)
# Output:
#      a     b     c   d   e   f
# 0   0.0   1.0   2.0   3   4   5
# 1   6.0   7.0   8.0   9  10  11
# 2  12.0  13.0  14.0  15  16  17
# 3  18.0  19.0  20.0  21  22  23
print(df.dtypes)
# Output:
# a    float16
# b    float32
# c    float64
# d       int8
# e      int16
# f      int32
# dtype: object

# Rolling minimum across rows
print(df.rolling(window=2, min_periods=1, axis=1).min())
# Output. Notice how the float16 and float32 columns were removed:
#       c     d     e     f
# 0   2.0   2.0   3.0   4.0
# 1   8.0   8.0   9.0  10.0
# 2  14.0  14.0  15.0  16.0
# 3  20.0  20.0  21.0  22.0

Problem description

It seems that rolling operations along rows (axis=1) incorrectly omit columns containing float16s and float32s. The same operations work as expected along columns (axis=0), however.

Expected Output

# Convert float16 and float32 columns to float64s as a workaround
df = df.astype({"a":"float64", "b":"float64"})
# Rolling minimum across rows again
print(df.rolling(window=2, min_periods=1, axis=1).min())
# Output:
#       a     b     c     d     e     f
# 0   0.0   0.0   1.0   2.0   3.0   4.0
# 1   6.0   6.0   7.0   8.0   9.0  10.0
# 2  12.0  12.0  13.0  14.0  15.0  16.0
# 3  18.0  18.0  19.0  20.0  21.0  22.0

Possible Cause

A change made in #36458, specifically this line.
It seems that "float" is an alias specifically for np.float64 and does not match np.float32 or np.float16. Changing that line to
obj = obj.select_dtypes(include="number", exclude=["timedelta"])
so that it includes all numeric dtypes seemed to fix the issue in this case. I can open a PR if there are no objections to this approach.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : 2cb9652
python : 3.9.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : AMD64 Family 23 Model 8 Stepping 2, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_Canada.1252

pandas : 1.2.4
numpy : 1.20.3
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.2
setuptools : 57.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.24.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.18.2
xlrd : None
xlwt : None
numba : None


Rolling operations along rows in DataFrames containing columns of floats remove/ignore float16 and float32 columns. See [this issue](pandas-dev/pandas#41779 (comment))

phofl · 2021-06-02T21:18:35Z

This sounds sensible. Would you like to open a PR?

benchittle · 2021-06-02T21:49:34Z

Will do!

benchittle · 2021-06-02T21:49:39Z

take

benchittle added the Bug and Needs Triage labels Jun 2, 2021

github-actions bot assigned benchittle Jun 2, 2021

phofl added the Window (rolling, ewma, expanding) label and removed the Needs Triage label Jun 3, 2021

mzeitlin11 added this to the Contributions Welcome milestone Jul 2, 2021

This was referenced Aug 3, 2021:

BUG: EWM silently failed float32 #42650 (Merged)

BUG: pandas EWM fails silently if data types are float32 instead of float64 #42452 (Closed)

debnathshoham mentioned this issue Aug 4, 2021:

BUG: dataframe.rolling along rows drops float16 #42884 (Merged)

jreback modified the milestones: Contributions Welcome, 1.4 Aug 4, 2021

jreback closed this as completed in #42884 Aug 5, 2021
