import pandas as pd

data = pd.DataFrame({'INT': [-1, 0, 10, 9],
                     'FLOAT': [-0.148, 0.2347, 38.237, 12.2233]},
                    index=pd.date_range("20180101 00:00", periods=4))
print('Original data:')
print(data.head())
print('\nThis is probably not a bug but my misunderstanding:')
print('(So how would I apply "clip_upper" inplace on parts of the dataframe?)')
data.loc[[True, True, True, False], ['INT']].clip_upper(8, inplace=True)
print(data.head())
# I used then:
# data.loc[[True, True, True, False], ['INT']] = data.loc[[True, True, True, False], ['INT']].clip_upper(8)
print('\nIt seems that clip_upper does not preserve the dtypes:')
print(data.clip_upper(8).head())
print('\nSame for inplace:')
data.clip_upper(8, inplace=True)
print(data.head())
Output of this code:
Original data:
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 10 38.2370
2018-01-04 9 12.2233
(A) This is probably not a bug but my misunderstanding:
(So how would I apply "clip_upper" inplace on parts of the dataframe?)
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 10 38.2370
2018-01-04 9 12.2233
(B) It seems that clip_upper does not preserve the dtypes:
INT FLOAT
2018-01-01 -1.0 -0.1480
2018-01-02 0.0 0.2347
2018-01-03 8.0 8.0000
2018-01-04 8.0 8.0000
(C) Same for inplace:
INT FLOAT
2018-01-01 -1.0 -0.1480
2018-01-02 0.0 0.2347
2018-01-03 8.0 8.0000
2018-01-04 8.0 8.0000
Problem description
clip_upper on a DataFrame with int and float columns converts the int column to float.
Calling data.clip_upper(10) with an integer threshold, I would expect the int column to stay integer and the float column to stay float. However, everything is converted to float (see (B) and (C)).
Moreover, clip_upper with inplace=True does not work with .loc, but this might just be me misunderstanding the concept (see (A)).
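Regarding (A): indexing with .loc and list selectors returns a new object, so inplace=True there only changes that temporary and never reaches data. A minimal sketch of clipping part of one column by assigning the result back is below. It assumes a recent pandas, where clip_upper is no longer available (it was deprecated in 0.24 and removed in 1.0) and clip(upper=...) is used instead. The name mask is only illustrative.

```python
import pandas as pd

data = pd.DataFrame(
    {"INT": [-1, 0, 10, 9], "FLOAT": [-0.148, 0.2347, 38.237, 12.2233]},
    index=pd.date_range("2018-01-01", periods=4),
)

# Rows to clip: the first three only (hypothetical mask, as in the report).
mask = [True, True, True, False]

# Clip the selected values and write them back explicitly.
# clip(upper=8) stands in for the clip_upper(8) call from the report.
data.loc[mask, "INT"] = data.loc[mask, "INT"].clip(upper=8)

print(data)
print(data.dtypes)
```

Selecting the column as a Series ("INT" rather than ['INT']) keeps the clipped values in the column's own integer dtype before they are assigned back.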
In [16]: data
Out[16]:
   int  float
0    1    2.0
1    3    4.0

In [17]: axes_dict = data._construct_axes_dict()
['index', 'columns'] None {'index': RangeIndex(start=0, stop=2, step=1), 'columns': Index(['int', 'float'], dtype='object')}

In [18]: result = data._constructor(data.values, **axes_dict).__finalize__(data)

In [19]: result
Out[19]:
   int  float
0  1.0    2.0
1  3.0    4.0
The underlying problem is the constructor call, which casts the dtype of the int column to float.
I will take a look at how the property decorator works and create a PR.
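The cast described above can be reproduced with public API only. The sketch below uses the same two-column frame as the session and shows that data.values is a single float64 array for mixed int/float columns, so a frame rebuilt from it is all float. The per-column variant at the end is just one possible workaround, assuming a pandas version with clip(upper=...); it is not the fix that was eventually made in pandas.

```python
import pandas as pd

data = pd.DataFrame({"int": [1, 3], "float": [2.0, 4.0]})

# Mixed int/float columns are upcast to one common dtype in .values.
print(data.values.dtype)

# Rebuilding a frame from that array therefore loses the int dtype.
rebuilt = pd.DataFrame(data.values, index=data.index, columns=data.columns)
print(rebuilt.dtypes)

# Possible workaround (an assumption, not the library's fix):
# clip each column separately so every column keeps its own dtype.
per_column = pd.DataFrame(
    {name: col.clip(upper=2) for name, col in data.items()}
)
print(per_column.dtypes)
```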
The same applies to clip_lower.
Expected Output
For (A):
For (B) and (C):
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
pandas: 0.23.4
pytest: 4.0.1
pip: 18.1
setuptools: 40.6.2
Cython: 0.29
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.1
openpyxl: 2.5.11
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.14
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None