Setting a new column to the sum of two existing columns does not work correctly for large DataFrames: X['C']=X['A']+X['B'] fails intermittently, on some runs only, when the frame has more than 10000 rows. X['A'] and X['B'] may contain any float values. This issue is absent in pandas 0.16.2.
I am running 64-bit Python on Windows 10. See the attached notebook for details:
ReadDebug1.zip
To reproduce:
+++++++++++++++++++++++++++++++++++++
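import pandas as pd

DataFrameSize = 10001  # fails above 10000 rows; 10000 or fewer works

# Two float32 columns: A is all 1.0, B is all 2.0
XR = pd.DataFrame({
    'A': pd.Series(1, index=list(range(DataFrameSize)), dtype='float32'),
    'B': pd.Series(2, index=list(range(DataFrameSize)), dtype='float32'),
})

def CleanDebug(X):
    # Assign column C as the element-wise sum of A and B
    X.loc[:, 'C'] = X.loc[:, 'A'] + X.loc[:, 'B']
    # X['C'] = X['A'] + X['B']
    return X

for i in range(1000):
    print('iteration %d' % i)
    ry = CleanDebug(XR)
    # Every row of C should be 3.0, so the sum should be 3 * 10001 = 30003
    assert abs(ry.C.sum() - 30003) < 1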
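
To narrow down which rows are affected, the sum-based assertion above could be replaced by an element-wise comparison against a reference computed on the raw NumPy arrays. This is only a sketch and not part of the original report: `check_sum_column` is a hypothetical helper name, and it assumes the failure shows up as wrong values in `C` rather than as an exception.

```python
import pandas as pd


def check_sum_column(n_rows):
    """Rebuild the frame from the report and return row labels where C != A + B.

    Hypothetical helper for illustration; not part of pandas or of the report.
    """
    frame = pd.DataFrame({
        'A': pd.Series(1, index=list(range(n_rows)), dtype='float32'),
        'B': pd.Series(2, index=list(range(n_rows)), dtype='float32'),
    })
    # Reference result computed on the underlying NumPy arrays,
    # so it does not go through pandas Series arithmetic.
    expected = frame['A'].values + frame['B'].values
    # The assignment under test, written as in the report.
    frame['C'] = frame['A'] + frame['B']
    mismatch = frame['C'].values != expected
    return frame.index[mismatch].tolist()


if __name__ == '__main__':
    # Compare a size reported as working with one reported as failing.
    for n_rows in (10000, 10001):
        bad_rows = check_sum_column(n_rows)
        print('%d rows: %d mismatching rows' % (n_rows, len(bad_rows)))
```

An empty list means the assigned column matches the reference. Since the report describes the failure as intermittent, the check may need to be repeated in a loop before any mismatch appears.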