Data corruption in DataFrame when creating new columns while iterating over the index #2453

rafaljozefowicz · 2012-12-07T23:37:59Z

This is the shortest example I could find. Data somehow gets corrupted when new columns are created while iterating over the index. Pre-generating initial values for the columns (the commented-out part) solved my problem.
I am using pandas 0.9.0.

import pandas as pd
from numpy import nan, isnan

df = pd.DataFrame(index=[0,1])
df[0] = nan
wasCol = {}
# uncommenting these makes the results match
#for col in xrange(100, 200):
#    wasCol[col] = 1
#    df[col] = nan

for i, dt in enumerate(df.index):
    for col in xrange(100, 200):
        if col not in wasCol:
            # first time this column is seen: create it filled with NaN
            wasCol[col] = 1
            df[col] = nan
        df[col][dt] = i

myid = 100
# the same selection evaluated twice; both calls should print the same length
print(len(df.ix[isnan(df[myid]), [myid]]))
print(len(df.ix[isnan(df[myid]), [myid]]))

Outputs:

0
1
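
For readers on a current pandas release, the pre-generation workaround mentioned above might look roughly like the sketch below. It assumes a recent pandas (where `.ix` no longer exists) and swaps the chained `df[col][dt] = i` assignment for a single `.loc` call; neither change is part of the original report, and the names `cols`, `first`, and `second` are illustrative only.

```python
import numpy as np
import pandas as pd

cols = list(range(100, 200))

# Create every column up front, filled with NaN, so no column is added
# while the index is being iterated.
df = pd.DataFrame(np.nan, index=[0, 1], columns=[0] + cols)

for i, dt in enumerate(df.index):
    for col in cols:
        # One .loc call instead of the chained df[col][dt] = i (assumption:
        # modern pandas, where chained assignment is discouraged).
        df.loc[dt, col] = i

myid = 100
# Evaluate the same NaN count twice; the two results should agree.
first = int(df[myid].isna().sum())
second = int(df[myid].isna().sum())
print(first, second)
```

With every column allocated before the loop, both counts should be identical, which is the behavior the report expected.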


wesm · 2012-12-07T23:46:24Z

Looks fine on the development version. I'll see if I can figure out what was going wrong so I can add a unit test.

wesm · 2012-12-08T00:11:29Z

Well, I can verify this was fixed in 5d6e7c8; the bug is still present in v0.9.1. I would suggest upgrading to 0.10.0 or the development version as soon as practical. I'll add a unit test for your example.
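
A regression test along these lines could be sketched as follows. This is not the test that was actually added to pandas; it is a minimal illustration written against a recent pandas release, so it uses `.loc` in place of `.ix` and of the chained assignment from the report. The helper `build_frame_lazily`, the test name, and the reduced column count are all hypothetical.

```python
import numpy as np
import pandas as pd


def build_frame_lazily(n_cols=20):
    """Add each column on first use while iterating over the index."""
    df = pd.DataFrame(index=[0, 1])
    df[0] = np.nan
    seen = set()
    for i, dt in enumerate(df.index):
        for col in range(100, 100 + n_cols):
            if col not in seen:
                # First visit: create the column filled with NaN.
                seen.add(col)
                df[col] = np.nan
            df.loc[dt, col] = i
    return df


def test_repeated_selection_is_stable():
    df = build_frame_lazily()
    myid = 100
    # The same selection, evaluated twice, must give the same answer.
    first = df.loc[df[myid].isna(), [myid]]
    second = df.loc[df[myid].isna(), [myid]]
    assert len(first) == len(second) == 0


test_repeated_selection_is_stable()
```

The check mirrors the two `print` calls in the report: every row of column 100 has been assigned, so both selections should be empty.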

wesm closed this as completed in 7e53266 Dec 8, 2012
