Allow duplicate columns in df.to_csv #3095
Comments
why don't we push this to 0.12? (if it's really an issue, you can pass legacy=True) |
it's a regression. 0.10.1 handles this and the default to_csv in 0.11 doesn't. |
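The regression is about frames whose column labels are not unique. A minimal sketch of the case under discussion, using current pandas (the data here is illustrative, not from the thread):

```python
import pandas as pd
from io import StringIO

# A frame with duplicate column labels -- the case to_csv regressed on.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "a"])

buf = StringIO()
df.to_csv(buf)

# Both "a" columns should survive the round trip to CSV.
header = buf.getvalue().splitlines()[0]
print(header)
```

In current pandas the header line keeps both duplicate labels; the failure described here was specific to the 0.11 to_csv path.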
ok....you can take out the fail-early check and just put a try/except on the chunk writer then....it will still catch dups across blocks, but won't fail on a dup within a single block (that case will fail because the colnamemap will have None as an indexer rather than a value) |
is there a single-dtype dup-columns test? I can put the above fix in... |
Can you give me a recipe for generating blocks so ordered traversal doesn't match column order? |
In [11]: df = mkdf(10, 5)
    ...: df['j'] = pd.Series(range(len(df)))
    ...: df['k'] = pd.Series(map(float, range(len(df))))
    ...: df['l'] = pd.Series(map(str, range(len(df))))
    ...: df._consolidate_inplace()
    ...: # df.columns = ['a'] * len(df.columns)
    ...: bs = df._data.blocks
    ...: bs[0] = df._data.blocks[0]
In [12]: bs
Out[12]:
[FloatBlock: [j, k], 2 x 10, dtype float64,
ObjectBlock: [C_l0_g0, C_l0_g1, C_l0_g2, C_l0_g3, C_l0_g4, l], 6 x 10, dtype object]
In [9]: df
Out[9]:
C0 C_l0_g0 C_l0_g1 C_l0_g2 C_l0_g3 C_l0_g4 j k l
R0
R_l0_g0 R0C0 R0C1 R0C2 R0C3 R0C4 NaN NaN NaN
R_l0_g1 R1C0 R1C1 R1C2 R1C3 R1C4 NaN NaN NaN
R_l0_g2 R2C0 R2C1 R2C2 R2C3 R2C4 NaN NaN NaN
R_l0_g3 R3C0 R3C1 R3C2 R3C3 R3C4 NaN NaN NaN
R_l0_g4 R4C0 R4C1 R4C2 R4C3 R4C4 NaN NaN NaN
R_l0_g5 R5C0 R5C1 R5C2 R5C3 R5C4 NaN NaN NaN
R_l0_g6 R6C0 R6C1 R6C2 R6C3 R6C4 NaN NaN NaN
R_l0_g7 R7C0 R7C1 R7C2 R7C3 R7C4 NaN NaN NaN
R_l0_g8 R8C0 R8C1 R8C2 R8C3 R8C4 NaN NaN NaN
R_l0_g9 R9C0 R9C1 R9C2 R9C3 R9C4 NaN NaN NaN |
I don't think there is. I've got it working; if there are duplicate columns it falls back to using icol |
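The fallback described above can be sketched as follows: when the column labels are not unique, select columns by integer position instead of by label (iloc here stands in for the old icol API; the variable names are illustrative):

```python
import pandas as pd

# Frame with duplicate labels: label-based lookup df["a"] would return
# two columns at once, so positional access is needed instead.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "a", "b"])

if df.columns.is_unique:
    columns = [df[name] for name in df.columns]            # fast label path
else:
    columns = [df.iloc[:, i] for i in range(df.shape[1])]  # positional fallback

print(len(columns))
```

Positional access keeps each duplicate column distinct, at the cost of the slower per-column path.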
len(union of set(keys of each block)) == sum(len(set(keys of b)) for b in blocks) |
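The check above can be sketched directly: duplicates spanning blocks exist exactly when the union of the per-block label sets is smaller than the sum of their sizes (the function name and input shape here are hypothetical):

```python
def has_cross_block_dups(block_labels):
    """block_labels: one list of column labels per internal block."""
    union = set().union(*(set(labels) for labels in block_labels))
    return len(union) != sum(len(set(labels)) for labels in block_labels)

print(has_cross_block_dups([["j", "k"], ["l"]]))  # disjoint blocks
print(has_cross_block_dups([["a", "b"], ["a"]]))  # "a" appears in two blocks
```

Note that `set()` collapses repeats inside one block, so this only flags duplicates across blocks, matching the earlier comment that the single-block dup case needs separate handling.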
ok great |
Ok, fixed in master. |
great, that looks good; i'll leave the other issue open about creating a duplicate indexer that can handle this more general case (but don't hold thy breath) |
ok...reopening this so I remember to do it (or @y-p if you want).. |
actually..let me create another one in 0.11.1 |
Continuing #3059.
See also #3092