column name dedupe when already deduped returns "column_name.1.1" #27021

Closed
boldloop opened this issue Jun 24, 2019 · 1 comment
Labels
Bug · IO CSV (read_csv, to_csv) · Needs Discussion (requires discussion from core team before further action)

Comments

@boldloop

This isn't necessarily a bug, just something I think could be handled a bit more cleanly. The code:

from io import StringIO
import pandas as pd
csv = StringIO('a,a.1,a')
pd.read_csv(csv)

returns

a	a.1	a.1.1

where I think it could return

a	a.1	a.2
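
As a rough illustration of the requested behavior, here is a minimal sketch in plain Python. The helper name `dedupe_names` is hypothetical and is not part of pandas; it only assumes that a repeated name should get the next numeric suffix that does not collide with any name already present in the header.

```python
def dedupe_names(names):
    """Hypothetical helper (not a pandas API): rename repeated names with
    .1, .2, ... while skipping suffixes that already appear in the header."""
    taken = set(names)   # every name present in the original header
    seen = set()         # names emitted so far
    counts = {}          # last suffix number tried for each base name
    result = []
    for name in names:
        if name not in seen:
            # First occurrence: keep the name unchanged.
            seen.add(name)
            result.append(name)
            continue
        # Repeat: advance the suffix until the candidate is unused.
        n = counts.get(name, 0) + 1
        candidate = f"{name}.{n}"
        while candidate in taken or candidate in seen:
            n += 1
            candidate = f"{name}.{n}"
        counts[name] = n
        seen.add(candidate)
        result.append(candidate)
    return result


print(dedupe_names(["a", "a.1", "a"]))  # ['a', 'a.1', 'a.2']
```

Checking candidates against the full header, not only the names emitted so far, is one possible design choice: it leaves every name that was already unique untouched, at the cost of suffixes that may not appear in left-to-right order.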

I ran into this while working with data whose columns were ["Column name", "COLUMN NAME", "COLUMN NAME"]. I work in multiple steps (each task is a separate script that takes the output of the previous one, which helps with auditability), so after the first step the columns became ["Column name", "COLUMN NAME", "COLUMN NAME.1"]. Further down the pipeline, I renamed all columns to all caps for consistency, and when the result was read back in I ended up with ["COLUMN NAME.1.1", "COLUMN NAME", "COLUMN NAME.1"]. Not a huge problem, but a little annoying nonetheless.

I'm going to dig in and see if I can implement this later this week in a pull request—just filing it now so that I don't forget.

mroeschke added the IO CSV (read_csv, to_csv) label on Nov 2, 2019
mroeschke added the Bug and Needs Discussion labels on Jul 10, 2021
@phofl
Member

phofl commented Nov 27, 2021

duplicate of #14704

phofl closed this as completed on Nov 27, 2021