Issue with CSV parser when using usecols #3192

Closed
tr11 opened this issue Mar 27, 2013 · 2 comments · Fixed by #4406
Labels: Bug, Dtype Conversions, IO CSV, IO Data

@tr11
Contributor

tr11 commented Mar 27, 2013

The following code:

import pandas
from io import StringIO

data = u'1,2,3\n4,5,6\n7,8,9'

# Case 1: no usecols
df = pandas.read_csv(StringIO(data),
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)
print()

# Case 2: usecols selects a subset of the columns
df = pandas.read_csv(StringIO(data), usecols=[0, 2],
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)
print()

# Case 3: usecols lists every column
df = pandas.read_csv(StringIO(data), usecols=[0, 1, 2],
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)

outputs

A     object
B      int32
C    float64
dtype: object

A     object
C    float64
dtype: object

A    object
B    object
C    object
dtype: object

The dtypes in the last output should match those in the first. The problem appears when a converters dictionary is passed together with a usecols list that selects every available column.
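The expectation stated above can be written as a small check. This is only a sketch: the `read` helper and the `COMMON` dictionary are hypothetical names introduced here for brevity, and the check assumes a pandas version in which this issue has been fixed.

```python
from io import StringIO

import pandas as pd

DATA = "1,2,3\n4,5,6\n7,8,9"

# Keyword arguments shared by both calls, copied from the report.
COMMON = dict(
    header=None,
    names=["A", "B", "C"],
    dtype={"B": int, "C": float},
    converters={"A": str},
)


def read(usecols=None):
    """Parse DATA, optionally restricting the columns (hypothetical helper)."""
    return pd.read_csv(StringIO(DATA), usecols=usecols, **COMMON)


baseline = read()                      # no usecols
all_columns = read(usecols=[0, 1, 2])  # usecols lists every column

# Both frames are expected to report the same dtype for each column.
dtypes_agree = baseline.dtypes.to_dict() == all_columns.dtypes.to_dict()
print("dtypes agree:", dtypes_agree)
```

If the two results disagree, comparing `baseline.dtypes` and `all_columns.dtypes` side by side shows which columns lost their requested dtype.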

The commit https://github.com/tr11/pandas/commit/1ac3700e72a5861a7d8544a72d77a4d64c71f118 in my fork seems to fix this issue.

@garaud
Contributor

garaud commented Jun 20, 2013

Hi,

I read your issue out of curiosity and was wondering whether you sent a pull request for your fix. I didn't see one in https://github.com/pydata/pandas/pulls. It may be worth checking whether this issue has already been fixed.

Damien G.

@tr11
Contributor Author

tr11 commented Jun 22, 2013

Damien,

I didn't send a pull request. I'll do so after checking if the issue still
exists.

Thanks for the reminder,

TR
