Unicode not acceptable input for usecols kwarg in read_csv() #13219

Closed
PeterKucirek opened this issue May 18, 2016 · 4 comments
Labels: Bug, IO CSV (read_csv, to_csv), Unicode (Unicode strings)

@PeterKucirek

I caught this bug while updating from version 0.18.0 to 0.18.1. The kwarg usecols no longer accepts unicode column labels.

Example below:

from io import StringIO
import pandas as pd

s = u'''AAA,BBB,CCC,DDD
0.056674973,8,True,a
2.613230982,2,False,b
3.568935038,7,False,a
'''
buff = StringIO(s)
# On Python 2 the u'...' labels are unicode objects, which triggers the error below
print(pd.read_csv(buff, usecols=[u'AAA', u'BBB']))

>> ValueError: The elements of 'usecols' must either be all strings or all integers

I note that 0.18.1 introduced the requirement that the elements of usecols be all strings or all integers. This makes sense, but it looks like the implementation also rejects unicode strings.
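
For illustration only, here is a minimal sketch of a check that enforces "all strings or all integers" while still accepting unicode labels. The name validate_usecols and the text_types tuple are hypothetical and are not taken from the pandas source; the actual pandas validation may be written quite differently.

```python
# Hypothetical sketch, not the pandas implementation.
try:
    text_types = (str, unicode)  # Python 2: byte strings and unicode strings
except NameError:
    text_types = (str,)  # Python 3: str is already a unicode string


def validate_usecols(usecols):
    """Return usecols as a list if it is all integers or all strings."""
    usecols = list(usecols)
    # bool is a subclass of int, so exclude it explicitly
    if all(isinstance(c, int) and not isinstance(c, bool) for c in usecols):
        return usecols
    if all(isinstance(c, text_types) for c in usecols):
        return usecols
    raise ValueError("The elements of 'usecols' must either be "
                     "all strings or all integers")


print(validate_usecols([u'AAA', u'BBB']))
```

Under this sketch, a list of unicode labels passes on both Python 2 and Python 3, while a list mixing labels and positions such as ['AAA', 0] still raises ValueError.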

@jreback
Contributor

jreback commented May 18, 2016

makes sense, pull-requests welcome to fix

cc @gfyoung

@jreback jreback added this to the 0.18.2 milestone May 18, 2016
@jreback jreback added the IO CSV read_csv, to_csv label May 18, 2016
@gfyoung
Member

gfyoung commented May 18, 2016

@PeterKucirek: good catch! PR is definitely in order here. Shouldn't be too bad to implement.

@geraldstanje

geraldstanje commented Oct 14, 2016

Hi,

I get the same error:

ValueError: The elements of 'usecols' must either be all strings, all unicode, or all integers

I am using pandas with the following version:

C:\>pip install --upgrade pandas
Collecting pandas
  Downloading pandas-0.19.0-cp27-cp27m-win_amd64.whl (7.0MB)
    100% |################################| 7.1MB 193kB/s
Collecting pytz>=2011k (from pandas)
  Downloading pytz-2016.7-py2.py3-none-any.whl (480kB)
    100% |################################| 481kB 1.4MB/s
Collecting numpy>=1.7.0 (from pandas)
  Downloading numpy-1.11.2-cp27-none-win_amd64.whl (7.4MB)
    100% |################################| 7.4MB 184kB/s
Requirement already up-to-date: python-dateutil in c:\users\ger\appdata\local\continuum\anaconda2\lib\site-packages (from pandas)
Requirement already up-to-date: six>=1.5 in c:\users\ger\appdata\local\continuum\anaconda2\lib\site-packages (from python-dateutil->pandas)
Installing collected packages: pytz, numpy, pandas
  Found existing installation: pytz 2016.6.1
    DEPRECATION: Uninstalling a distutils installed project (pytz) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling pytz-2016.6.1:
      Successfully uninstalled pytz-2016.6.1
  Found existing installation: numpy 1.11.1
    Uninstalling numpy-1.11.1:
      Successfully uninstalled numpy-1.11.1
  Found existing installation: pandas 0.18.1
    Uninstalling pandas-0.18.1:
      Successfully uninstalled pandas-0.18.1
Successfully installed numpy-1.11.2 pandas-0.19.0 pytz-2016.7
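
The error message quoted in this comment names strings, unicode, and integers as separate cases, so one thing that may be worth checking before posting a reproduction is which element types the usecols list actually contains. The helper below is a hypothetical diagnostic (describe_usecols is not a pandas function) and is only a sketch of that check.

```python
def describe_usecols(usecols):
    """Return the sorted names of the element types found in usecols."""
    return sorted({type(col).__name__ for col in usecols})


# More than one type name in the result means the list mixes element types.
print(describe_usecols([u'AAA', u'BBB']))
print(describe_usecols(['AAA', 0]))
```

On Python 2, a list that mixes 'AAA' and u'BBB' would report both str and unicode, which is the kind of detail a maintainer would likely want to see alongside the failing call.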

@gfyoung
Member

gfyoung commented Oct 14, 2016

@geraldstanje : Can you show us the code snippet that broke?
