BUG: dict_keys cannot be used as pd.read_csv's names parameter #36928

Closed
3 tasks done
abmyii opened this issue Oct 6, 2020 · 2 comments · Fixed by #36937
Labels: IO CSV (read_csv, to_csv), Regression (functionality that used to work in a prior pandas version)

abmyii (Contributor) commented Oct 6, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

test.csv:

a,b
1,2

import pandas as pd
data = {'a': 10, 'b': 20}
pd.read_csv('test.csv', names=data.keys())

Error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers.py", line 691, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers.py", line 451, in _read
    _validate_names(kwds.get("names", None))
  File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers.py", line 424, in _validate_names
    raise ValueError("Names should be an ordered collection.")
ValueError: Names should be an ordered collection.

Problem description

When a dict_keys object is passed as names, read_csv raises instead of using the keys as column names. This is because dict_keys fails the check "if not is_list_like(names, allow_sets=False):" in _validate_names and is not indexable either, so the error is understandable, but it would be nice to be able to pass dict_keys without issue.
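To illustrate why the check quoted above rejects the keys view, here is a small standard-library sketch. It only inspects the dict_keys object itself and does not call pandas; the variable names are illustrative.

```python
from collections import abc

data = {"a": 10, "b": 20}
keys = data.keys()

# dict_keys is a KeysView, and KeysView is a subclass of abc.Set,
# so a check that excludes sets will also exclude dict_keys.
is_set_like = isinstance(keys, abc.Set)
is_keys_view = isinstance(keys, abc.KeysView)

# Unlike a plain set, the view iterates in insertion order.
ordered = list(keys)

# Views cannot be indexed, which is the "not indexable" part.
try:
    keys[0]
    indexable = True
except TypeError:
    indexable = False

print(is_set_like, is_keys_view, ordered, indexable)
```

So the view is ordered in practice but is still classified as a set, which is what trips the validation.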

Expected Output

Passing the dict keys to names shouldn't fail.
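Until that works, one possible workaround is to convert the keys to a list before passing them. The sketch below assumes pandas is installed and uses an in-memory buffer (the hypothetical csv_text) in place of test.csv.

```python
import io

import pandas as pd

data = {"a": 10, "b": 20}
csv_text = "a,b\n1,2\n"  # stands in for the contents of test.csv

# list(data) materialises the keys in insertion order, which passes
# the ordered-collection check that dict_keys fails.
df = pd.read_csv(io.StringIO(csv_text), names=list(data))

print(list(df.columns))
```

As in the original call, no header argument is given, so the first line of the file is read as a data row rather than as a header.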

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 4e55346
python : 3.8.3.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-118-generic
Version : #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8

pandas : 1.2.0.dev0+632.g4e553464f
numpy : 1.18.2
pytz : 2019.3
dateutil : 2.8.1
pip : 20.1.1
setuptools : 46.1.3
Cython : None
pytest : None
hypothesis : None
sphinx : 1.6.7
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 0.999999999
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : None
pandas_datareader: None
bs4 : 4.6.0
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

abmyii added the Bug and Needs Triage labels Oct 6, 2020
abmyii changed the title from "BUG: dict.keys() cannot be used as pd.read_csv's names parameter" to "BUG: dict_keys cannot be used as pd.read_csv's names parameter" Oct 6, 2020
dsaxton added the IO CSV and Regression labels and removed the Bug and Needs Triage labels Oct 7, 2020
dsaxton (Member) commented Oct 7, 2020

Thanks @abmyii, probably we should be making an exception for ordered sets here. PR welcome.

ref #34956
cc @MJafarMashhadi

abmyii (Contributor, Author) commented Oct 7, 2020

> probably we should be making an exception for ordered sets here. PR welcome.

PR submitted, but I don't know of a type that would cover all ordered sets, so I simply used abc.KeysView, which fixes the immediate issue. Is that acceptable, or does more work need to be done?
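For readers following along, the approach described here could look roughly like the sketch below. It is not the code from the PR: is_ordered_list_like is a simplified, hypothetical stand-in for pandas' is_list_like(names, allow_sets=False), and validate_names is an illustrative name.

```python
from collections import abc


def is_ordered_list_like(obj):
    """Simplified stand-in for is_list_like(obj, allow_sets=False)."""
    return (
        isinstance(obj, abc.Iterable)
        and not isinstance(obj, (str, bytes))
        and not isinstance(obj, abc.Set)
    )


def validate_names(names):
    """Reject unordered collections, but let dict keys views through."""
    if names is None:
        return
    if not (is_ordered_list_like(names) or isinstance(names, abc.KeysView)):
        raise ValueError("Names should be an ordered collection.")


# A keys view is now accepted because it matches abc.KeysView.
validate_names({"a": 10, "b": 20}.keys())

# A plain set is still rejected.
try:
    validate_names({"a", "b"})
    set_rejected = False
except ValueError:
    set_rejected = True

print(set_rejected)
```

The special case is deliberately narrow: it admits dict keys views only, so other ordered set-like types would still need their own handling.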
