pandas._libs.parsers.TextReader('中文文件名-chinese-in-filename.csv') OSError: Initializing from file failed #26074

Closed
QGB opened this issue Apr 13, 2019 · 6 comments · Fixed by #28498
Labels
good first issue, Testing

Comments

@QGB

QGB commented Apr 13, 2019

kwargs={'delimiter': ',', 'doublequote': True, 'escapechar': None, 'quotechar': '"', 'quoting': 0, 'skipinitialspace': False, 'lineterminator': None, 'header': 0, 'index_col': None, 'names': None, 'skiprows': None, 'na_values': {'nan', '', 'NaN', 'N/A', '-NaN', 'null', '#N/A N/A', 'n/a', '-nan', '-1.#QNAN', 'NA', '1.#IND', '-1.#IND', '1.#QNAN', 'NULL', '#NA', '#N/A'}, 'true_values': None, 'false_values': None, 'converters': {}, 'dtype': None, 'keep_default_na': True, 'thousands': None, 'comment': None, 'decimal': b'.', 'usecols': None, 'verbose': False, 'encoding': 'gb2312', 'compression': None, 'mangle_dupe_cols': True, 'tupleize_cols': False, 'skip_blank_lines': True, 'delim_whitespace': False, 'na_filter': True, 'low_memory': True, 'memory_map': False, 'error_bad_lines': True, 'warn_bad_lines': True, 'float_precision': None, 'na_fvalues': set(), 'allow_leading_cols': True}

parsers.TextReader('中文文件名-chinese-in-filename.csv',**kwargs)

Traceback (most recent call last)
in ()
----> 1 parsers.TextReader(f,**_2)

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

OSError: Initializing from file failed

@WillAyd
Member

WillAyd commented Apr 15, 2019

This isn't part of the API - what error are you getting from user-facing methods?

@WillAyd added the Needs Info label on Apr 15, 2019
@QGB
Author

QGB commented Apr 18, 2019

@willweil


In [70]: pd.read_csv(f, delimiter=',',encoding='gb2312')
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-70-4d1e02136d2f> in <module>()
----> 1 pd.read_csv(f, delimiter=',',encoding='gb2312')

E:\QGB\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
    676                     skip_blank_lines=skip_blank_lines)
    677
--> 678         return _read(filepath_or_buffer, kwds)
    679
    680     parser_f.__name__ = name

E:\QGB\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    438
    439     # Create the parser.
--> 440     parser = TextFileReader(filepath_or_buffer, **kwds)
    441
    442     if chunksize or iterator:

E:\QGB\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, f, engine, **kwds)
    785             self.options['has_index_names'] = kwds['has_index_names']
    786
--> 787         self._make_engine(self.engine)
    788
    789     def close(self):

E:\QGB\Anaconda3\lib\site-packages\pandas\io\parsers.py in _make_engine(self, engine)
   1012     def _make_engine(self, engine='c'):
   1013         if engine == 'c':
-> 1014             self._engine = CParserWrapper(self.f, **self.options)
   1015         else:
   1016             if engine == 'python':

E:\QGB\Anaconda3\lib\site-packages\pandas\io\parsers.py in __init__(self, src, **kwds)
   1706         kwds['usecols'] = self.usecols
   1707
-> 1708         self._reader = parsers.TextReader(src, **kwds)
   1709
   1710         passed_names = self.names is None

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

OSError: Initializing from file failed

@WillAyd
Member

WillAyd commented Apr 18, 2019

Hmm, still not able to reproduce. Can you post the output of pd.show_versions()? Have you tried on master?

@peterpanmj
Contributor

peterpanmj commented Jul 12, 2019

@QGB
It is a bug in a previous version. I ran into it in 0.23.4.
I can't reproduce it in 0.24.2 or on current master.

In [4]: import pandas as pd
   ...: df = pd.DataFrame({"A":[1]})
   ...: df.to_csv("./中文.csv")
   ...: pd.read_csv("./中文.csv")

In [5]: pd.__version__
Out[5]: '0.24.2'

Works fine.

In 0.23.4, you can still work around it by using
pd.read_csv(f, engine="python")
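
As a minimal sketch of that workaround, the snippet below reads a CSV with a non-ASCII filename in two ways: with `engine="python"` as suggested above, and by opening the file in Python first and passing the handle to `pd.read_csv`. The helper name `read_csv_non_ascii`, the temporary directory, and the UTF-8 encoding are assumptions made for illustration. The handle-based variant is not from this thread; it relies on the assumption that the failure comes from the C parser opening the file by path.

```python
import os
import tempfile

import pandas as pd


def read_csv_non_ascii(path, encoding="utf-8", **kwargs):
    """Read a CSV whose path may contain non-ASCII characters.

    Hypothetical helper: Python opens the file itself, so the parser
    receives an open handle instead of a filename.
    """
    with open(path, "r", encoding=encoding, newline="") as handle:
        return pd.read_csv(handle, **kwargs)


# Throwaway file with a Chinese filename (illustrative only).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "中文.csv")
pd.DataFrame({"A": [1]}).to_csv(csv_path, index=False, encoding="utf-8")

# Variant 1: pass an already opened handle.
via_handle = read_csv_non_ascii(csv_path)

# Variant 2: the workaround from the comment above.
via_python_engine = pd.read_csv(csv_path, engine="python")
```

Both variants should return the same single-column frame. The Python engine is generally slower than the C engine, so the handle-based variant may be preferable for large files.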

@WillAyd
Member

WillAyd commented Sep 6, 2019

I think if someone wants to add a test case for this, we can close it out.
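
A regression test along these lines could look like the sketch below. It is not the test that was eventually merged. The function name `check_non_ascii_filename_round_trip`, the sample data, and the use of `tempfile` are assumptions chosen for illustration. It writes a small frame to a file with a Chinese filename and reads it back with the default C engine.

```python
import os
import tempfile

import pandas as pd


def check_non_ascii_filename_round_trip(directory, encoding="utf-8"):
    """Write and re-read a CSV whose filename contains Chinese characters."""
    path = os.path.join(directory, "中文文件名-chinese-in-filename.csv")
    expected = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
    expected.to_csv(path, index=False, encoding=encoding)

    # Default C engine: the code path that raised OSError in the report.
    result = pd.read_csv(path, encoding=encoding)
    pd.testing.assert_frame_equal(result, expected)
    return result


with tempfile.TemporaryDirectory() as tmp:
    round_tripped = check_non_ascii_filename_round_trip(tmp)
```

In a real test suite the temporary directory would more likely come from a fixture, and the check might be parametrized over both parser engines.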

@WillAyd added the good first issue and Testing labels and removed the Can't Repro and Needs Info labels on Sep 6, 2019
@naveenkaushik2504
Contributor

I have submitted a PR for this. This is my first contribution, so please guide me if anything is not done the right way. Thanks.
