BUG: Better handling of empty data reads with Python engine
With the Python engine, reading an empty file used to raise a bare
StopIteration with no error message. This PR distinguishes the case
where no columns can be inferred, which now raises an EmptyDataError
for both the C and Python engines.
[ci skip]
In order to standardize the ``read_csv`` API for both the C and Python engines, both will now raise an
``EmptyDataError``, a subclass of ``ValueError``, in response to empty columns or header (:issue:`12506`)

Previous behaviour:

.. code-block:: python

   In [1]: df = pd.read_csv(StringIO(''), engine='c')
   ...
   ValueError: No columns to parse from file

   In [2]: df = pd.read_csv(StringIO(''), engine='python')
   ...
   StopIteration

New behaviour:

.. code-block:: python

   In [1]: df = pd.read_csv(StringIO(''), engine='c')
   ...
   pandas.io.common.EmptyDataError: No columns to parse from file

   In [2]: df = pd.read_csv(StringIO(''), engine='python')
   ...
   pandas.io.common.EmptyDataError: No columns to parse from file

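As a minimal sketch of what the unified behaviour allows, calling code can handle empty input the same way for either engine. The helper name ``read_or_none`` is hypothetical, and the import assumes a later pandas release that exposes the exception as ``pandas.errors.EmptyDataError`` rather than the ``pandas.io.common`` location shown above.

```python
from io import StringIO

import pandas as pd

# Assumption: a pandas release that exposes the exception under
# pandas.errors (the entry above shows the older pandas.io.common path).
from pandas.errors import EmptyDataError


def read_or_none(text, engine):
    """Hypothetical helper: return a DataFrame, or None for empty input."""
    try:
        return pd.read_csv(StringIO(text), engine=engine)
    except EmptyDataError:
        # Raised by both engines when no columns can be inferred.
        return None


# The same handler covers both engines.
results = {engine: read_or_none("", engine) for engine in ("c", "python")}
```

Because ``EmptyDataError`` subclasses ``ValueError``, code written against the old C-engine behaviour (``except ValueError``) should keep working.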
In addition to this error change, several others have been made as well:

- ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` (:issue:`12551`)
- A ``CParserError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine cannot parse a column
- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine encounters a ``NaN`` value in an integer column
- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when ``true_values`` is specified, and the C engine encounters an element in a column containing unencodable bytes
- The ``pandas.parser.OverflowError`` exception has been removed and replaced with Python's built-in ``OverflowError`` exception
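
The practical effect of the hierarchy changes listed above is that a single ``except ValueError`` clause can cover these parser failures. The sketch below is illustrative only: ``classify`` is a hypothetical helper, and it assumes a later pandas release in which ``CParserError`` is available under the name ``pandas.errors.ParserError``.

```python
from io import StringIO

import pandas as pd

# Assumption: later pandas releases expose these under pandas.errors,
# with CParserError available as ParserError.
from pandas.errors import EmptyDataError, ParserError


def classify(text):
    """Hypothetical helper: name the ValueError subclass raised, or 'ok'."""
    try:
        pd.read_csv(StringIO(text), engine="c")
    except ValueError as exc:
        # One handler catches both empty-data and tokenizing errors.
        return type(exc).__name__
    return "ok"


# Both exception types derive from ValueError.
hierarchy_ok = issubclass(EmptyDataError, ValueError) and issubclass(
    ParserError, ValueError
)

outcomes = [
    classify(""),                     # no columns to parse
    classify("a,b\n1,2\n3,4,5,6\n"),  # a row with too many fields
    classify("a,b\n1,2\n"),           # well-formed input
]
```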