Commit cbfce38

Revert "ENH: Add Arrow CSV Reader (#43072)"
This reverts commit 44e8822.
1 parent 9a81226 commit cbfce38

37 files changed: +40 -583 lines changed

asv_bench/benchmarks/io/csv.py

+2 -2
@@ -206,7 +206,7 @@ def time_read_csv(self, bad_date_value):
 class ReadCSVSkipRows(BaseIO):
 
     fname = "__test__.csv"
-    params = ([None, 10000], ["c", "python", "pyarrow"])
+    params = ([None, 10000], ["c", "python"])
     param_names = ["skiprows", "engine"]
 
     def setup(self, skiprows, engine):
@@ -320,7 +320,7 @@ def time_read_csv_python_engine(self, sep, decimal, float_precision):
 
 
 class ReadCSVEngine(StringIORewind):
-    params = ["c", "python", "pyarrow"]
+    params = ["c", "python"]
     param_names = ["engine"]
 
     def setup(self, engine):
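
Note on the benchmark change above: asv runs a parameterized benchmark once per combination of the listed parameter values, so dropping "pyarrow" from the engine list removes whole benchmark cases rather than changing existing ones. The sketch below only illustrates that expansion with the standard library; `expand_params` is a hypothetical helper written for this note and is not part of asv or pandas.

```python
from itertools import product

# Parameter grids copied from ReadCSVSkipRows before and after the revert.
PARAMS_BEFORE = ([None, 10000], ["c", "python", "pyarrow"])
PARAMS_AFTER = ([None, 10000], ["c", "python"])
PARAM_NAMES = ["skiprows", "engine"]


def expand_params(params, param_names):
    """Return one dict per benchmark case (Cartesian product of the lists).

    Hypothetical helper for illustration only; asv does this expansion itself.
    """
    return [dict(zip(param_names, combo)) for combo in product(*params)]


cases_before = expand_params(PARAMS_BEFORE, PARAM_NAMES)
cases_after = expand_params(PARAMS_AFTER, PARAM_NAMES)

# Cases that no longer run once the pyarrow engine is reverted.
dropped = [case for case in cases_before if case not in cases_after]

if __name__ == "__main__":
    print(len(cases_before), "cases before,", len(cases_after), "cases after")
    for case in dropped:
        print("dropped:", case)
```

Under this reading, the revert takes `ReadCSVSkipRows` from six cases to four, and both dropped cases are the pyarrow ones.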

doc/source/user_guide/io.rst

+8 -46
@@ -160,15 +160,9 @@ dtype : Type name or dict of column -> type, default ``None``
   (unsupported with ``engine='python'``). Use ``str`` or ``object`` together
   with suitable ``na_values`` settings to preserve and
   not interpret dtype.
-engine : {``'c'``, ``'python'``, ``'pyarrow'``}
-  Parser engine to use. The C and pyarrow engines are faster, while the python engine
-  is currently more feature-complete. Multithreading is currently only supported by
-  the pyarrow engine.
-
-  .. versionadded:: 1.4.0
-
-     The "pyarrow" engine was added as an *experimental* engine, and some features
-     are unsupported, or may not work correctly, with this engine.
+engine : {``'c'``, ``'python'``}
+  Parser engine to use. The C engine is faster while the Python engine is
+  currently more feature-complete.
 converters : dict, default ``None``
   Dict of functions for converting values in certain columns. Keys can either be
   integers or column labels.
@@ -1628,17 +1622,11 @@ Specifying ``iterator=True`` will also return the ``TextFileReader`` object:
 Specifying the parser engine
 ''''''''''''''''''''''''''''
 
-Pandas currently supports three engines, the C engine, the python engine, and an experimental
-pyarrow engine (requires the ``pyarrow`` package). In general, the pyarrow engine is fastest
-on larger workloads and is equivalent in speed to the C engine on most other workloads.
-The python engine tends to be slower than the pyarrow and C engines on most workloads. However,
-the pyarrow engine is much less robust than the C engine, which lacks a few features compared to the
-Python engine.
-
-Where possible, pandas uses the C parser (specified as ``engine='c'``), but it may fall
-back to Python if C-unsupported options are specified.
-
-Currently, options unsupported by the C and pyarrow engines include:
+Under the hood pandas uses a fast and efficient parser implemented in C as well
+as a Python implementation which is currently more feature-complete. Where
+possible pandas uses the C parser (specified as ``engine='c'``), but may fall
+back to Python if C-unsupported options are specified. Currently, C-unsupported
+options include:
 
 * ``sep`` other than a single character (e.g. regex separators)
 * ``skipfooter``
@@ -1647,32 +1635,6 @@ Currently, options unsupported by the C and pyarrow engines include:
 Specifying any of the above options will produce a ``ParserWarning`` unless the
 python engine is selected explicitly using ``engine='python'``.
 
-Options that are unsupported by the pyarrow engine which are not covered by the list above include:
-
-* ``float_precision``
-* ``chunksize``
-* ``comment``
-* ``nrows``
-* ``thousands``
-* ``memory_map``
-* ``dialect``
-* ``warn_bad_lines``
-* ``error_bad_lines``
-* ``on_bad_lines``
-* ``delim_whitespace``
-* ``quoting``
-* ``lineterminator``
-* ``converters``
-* ``decimal``
-* ``iterator``
-* ``dayfirst``
-* ``infer_datetime_format``
-* ``verbose``
-* ``skipinitialspace``
-* ``low_memory``
-
-Specifying these options with ``engine='pyarrow'`` will raise a ``ValueError``.
-
 .. _io.remote:
 
 Reading/writing remote files
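
Note on the restored wording above: the documentation says that a C-unsupported option such as ``skipfooter`` produces a ``ParserWarning`` unless ``engine='python'`` is selected explicitly. A minimal sketch of that behaviour follows, assuming pandas is installed; the sample data and the helper `read_with_footer_skipped` are made up for this note and do not appear in the commit.

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning

# Made-up sample data: two data rows followed by a footer line to skip.
CSV_TEXT = "a,b\n1,2\n3,4\ntotal,6\n"


def read_with_footer_skipped(engine=None):
    """Read CSV_TEXT with skipfooter=1 and collect any ParserWarning.

    Hypothetical helper for illustration. With engine=None the engine
    argument is not passed, so pandas chooses the engine itself.
    """
    kwargs = {"skipfooter": 1}
    if engine is not None:
        kwargs["engine"] = engine
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame = pd.read_csv(io.StringIO(CSV_TEXT), **kwargs)
    parser_warnings = [
        w for w in caught if issubclass(w.category, ParserWarning)
    ]
    return frame, parser_warnings


if __name__ == "__main__":
    # Default engine: pandas falls back to the Python parser and warns.
    _, default_warnings = read_with_footer_skipped()
    print("default engine warnings:", len(default_warnings))

    # Explicit Python engine: same result, no fallback warning.
    _, python_warnings = read_with_footer_skipped(engine="python")
    print("python engine warnings:", len(python_warnings))
```

Both calls should parse the same two data rows; only the call that leaves the engine unspecified is expected to warn about the fallback.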

doc/source/whatsnew/v1.4.0.rst

+3 -6
@@ -78,13 +78,10 @@ Styler
 
 There are also bug fixes and deprecations listed below.
 
-.. _whatsnew_140.enhancements.pyarrow_csv_engine:
+.. _whatsnew_140.enhancements.enhancement2:
 
-Multithreaded CSV reading with a new CSV Engine based on pyarrow
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-:func:`pandas.read_csv` now accepts ``engine="pyarrow"`` (requires at least ``pyarrow`` 0.17.0) as an argument, allowing for faster csv parsing on multicore machines
-with pyarrow installed. See the :doc:`I/O docs </user_guide/io>` for more info. (:issue:`23697`)
+enhancement2
+^^^^^^^^^^^^
 
 .. _whatsnew_140.enhancements.other:
 
pandas/io/parsers/arrow_parser_wrapper.py

+0 -138
This file was deleted.