Commit 2b012f0

Backport PR #53295 on branch 2.0.x (BUG: read_csv raising for arrow engine and parse_dates) (#53317)
BUG: read_csv raising for arrow engine and parse_dates (#53295) (cherry picked from commit aaf5037)
1 parent 092ef68 commit 2b012f0

3 files changed, +24 -1 lines changed

doc/source/whatsnew/v2.0.2.rst

+1 -1
@@ -29,6 +29,7 @@ Bug fixes
 - Bug in :func:`api.interchange.from_dataframe` was returning :class:`DataFrame`'s of incorrect sizes when called on slices (:issue:`52824`)
 - Bug in :func:`api.interchange.from_dataframe` was unnecessarily raising on bitmasks (:issue:`49888`)
 - Bug in :func:`merge` when merging on datetime columns on different resolutions (:issue:`53200`)
+- Bug in :func:`read_csv` raising ``OverflowError`` for ``engine="pyarrow"`` and ``parse_dates`` set (:issue:`53295`)
 - Bug in :func:`to_datetime` was inferring format to contain ``"%H"`` instead of ``"%I"`` if date contained "AM" / "PM" tokens (:issue:`53147`)
 - Bug in :meth:`DataFrame.convert_dtypes` ignores ``convert_*`` keywords when set to False ``dtype_backend="pyarrow"`` (:issue:`52872`)
 - Bug in :meth:`DataFrame.sort_values` raising for PyArrow ``dictionary`` dtype (:issue:`53232`)
@@ -37,7 +38,6 @@ Bug fixes
 - Bug in :meth:`pd.array` raising for ``NumPy`` array and ``pa.large_string`` or ``pa.large_binary`` (:issue:`52590`)
 - Bug in :meth:`DataFrame.__getitem__` not preserving dtypes for :class:`MultiIndex` partial keys (:issue:`51895`)
 
-
 .. ---------------------------------------------------------------------------
 .. _whatsnew_202.other:
 

pandas/io/parsers/base_parser.py

+3
@@ -1120,6 +1120,9 @@ def unpack_if_single_element(arg):
         return arg
 
     def converter(*date_cols, col: Hashable):
+        if len(date_cols) == 1 and date_cols[0].dtype.kind in "Mm":
+            return date_cols[0]
+
         if date_parser is lib.no_default:
             strs = parsing.concat_date_cols(date_cols)
             date_fmt = (
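
The early return added above can be illustrated in isolation. The following is a minimal sketch, not the pandas implementation: `converter_sketch` is a hypothetical stand-in, and its fallback branch only approximates string-based parsing with NumPy. It assumes the fix works because a column that already has a datetime64 (`"M"`) or timedelta64 (`"m"`) dtype kind needs no re-parsing, which appears to be the case the pyarrow engine can produce.

```python
import numpy as np


def converter_sketch(*date_cols):
    """Hypothetical stand-in for the patched converter (not pandas' actual code)."""
    # Same guard as the patch: a single column whose dtype kind is
    # "M" (datetime64) or "m" (timedelta64) is returned untouched.
    if len(date_cols) == 1 and date_cols[0].dtype.kind in "Mm":
        return date_cols[0]

    # Assumed fallback for illustration only: join the string parts of each
    # row with "T" and let NumPy parse the ISO 8601 result.
    joined = ["T".join(parts) for parts in zip(*date_cols)]
    return np.array(joined, dtype="datetime64[s]")


# Already-typed column: returned as the same object, no parsing attempted.
typed = np.array(["2000-01-01T00:00:00", "2000-01-01T00:00:01"], dtype="datetime64[ns]")
same = converter_sketch(typed)

# String columns (date and time split across two columns): parsed by the fallback.
dates = np.array(["2000-01-01", "2000-01-01"])
times = np.array(["00:00:00", "00:00:01"])
parsed = converter_sketch(dates, times)
```

The guard checks `len(date_cols) == 1` first, so multi-column date specifications still go through the concatenating path even if one of the parts happens to be datetime-typed.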

pandas/tests/io/parser/test_parse_dates.py

+20
@@ -2218,3 +2218,23 @@ def test_parse_dates_dict_format_index(all_parsers):
         index=Index([Timestamp("2019-12-31"), Timestamp("2020-12-31")], name="a"),
     )
     tm.assert_frame_equal(result, expected)
+
+
+def test_parse_dates_arrow_engine(all_parsers):
+    # GH#53295
+    parser = all_parsers
+    data = """a,b
+2000-01-01 00:00:00,1
+2000-01-01 00:00:01,1"""
+
+    result = parser.read_csv(StringIO(data), parse_dates=["a"])
+    expected = DataFrame(
+        {
+            "a": [
+                Timestamp("2000-01-01 00:00:00"),
+                Timestamp("2000-01-01 00:00:01"),
+            ],
+            "b": 1,
+        }
+    )
+    tm.assert_frame_equal(result, expected)
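
Outside the pandas test suite, the same scenario can be tried with the public API. This is a sketch under assumptions: it picks `engine="pyarrow"` only when pyarrow is importable and otherwise falls back to the default C engine, so it runs either way. Going by the whatsnew entry, the pyarrow path would be expected to raise ``OverflowError`` on pandas releases without this fix.

```python
import importlib.util
from io import StringIO

import pandas as pd

# Same CSV payload as the regression test.
data = """a,b
2000-01-01 00:00:00,1
2000-01-01 00:00:01,1"""

# Assumption for portability: use the pyarrow engine only if pyarrow is installed.
engine = "pyarrow" if importlib.util.find_spec("pyarrow") else "c"

result = pd.read_csv(StringIO(data), parse_dates=["a"], engine=engine)

# Column "a" should come back with a datetime dtype regardless of engine.
print(engine, result.dtypes.to_dict())
```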
