BUG: read_csv raising for arrow engine and parse_dates #53295

phofl · 2023-05-18T21:23:26Z

Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Not sure if this is an actual regression, but it ties into the dtype backend, and it would be nice if this worked, since it currently raises for every parse_dates case with the numpy dtype backend (and I want to advertise the engine in my pyarrow blog...)


mroeschke · 2023-05-19T00:03:48Z

pandas/io/parsers/base_parser.py

@@ -1137,6 +1137,9 @@ def unpack_if_single_element(arg):
        return arg

    def converter(*date_cols, col: Hashable):
+        if len(date_cols) == 1 and date_cols[0].dtype.kind in "Mm":


Not sure if I understood your comment in the OP correctly, but does this also fix parse_dates for numpy dtypes?

phofl · 2023-05-19

Only NumPy dtypes as far as I know. We fixed this for Arrow dtypes a couple of weeks ago.
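For readers unfamiliar with the `dtype.kind in "Mm"` check in the hunk above: NumPy reports kind `"M"` for datetime64 and `"m"` for timedelta64, so the test is true for both. A minimal sketch, assuming plain NumPy arrays; the helper name `is_datetimelike_kind` is hypothetical and not part of pandas:

```python
import numpy as np


def is_datetimelike_kind(arr: np.ndarray) -> bool:
    # Hypothetical helper mirroring the condition in the diff:
    # kind "M" is datetime64, kind "m" is timedelta64.
    return arr.dtype.kind in "Mm"


dates = np.array(
    ["2000-01-01T00:00:00", "2000-01-01T00:00:01"], dtype="datetime64[s]"
)
deltas = dates - dates[0]  # timedelta64 array
strings = np.array(["2000-01-01 00:00:00"], dtype=object)  # unparsed text

print(is_datetimelike_kind(dates))
print(is_datetimelike_kind(deltas))
print(is_datetimelike_kind(strings))
```

Object arrays of strings have kind `"O"`, so they would still go through the regular date parsing path.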

lithomas1 · 2023-05-19T02:08:32Z

Can you share an MRE of the issue you're having?

I'm wondering if #50056 fixes/causes the issue.

phofl · 2023-05-19T08:30:18Z

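from io import StringIO
import pandas as pd

data = """a,b
2000-01-01 00:00:00,1
2000-01-01 00:00:01,1"""

# Requires pyarrow; this call raised before the fix in this PR.
result = pd.read_csv(StringIO(data), parse_dates=["a"], engine="pyarrow")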

Nope, still raises after your PR is in

Edit: The main issue is that pyarrow already infers the column as datetime, and we then try to parse it as dates again, which raises.
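The idea behind the guard can be sketched roughly as follows. This is an illustration under assumptions, not the code in `base_parser.py`: the function `parse_date_column` is hypothetical, and `pd.to_datetime` stands in for whatever conversion the parser actually applies.

```python
import pandas as pd


def parse_date_column(values: pd.Series) -> pd.Series:
    # Hypothetical stand-in for the parser's date converter.
    # If the engine already produced a datetime-like column,
    # return it untouched instead of parsing it a second time.
    if values.dtype.kind in "Mm":
        return values
    # Otherwise fall back to ordinary string-to-datetime parsing.
    return pd.to_datetime(values)


raw = ["2000-01-01 00:00:00", "2000-01-01 00:00:01"]
already_parsed = pd.Series(pd.to_datetime(raw))  # what an inferring engine might hand over
raw_strings = pd.Series(raw)  # what a non-inferring engine hands over

same = parse_date_column(already_parsed)
parsed = parse_date_column(raw_strings)
```

With the early return, a column that arrives already typed as datetime is passed through unchanged, while string columns are still converted.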

mroeschke · 2023-05-19T18:07:04Z

Thanks @phofl

lumberbot-app · 2023-05-19T18:07:23Z

Owee, I'm MrMeeseeks, Look at me.

There seems to be a conflict, please backport manually. Here are approximate instructions:

Checkout backport branch and update it.

git checkout 2.0.x
git pull

Cherry pick the first parent branch of this PR on top of the older branch:

git cherry-pick -x -m1 aaf503784a7cbf6425d681551d1e1e686ce14815

You will likely have some merge/cherry-pick conflict here, fix them and commit:

git commit -am 'Backport PR #53295: BUG: read_csv raising for arrow engine and parse_dates'

Push to a named branch:

git push YOURFORK 2.0.x:auto-backport-of-pr-53295-on-2.0.x

Create a PR against branch 2.0.x, I would have named this PR:

"Backport PR #53295 on branch 2.0.x (BUG: read_csv raising for arrow engine and parse_dates)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

phofl · 2023-05-19T18:17:09Z

Will backport later today or tomorrow

lithomas1 · 2023-05-19T18:33:45Z

Might want to consider backporting #50056 and #52087 too.


phofl · 2023-05-20T13:52:43Z

The first one is a bit less common, can think about the second one if you like

lithomas1 · 2023-05-20T18:21:58Z

That'd be great. I think a couple of issues were opened about NaNs in string columns not being read properly, related to the second one.


phofl added 2 commits May 18, 2023 23:22

BUG: read_csv raising for arrow engine and parse_dates

4ff9513

Merge remote-tracking branch 'upstream/main' into 53295

01ee19c

# Conflicts: # doc/source/whatsnew/v2.0.2.rst

phofl added the IO CSV (read_csv, to_csv) and Arrow (pyarrow functionality) labels May 18, 2023

mroeschke reviewed May 19, 2023


mroeschke added this to the 2.0.2 milestone May 19, 2023

mroeschke approved these changes May 19, 2023


mroeschke merged commit aaf5037 into pandas-dev:main May 19, 2023

lumberbot-app bot added the Still Needs Manual Backport label May 19, 2023

phofl deleted the 53295 branch May 19, 2023 18:16

phofl added a commit to phofl/pandas that referenced this pull request May 20, 2023

BUG: read_csv raising for arrow engine and parse_dates (pandas-dev#53295) (cherry picked from commit aaf5037)

1d50eb8

phofl mentioned this pull request May 20, 2023

Backport PR #53295 on branch 2.0.x (BUG: read_csv raising for arrow engine and parse_dates) #53317

Merged

phofl removed the Still Needs Manual Backport label May 20, 2023

lithomas1 pushed a commit that referenced this pull request May 20, 2023

Backport PR #53295 on branch 2.0.x (BUG: read_csv raising for arrow engine and parse_dates) (#53317)

2b012f0

BUG: read_csv raising for arrow engine and parse_dates (#53295) (cherry picked from commit aaf5037)

topper-123 pushed a commit to topper-123/pandas that referenced this pull request May 22, 2023

BUG: read_csv raising for arrow engine and parse_dates (pandas-dev#53295)

62b6442

Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023

BUG: read_csv raising for arrow engine and parse_dates (pandas-dev#53295)

6ff19d2
