Skip to content

Test to datetime null to NaT #45512

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Feb 1, 2022

Conversation

ryangilmour
Copy link
Contributor

@ryangilmour ryangilmour commented Jan 20, 2022

@mroeschke mroeschke added the Testing pandas testing functions or related to the test suite label Jan 20, 2022
@mroeschke
Copy link
Member

Looks like there's typing errors:

pandas/tests/tools/test_to_datetime.py:1074: error: List item 0 has incompatible type "None"; expected "NaTType"  [list-item]

("input", "expected"),
(
(
Series([NaT] * 200 + [None] * 200, dtype="object"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind using less values here? (20/40 should be good)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we have a predefined breakpoitn with 50 values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but IIRC that's needed when specifically testing caching behavior

@ryangilmour
Copy link
Contributor Author

ryangilmour commented Jan 21, 2022

@mroeschke - thanks for the review. I'll reduce the size of the test series. I've now got the dev environment set up to run mypy locally now. But I'm unsure how to change the code to avoid typing errors.

Mypy complains with either of the following Series objects because they have an incompatible type.

Series([NaT] * 20 + [None] * 20, dtype="object")
Series([None] * 20 + [NaT] * 20, dtype="object")

Possible fixes:

  • use # type: ignore
    • not seen this in any other tests, so doesn't seem appropriate
  • ignore the test case
    • is it not a valid test case?
    • it is a reasonable/possible input to the to_datetime() function, so I feel it ought to be tested
  • something else?

@mroeschke
Copy link
Member

I would just use # type: ignore[list-item] for now, and if you're interested you can look into adjusting the typing of to_datetime in a follow up PR.

@jreback jreback added this to the 1.5 milestone Jan 21, 2022
@jreback jreback added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Jan 23, 2022
@jreback
Copy link
Contributor

jreback commented Jan 23, 2022

I think we DO need to test a larger input (as well) here e.g. 'It only occurs if the input is large enough to force the caching mechanism.'

@jreback
Copy link
Contributor

jreback commented Jan 31, 2022

@ryangilmour can you merge master and address comments

@ryangilmour
Copy link
Contributor Author

Yep - i'll get round to this in the next few days.

@jreback jreback merged commit 2ff7dc0 into pandas-dev:main Feb 1, 2022
@jreback
Copy link
Contributor

jreback commented Feb 1, 2022

thanks @ryangilmour

phofl pushed a commit to phofl/pandas that referenced this pull request Feb 14, 2022
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: pd.to_datetime() raises InvalidIndexError with Null-like arguments
3 participants