-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Cast pd.NA to pd.NaT in to_datetime #32214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I am not fully sure we should do this, since pd.NA and pd.NaT behave differently (and we probably want to have a timestamp dtype that uses pd.NA at some point) |
Ok, I'll close the PR, but should probably leave the issue open for now? |
if the user specifically calls to_datetime, it makes sense that pd.NA would cast to NaT |
@jbrockmendel so the reason I initially didn't add this check to We might want to make missing.pyx independent on the rest of |
@@ -22,6 +22,8 @@ from pandas._libs.tslibs.util cimport ( | |||
get_nat, is_integer_object, is_float_object, is_datetime64_object, | |||
is_timedelta64_object) | |||
|
|||
from pandas._libs.missing cimport C_NA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel is this a problem? (the dependency)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to avoid it if possible. The property "tslibs doesnt rely on anything except _config" is really nice for reasoning-about-code purposes.
@dsaxton is it feasible to put pd.NA into a higher-level check than checknull_with_nat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it may be but I think it would have to be before https://github.com/pandas-dev/pandas/blob/master/pandas/core/arrays/datetimes.py#L1844, do you think that looks possible? There's also the "trick" of not importing NA
at all and checking for (val == val) is (val != val)
in checknull_with_nat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, going to merge this as-is. @jbrockmendel or @dsaxton if you can figure a way to decouple this a bit would be great (though since missing is pretty dependency free i actually don't mind this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsaxton im having trouble figuring out which usage of checknull_with_nat is the relevant one here, can you point it out? The only ones I see when grepping are in libmissing.checknull
(which already checks for C_NA, which this renders redundant), and in Period.__new__
, which seems like it shouldn't be relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like i was confusing checknull_with_nat with is_null_datetimelike, never mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this suggests we may have too many null-checking-like functions)
Maybe a file with just NAType, as that doesn't depend on most of the things in libmissing (many of which do depend on tslibs). This seems like it will be necessary if we want e.g. |
Is this something we need for 1.0.2, or can it wait for 1.1.0? if the latter id like to revert and take more time to figure out the dependency structure |
this is a usability issue in 1.0.2 happy to change for 1.1 to a proper structure (this is already merged and backport ed) |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff