BUG: NaT instead of error for timestamp concat with None #53042
…ne_with_timestamp_raisis_error
Thanks!
pandas/core/internals/managers.py
@@ -2358,7 +2359,7 @@ def _preprocess_slice_or_indexer(
 def make_na_array(dtype: DtypeObj, shape: Shape, fill_value) -> ArrayLike:
     if isinstance(dtype, DatetimeTZDtype):
         # NB: exclude e.g. pyarrow[dt64tz] dtypes
-        i8values = np.full(shape, fill_value._value)
+        i8values = np.full(shape, NaT.value)
The reason a test is failing is that, in cases where fill_value is not None, you need to keep it.
> the reason a test is failing is because in cases where fill_value is not None you need to keep it
i8values = np.full(shape, NaT.value)
if fill_value is None:
    i8values = np.full(shape, NaT.value)
else:
    i8values = np.full(shape, fill_value._value)
- Do you think this change could fulfill that?
- Deprecation warning
I got a deprecation warning while debugging.
FutureWarning: The behavior of DataFrame concatenation with all-NA entries is deprecated. In a future version, this will no longer exclude all-NA columns when determining the result dtypes. To retain the old behavior, cast the all-NA columns to the desired dtype before the concat operation.
So I tried re-running the example after changing the dtype of the None column to datetime64[ns, UTC]. It worked and didn't raise any errors.
>>> import pandas as pd
>>> data = [{'A': None}]
>>> df = pd.DataFrame(data, dtype="datetime64[ns, UTC]")
>>> data_2 = [{'A': pd.to_datetime("1990-12-20 00:00:00+00:00")}]
>>> df_2 = pd.DataFrame(data_2)
>>> # df.dtypes
>>> pd.concat([df, df_2])
# o/p
                          A
0                       NaT
0 1990-12-20 00:00:00+00:00
Try Timestamp(fill_value)._value
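A minimal sketch of why this suggestion can cover both cases, assuming `pd.Timestamp(None)` returns `NaT` (as it does in recent pandas releases). The helper name `fill_i8` is hypothetical and not part of pandas, and the public `.value` attribute is used here in place of the private `._value`.

```python
import numpy as np
import pandas as pd


def fill_i8(shape, fill_value):
    """Hypothetical helper: build an int64 array from a fill value.

    Wrapping fill_value in pd.Timestamp means None becomes NaT,
    so no separate `if fill_value is None` branch is needed.
    """
    return np.full(shape, pd.Timestamp(fill_value).value, dtype="i8")


# None is converted to NaT, whose integer value is the NaT sentinel.
none_filled = fill_i8((2,), None)

# An actual timestamp keeps its own integer value.
ts = pd.Timestamp("1990-12-20 00:00:00+00:00")
ts_filled = fill_i8((2,), ts)
```

This only illustrates the None handling; it does not address fill values whose resolution differs from the target dtype.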
Hi,
…ne_with_timestamp_raisis_error
…ne_with_timestamp_raisis_error
One check is failing. Is there anything that I'm skipping?
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
I think that's unrelated. Can you merge main and see if that fixes it?
@@ -2405,7 +2406,7 @@ def _preprocess_slice_or_indexer(
 def make_na_array(dtype: DtypeObj, shape: Shape, fill_value) -> ArrayLike:
     if isinstance(dtype, DatetimeTZDtype):
         # NB: exclude e.g. pyarrow[dt64tz] dtypes
-        i8values = np.full(shape, fill_value._value)
+        i8values = np.full(shape, Timestamp(fill_value)._value)
I think this may break if your dtype has a different unit than the fill_value. Can you add a test where this is the case? This also needs a test for the original bug.
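One reading of the unit concern, sketched under the assumption of pandas 2.0 or later (where `Timestamp.as_unit` and the private `_value` attribute exist): `_value` is stored in the Timestamp's own resolution, so the same instant yields different integers at different units. The variable names are illustrative only.

```python
import pandas as pd

ts = pd.Timestamp("1990-12-20 00:00:00+00:00")

# Same instant, two different resolutions.
ts_ns = ts.as_unit("ns")
ts_s = ts.as_unit("s")

# The timestamps compare equal as points in time...
same_instant = ts_ns == ts_s

# ...but the raw integers differ by the unit conversion factor,
# so writing ts_s._value into nanosecond-based storage would
# be interpreted as a different date.
ratio = ts_ns._value // ts_s._value
```

If this reading is right, a test for the mismatch would build the fill value at one unit and the `DatetimeTZDtype` at another, then check that the concatenated result still holds the expected instant.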
> also needs a test for the original bug
My bad! I forgot to add that. I've added it now 👍
Thanks for the reminder.
> i think this may mess up if your dtype has a different unit than the fill_value. can you add a test where this is the case?
Are you talking about something like this, where the value is a timestamp but its dtype is different?
df = pd.DataFrame([{'A':None}])
df_2 = pd.DataFrame([{'A': pd.to_datetime("1990-12-20 00:00:00+00:00")}], dtype="int32")
cd = pd.concat([df, df_2])
Sorry if I did not understand it correctly. Could you please explain it in more detail?
I'm running into issues related to #52093 - it seems like this PR has stalled, and should fix the issue we're hitting.
Can you merge main and I'll take another look?
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.