BUG: NaT instead of error for timestamp concat with None #53042

srkds · 2023-05-02T18:58:56Z

closes BUG: AttributeError raised with pd.concat between a None and Timestamp #52093
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.


srkds · 2023-05-03T18:01:40Z

~~I missed adding an entry in the whatsnew file. So will add that,~~ any other suggestions?

Thanks!

jbrockmendel · 2023-05-03T22:56:03Z

pandas/core/internals/managers.py

@@ -2358,7 +2359,7 @@ def _preprocess_slice_or_indexer(
 def make_na_array(dtype: DtypeObj, shape: Shape, fill_value) -> ArrayLike:
    if isinstance(dtype, DatetimeTZDtype):
        # NB: exclude e.g. pyarrow[dt64tz] dtypes
-        i8values = np.full(shape, fill_value._value)
+        i8values = np.full(shape, NaT.value)


the reason a test is failing is because in cases where fill_value is not None you need to keep it

> the reason a test is failing is because in cases where fill_value is not None you need to keep it

Suggested change

-        i8values = np.full(shape, NaT.value)
+        if fill_value is None:
+            i8values = np.full(shape, NaT.value)
+        else:
+            i8values = np.full(shape, fill_value._value)

Do you think this change could fulfill that?

Deprecation warning
I got a deprecation warning while debugging.

FutureWarning: The behavior of DataFrame concatenation with all-NA entries is deprecated. In a future version, this will no longer exclude all-NA columns when determining the result dtypes. To retain the old behavior, cast the all-NA columns to the desired dtype before the concat operation.

So I tried re-running the example with the dtype of the None column set to datetime64[ns, UTC]. It worked and didn't raise any errors.

>>> data = [{'A': None}]
>>> df = pd.DataFrame(data, dtype="datetime64[ns, UTC]")
>>> data_2 = [{'A': pd.to_datetime("1990-12-20 00:00:00+00:00")}]
>>> df_2 = pd.DataFrame(data_2)
>>> # df.dtypes
>>> pd.concat([df, df_2])
# o/p
                          A
0                       NaT
0 1990-12-20 00:00:00+00:00
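The workaround above follows the advice in the FutureWarning: give the all-NA column an explicit dtype before concatenating. A minimal sketch of that idea, assuming a recent pandas; the names `na_frame` and `ts_frame` are illustrative, and the dtype is copied from the timestamp frame rather than hard-coded so that both columns match.

```python
import pandas as pd

ts = pd.to_datetime("1990-12-20 00:00:00+00:00")
ts_frame = pd.DataFrame([{"A": ts}])

# Build the all-NA column with the same tz-aware dtype as the real data,
# instead of leaving it as an object column that holds None.
na_frame = pd.DataFrame({"A": pd.Series([None], dtype=ts_frame["A"].dtype)})

# Both inputs now share one dtype, so concat has nothing to infer.
result = pd.concat([na_frame, ts_frame])
print(result)
```

Copying the dtype from the populated frame avoids guessing the unit or timezone of the other side.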

Try Timestamp(fill_value)._value
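A short sketch of why this suggestion covers both cases: `pd.Timestamp(None)` evaluates to `pd.NaT`, so a single expression handles a missing fill value and a real timestamp alike. The helper `fill_i8` is hypothetical (it is not pandas code), and it uses the public `.value` attribute rather than the private `._value` shown in the diff.

```python
import numpy as np
import pandas as pd


def fill_i8(shape, fill_value):
    """Hypothetical helper: int64 array filled with the timestamp's integer value."""
    # Timestamp(None) is NaT, whose integer value is the int64 minimum.
    return np.full(shape, pd.Timestamp(fill_value).value, dtype="int64")


ts = pd.Timestamp("1990-12-20", tz="UTC")

print(fill_i8((2,), None))  # NaT sentinel repeated
print(fill_i8((2,), ts))    # nanoseconds since the epoch, repeated
```

Note that `.value` always reports nanoseconds, so this sketch says nothing about dtypes with other units.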

srkds · 2023-05-08T14:55:13Z

Hi,
Any suggestions or changes from my side?


srkds · 2023-05-10T15:04:26Z

One check is failing. Is there anything that I'm skipping?

github-actions · 2023-06-12T00:05:46Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

jbrockmendel · 2023-06-24T20:02:17Z

> One check is failing. Is there anything that I'm skipping?

I think unrelated. can you merge main and see if that does it

jbrockmendel · 2023-06-24T20:03:27Z

pandas/core/internals/managers.py

@@ -2405,7 +2406,7 @@ def _preprocess_slice_or_indexer(
 def make_na_array(dtype: DtypeObj, shape: Shape, fill_value) -> ArrayLike:
    if isinstance(dtype, DatetimeTZDtype):
        # NB: exclude e.g. pyarrow[dt64tz] dtypes
-        i8values = np.full(shape, fill_value._value)
+        i8values = np.full(shape, Timestamp(fill_value)._value)


i think this may mess up if your dtype has a different unit than the fill_value. can you add a test where this is the case? also needs a test for the original bug

> also needs a test for the original bug

My bad! I forgot to add that. It's added now 👍
Thanks for the reminder.

> i think this may mess up if your dtype has a different unit than the fill_value. can you add a test where this is the case?

Are you talking about something like this, where the value is a timestamp but the dtype is different?

df = pd.DataFrame([{'A': None}])
df_2 = pd.DataFrame([{'A': pd.to_datetime("1990-12-20 00:00:00+00:00")}], dtype="int32")
cd = pd.concat([df, df_2])

Sorry if I did not understand correctly. Could you please explain in more detail?
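One reading of the unit concern, sketched under assumptions: the integer behind a timestamp counts ticks of its unit, so a nanosecond count written into an array that is later interpreted as seconds lands on the wrong instant. The snippet relies on `Timestamp.as_unit` (available in pandas 2.0 and later) and plain NumPy views. It only illustrates the mismatch and is not the fix that pandas adopted.

```python
import numpy as np
import pandas as pd

ts = pd.Timestamp("1990-12-20", tz="UTC")

# The same instant expressed as integer ticks in two different units.
ns_ticks = ts.as_unit("ns").asm8.astype("int64")
s_ticks = ts.as_unit("s").asm8.astype("int64")
print(ns_ticks, s_ticks)

# Mismatch: nanosecond ticks reinterpreted as a seconds-resolution array.
wrong = np.full((2,), ns_ticks, dtype="int64").view("M8[s]")

# Match: ticks converted to the target unit before filling.
right = np.full((2,), s_ticks, dtype="int64").view("M8[s]")
print(right[0])
```

A test for this case would presumably build one frame whose tz-aware column has a non-nanosecond unit and check that the concatenated values still compare equal to the original timestamp.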

aph3rson · 2023-07-21T19:29:04Z

I'm running into issues related to #52093 - it seems like this PR has stalled, and should fix the issue we're hitting.
@jbrockmendel is there anything else needed for this PR, or can this be merged?

jbrockmendel · 2023-07-22T03:51:15Z

can you merge main and I'll take another look

mroeschke · 2023-08-01T17:27:29Z

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

srkds added 2 commits May 3, 2023 00:25

BUG: NaT instead of error for timestamp

d75762d

Merge remote-tracking branch 'upstream/main' into bug/52093/concat_no…

e17bd97

…ne_with_timestamp_raisis_error

jbrockmendel reviewed May 3, 2023


mroeschke added the Missing-data and Reshaping labels May 4, 2023

srkds added 4 commits May 8, 2023 23:12

Merge remote-tracking branch 'upstream/main' into bug/52093/concat_no…

8804de4

…ne_with_timestamp_raisis_error

Timestamp

c1c5675

Merge remote-tracking branch 'upstream/main' into bug/52093/concat_no…

634e522

…ne_with_timestamp_raisis_error

whatsnew entry

4171a7c

srkds requested a review from jbrockmendel May 12, 2023 15:24

github-actions bot added the Stale label Jun 12, 2023

jbrockmendel reviewed Jun 24, 2023


srkds added 2 commits June 26, 2023 14:41

Resolved merge conflict: accepted both changes

11db8f1

added actual bug test case

416f4a8

mroeschke closed this Aug 1, 2023

yuanx749 mentioned this pull request Aug 5, 2023

BUG: fix AttributeError raised with pd.concat between a None and timezone-aware Timestamp #54428

Merged

4 tasks
