Skip to content

[bug] don't remove timezone-awareness when using the method from Dat… #30277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 24, 2019

Conversation

MarcoGorelli
Copy link
Member

@MarcoGorelli MarcoGorelli commented Dec 14, 2019

…aFrame

@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch from 7c4e047 to 6357d84 Compare December 14, 2019 18:40
@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch 3 times, most recently from 05352dc to 7ccc897 Compare December 14, 2019 19:28
@MarcoGorelli MarcoGorelli changed the title 🐛 don't remove timezone-awareness when using the method from Dat… [WIP] 🐛 don't remove timezone-awareness when using the method from Dat… Dec 14, 2019
@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch 3 times, most recently from 6e99e17 to f59cf29 Compare December 15, 2019 08:22
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

append is overly complicated and duplicating many parts of concat.

I think you might be able to remove almost all of 6751-6771, e.g. once its a Series . All of the column stuff is not necessary (I could be wrong but give this a try)

IOW this does the right thing

In [17]: df = pd.DataFrame({'A': pd.date_range('20130101', periods=3, tz='UTC'), 'B': range(3)})                                                                                                                            

In [18]: pd.concat([df, df.iloc[0].to_frame().T.infer_objects()]).dtypes                                                                                                                                                    
Out[18]: 
A    datetime64[ns, UTC]
B                  int64
dtype: object

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Dec 15, 2019
@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch 3 times, most recently from 3b5cab7 to 22d3dc9 Compare December 16, 2019 11:27
@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Dec 16, 2019

All of the column stuff is not necessary (I could be wrong but give this a try)

@jreback Tried this, but it seems it's necessary for pandas/tests/reshape/test_concat.py::TestAppend::test_append_same_columns_type

@MarcoGorelli MarcoGorelli changed the title [WIP] 🐛 don't remove timezone-awareness when using the method from Dat… [bug] don't remove timezone-awareness when using the method from Dat… Dec 18, 2019
@mroeschke
Copy link
Member

@MarcoGorelli this change solved the original issue. Maybe give this a shot:

(pandas-dev) matthewroeschke:pandas-mroeschke matthewroeschke$ git diff
diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index 766437dba..62391b7cc 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -6771,7 +6771,7 @@ class DataFrame(NDFrame):
                 combined_columns = self.columns.astype(object).append(idx_diff)
             other = other.reindex(combined_columns, copy=False)
             other = DataFrame(
-                other.values.reshape((1, len(other))),
+                other._values,
                 index=index,
                 columns=combined_columns,
             )

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Dec 23, 2019

@mroeschke I would still need to reshape though, wouldn't I? Else I don't see how

pandas/tests/frame/test_combine_concat.py::TestDataFrameConcatCommon::test_append_series_dict

would pass.

But if I do reshape, then pandas/tests/frame/test_combine_concat.py::TestDataFrameConcatCommon::test_append_timestamps_aware_or_naive fails with

AttributeError: 'DatetimeArray' object has no attribute 'reshape'

And I can't just pass the dtype directly to the constructor, else I'm blocked by #30428

@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch from 383694c to f87f987 Compare December 23, 2019 11:23
@mroeschke
Copy link
Member

Extension arrays (DatetimeArray) are still essentially 1D only right now, so reshaping is not possible. As you can see in the test_append_series_dict it's failing the verify_integrety=True test which I think are not really well integrated with extension arrays.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small changes. I would also like to test this with categorical, nullable integer and other extension types (likely this fixes).

so if you can file an issue for this (and followup PR would be great).

can also do here if you'd like alternately

@MarcoGorelli MarcoGorelli force-pushed the dont-remove-datetimeawareness branch from d61979d to e5a08c7 Compare December 24, 2019 10:34
@MarcoGorelli
Copy link
Member Author

@mroeschke Thanks for your explanation. What would you like me to do then, use ._value and change test_append_series_dict, or keep as is?

@MarcoGorelli
Copy link
Member Author

@jreback thanks for your review, I've made those changes.

I've opened up another issue for the other dtypes, will work on that next week if no one else has taken it up by then

@jreback jreback added this to the 1.0 milestone Dec 24, 2019
@jreback jreback merged commit fa4949f into pandas-dev:master Dec 24, 2019
@jreback
Copy link
Contributor

jreback commented Dec 24, 2019

thanks!

@MarcoGorelli MarcoGorelli deleted the dont-remove-datetimeawareness branch December 24, 2019 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Appending to DataFrame removes the timezone-awareness
3 participants