
BUG: Fix concat series loss of timezone #24027


Merged: 25 commits, Dec 5, 2018
Changes shown from 3 commits

Commits (25):
d80afa0  BUG: Fix concat series loss of timezone (jakezimmer, Nov 30, 2018)
f4b751d  Merge branch 'master' of https://github.com/pandas-dev/pandas (evangelinehl, Dec 1, 2018)
159c4e6  Fixed naming error for is_datetimetz since this function is no longer… (evangelinehl, Dec 1, 2018)
2450097  Attempted to use _concat_compat to rectify the timezone bug (jakezimmer, Dec 1, 2018)
9cb20c4  Merge remote-tracking branch 'origin/master' (jakezimmer, Dec 1, 2018)
7f9dd52  Attempt to fix tz error with concat compat instead of union (jakezimmer, Dec 1, 2018)
6cb2022  changing behavior to be based on tz (jakezimmer, Dec 1, 2018)
a4da449  Attempting to fix differing dimensions bug (evangelinehl, Dec 2, 2018)
fe83e6d  Another attempt to fix dimensions bug (evangelinehl, Dec 2, 2018)
2cbb533  Just trying to test different versions here (evangelinehl, Dec 2, 2018)
f527dcc  Trying to fix dimensions bug now that Travis CI is passing but others… (evangelinehl, Dec 2, 2018)
583ce49  tests failed so changing it back to when travis ci succeeded (evangelinehl, Dec 2, 2018)
01a2c10  Changing it back because we're trying to figure out if concat_compat … (evangelinehl, Dec 2, 2018)
683dccf  Reverting back to version when all tests passed (evangelinehl, Dec 3, 2018)
857c6be  Restored blank lines (evangelinehl, Dec 3, 2018)
64da4c0  Added test case for the new tz output (evangelinehl, Dec 3, 2018)
9e699e4  Fixed style issues (evangelinehl, Dec 3, 2018)
64182c5  Fixed the whitespace issue in linting (jakezimmer, Dec 3, 2018)
b630d58  Merge branch 'master' into PR_TOOL_MERGE_PR_24027 (jreback, Dec 4, 2018)
c7dcdb4  fix up (jreback, Dec 4, 2018)
165689e  updated whatsnew (v0.24.0) to reflect changes (jakezimmer, Dec 4, 2018)
43b2e2a  Merge branch 'master' into master (jakezimmer, Dec 4, 2018)
634c736  no changes since @jreback's fix up commit (jakezimmer, Dec 5, 2018)
0b86ef9  Update v0.24.0.rst (jakezimmer, Dec 5, 2018)
1867b3a  removed trailing whitespace (jakezimmer, Dec 5, 2018)
3 changes: 2 additions & 1 deletion pandas/core/dtypes/concat.py
@@ -193,7 +193,8 @@ def _concat_categorical(to_concat, axis=0):

     def _concat_asobject(to_concat):
         to_concat = [x.get_values() if is_categorical_dtype(x.dtype)
-                     else np.asarray(x).ravel() for x in to_concat]
+                     else np.asarray(x).ravel() if not is_datetime64tz_dtype(x)
Contributor:

this is not the way.
_concat_compat already handles this

Contributor:

I didn’t realize we do call _concat_compat from here... I think we’ve already lost the tz by the time we call it.
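That loss is easy to demonstrate: the `np.asarray(...).ravel()` coercion in `_concat_asobject` does not preserve the tz-aware `datetime64[ns, tz]` dtype (a minimal sketch; the exact ndarray dtype that comes back varies by pandas version):

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.date_range("2018-01-01", periods=2, tz="UTC"))
arr = np.asarray(s).ravel()

# Whatever ndarray dtype comes back (tz-naive datetime64[ns] in the pandas
# of this era, object in later versions), it is no longer the tz-aware
# datetime64[ns, UTC] extension dtype the series started with.
print(s.dtype, "->", arr.dtype)
```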

Contributor Author:
So do these if checks even need to be there? Not sure what this means for the change we had here.


@jreback after closer examination, I'm not sure if _concat_compat actually already handles this. If the type is also a category, it simply calls _concat_categorical which then summons _concat_asobject. I think that therefore the change to the code should happen inside _concat_categorical or _concat_asobject.


@jreback calling _concat_compat before the list is modified like this causes _concat_compat to just call _concat_categorical again and back and forth until they reach the maximum recursion depth. We have to modify the array somehow before that happens.
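The cycle described above can be sketched with toy stand-ins (hypothetical names, not pandas code): if the categorical branch hands the unmodified inputs straight back to the dispatcher, the two functions bounce between each other until the recursion limit is hit.

```python
def is_cat(x):
    # Stand-in for is_categorical_dtype.
    return isinstance(x, set)

def concat_compat(to_concat):
    # Dispatcher: route anything categorical to the categorical helper.
    if any(is_cat(x) for x in to_concat):
        return concat_categorical(to_concat)
    return [v for x in to_concat for v in x]

def concat_categorical(to_concat):
    # Buggy pattern: nothing is converted before calling back into the
    # dispatcher, so the categorical check fires again, forever.
    return concat_compat(to_concat)

try:
    concat_compat([{1, 2}, [3, 4]])
except RecursionError:
    print("maximum recursion depth reached")
```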

Contributor:

I guess I'm not sure why we aren't doing

to_concat = [np.asarray(x.astype(object)) for x in to_concat]

The function name is _concat_asobject, so let's convert to object and concat them. Can you see if that works?

Please write the test first though, and ensure that the test fails with master. Then implement the fix and make sure it works.
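As a quick check of the suggested approach: converting to object first keeps each element as a tz-aware `Timestamp`, rather than coercing away the tz-aware dtype (a sketch of the idea, not the PR's final code):

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.date_range("2018-01-01", periods=2, tz="US/Eastern"))

# astype(object) first: the ndarray holds tz-aware Timestamp objects.
arr = np.asarray(s.astype(object))
print(arr.dtype)      # object
print(arr[0].tzinfo)  # timezone is still attached to each element
```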

Contributor Author (@evangelinehl, Dec 3, 2018):

I believe I tried this previously but the build fails (I think that was the dimensions bug we had before) @TomAugspurger

Contributor:

Seems like we have some inconsistencies in concat that are breaking this approach. IIRC there's an outstanding PR fixing some of these up.

I was going by the loose rule of "different dtypes means the result type is object". But things like

In [1]: import pandas as pd; import numpy as np; import pickle

In [2]: a = pd.Series([1, 2], dtype='category')

In [3]: b = pd.Series([1, None], dtype='category')

In [4]: pd.concat([a, b], ignore_index=True)
Out[4]:
0    1.0
1    2.0
2    1.0
3    NaN
dtype: float64

mess that up. Not sure what's best here...

+                     else np.asarray(x.astype(object)) for x in to_concat]
         res = _concat_compat(to_concat)
         if axis == 1:
             return res.reshape(1, len(res))
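Following the reviewer's advice to write the test first, a regression-style check in the spirit of the commit "Added test case for the new tz output" might look like the following (a sketch, not the PR's actual test; the resulting dtype differs across pandas versions, but the individual timestamps should keep their timezone after the fix):

```python
import pandas as pd

def test_concat_categorical_keeps_tz():
    # Concatenate a tz-aware series with its categorical counterpart;
    # before this PR the timezone was dropped in the result.
    a = pd.Series(pd.date_range("2017-01-01", periods=2, tz="US/Pacific"))
    b = pd.Series(a.astype("category"))
    result = pd.concat([a, b], ignore_index=True)
    # Every concatenated timestamp should still carry its timezone.
    assert all(ts.tzinfo is not None for ts in result)

test_concat_categorical_keeps_tz()
```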