-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: pivot_table losing tz #32558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: pivot_table losing tz #32558
Conversation
cc @jorisvandenbossche @TomAugspurger @jreback thoughts on adding something like |
What are your proposed semantics of an EA.tile? If it's restricted to a single dimension, can to be emulated using |
The base class implementation would probably just use _concat_same_type. For ndarray-backed EAs I'd use broadcast_to which doesn't make a copy, but results in a non-writeable array (in that sense it would behave more like broadcast_to than np.tile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should just have EA.tile; this is very messy otherwise.
can you see if we have any issues about this (pivot and tz), I think we might. |
pandas/core/reshape/util.py
Outdated
|
||
def _broadcast_tile(arr: np.ndarray, num: int) -> np.ndarray: | ||
""" | ||
Emulate np.tile but using views instead of copies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an important optimization? Currently we also use np.tile
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For large inputs yes
We could also simplify this a lot by using Regarding tile in the EA interface: numpy also has no tile method, so it's not that it would make it more compatible with numpy arrays (for that, we would need to implement a protocol to make |
…-values_from_object-reshape
This is actually close to orthogonal to the broadcast_to optimization. When I disable the broadcat_to optimization and just use np.tile, this implementation is about 2x faster than one that uses _concat_same_type (based on |
searching on "pivot" didnt show anything tz-related |
Why? It seems the reason you are not using the slower/simpler
This is only for the tile? What does it give in the full pivot operation? |
No, its also because the non-concat_same_type implementation is 2x faster, even without the broadcast_to optimization. For arithmetic I'm soon going to need to broadcast/tile length=1 EAs to length=N, which is very wasteful without the broadcast_to; that is the forward-looking motivation for using the broadcast_to optimization.
No idea. |
…-values_from_object-reshape
Updated to use a simpler implementation that does not do any optimizations. |
looks fine, can you add a release note, ping on green. |
…-values_from_object-reshape
…-values_from_object-reshape
…-values_from_object-reshape
ping |
thxs |
booyah! there goes our last usage of values_from_object |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
This gets rid of the last values_from_object usage (pending other PRs)
_tile_compat
likely makes more sense as an Index method. I kept it here as a proof of concept because I think we actually need it in the EA interface too. Being able to broadcast a length=1 EA to a length=N EA will be necessary for some of the arithmetic perf going on.