-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix tzaware dataframe transpose bug #26825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #26825 +/- ##
==========================================
- Coverage 92.01% 92% -0.01%
==========================================
Files 180 180
Lines 50754 50766 +12
==========================================
+ Hits 46699 46708 +9
- Misses 4055 4058 +3
Continue to review full report at Codecov.
|
@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False): | |||
# on the entire block; this is to convert if we have datetimelike's | |||
# embedded in an object type | |||
if dtype is None and is_object_dtype(values): | |||
values = maybe_infer_to_datetimelike(values) | |||
|
|||
if values.ndim == 2 and values.shape[0] != 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is much more messy, can we change something else to make this nicer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. I'm looking into the other places where maybe_infer_to_datetimelike is used to see if some of this can go into that. We could separate this whole block into a dedicated function. But one way or another we need to bite the bullet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to leave this for the next pass when I'm taking a more systematic look at maybe_infer_to_datetimelike
Updated with requested edit, GH references in tests, and whatsnew. |
pandas/core/groupby/generic.py
Outdated
@@ -1664,3 +1655,36 @@ def _normalize_keyword_aggregation(kwargs): | |||
order.append((column, | |||
com.get_callable_name(aggfunc) or aggfunc)) | |||
return aggspec, columns, order | |||
|
|||
|
|||
def _recast_datetimelike_result(result): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i would put this in pandas.core.dtypes.cast, our dumping ground for casting, right after / before maybe_infer_to-datetimelike (and you can de-privatize)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really kludgy code (and is replacing equally kludgy code; kind of like turtles all the way down). I'd rather keep it close to its only usage and hope it is ripped out by someone who knows this part of the code better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as i said this look a lot like maybe_convert_objects, try to use that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also: we only have 5 tests that go through this path
|
||
Notes | ||
----- | ||
- Assumes Groupby._selected_obj has ndim==2 and at least one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this note doesn't seem relevant as you are passing in frame right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wasn't obvious to me bc were talking about the dimensions of two separate objects. Are they necessarily the same?
@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False): | |||
# on the entire block; this is to convert if we have datetimelike's | |||
# embedded in an object type | |||
if dtype is None and is_object_dtype(values): | |||
values = maybe_infer_to_datetimelike(values) | |||
|
|||
if values.ndim == 2 and values.shape[0] != 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)
|
||
from pandas.core.internals.blocks import make_block | ||
|
||
# TODO: What about re-joining object columns? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls reuse the block creation routines below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attempts so far have broken everything. do you have a particular routine in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.
if .....
blocks = bvals
else:
dvals = .......
blocks = [dvals]
of course pls use a longer name than dvals
# unnecessary if we ever allow 2D DatetimeArray | ||
|
||
dvals_list = [maybe_infer_to_datetimelike(values[n, :]) | ||
for n in range(len(values))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a list comprehension & use enumerate if you must., this is very hard to read.
@jreback anything else here actionable? |
haven’t had a chance to relook |
|
||
from pandas.core.internals.blocks import make_block | ||
|
||
# TODO: What about re-joining object columns? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.
if .....
blocks = bvals
else:
dvals = .......
blocks = [dvals]
of course pls use a longer name than dvals
I think comments have been addressed. |
This allows us to get rid of a bunch of xfails and FIXMEs in arithmetic tests. Changes in groupby are the most likely to need attention; this is not an area of the code I know well.
Not sure if TestTranspose belongs somewhere else. Suggestions welcome.
Split a couple of too-widely-scoped groupby tests.
Will follow-up with issues this closes, whatsnew entry, and GH references added to tests.
xref #23988
git diff upstream/master -u -- "*.py" | flake8 --diff