-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: upcasting on reshaping ops #13247 #13337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
for dt in (np.sctypes['int'] + np.sctypes['uint']): | ||
dfs = [pdt(np.array([1], dtype=dt, ndmin=dims)), | ||
pdt(np.array([5], dtype=dt, ndmin=dims))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some tests that have mixed dtypes in the input as well (e.g. mixed float32/64 and int types)
@jreback: nice catch, I did some manual testing of that before, and it passed... but extended testing yields different answers for these cases. What is the "right" thing to do here? Current behavior: Series/DataFrame:
Panel/Panel4D:(no nans) I think the right choice would be |
Current coverage is 84.23%@@ master #13337 diff @@
==========================================
Files 138 138
Lines 50712 50720 +8
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 42714 42720 +6
- Misses 7998 8000 +2
Partials 0 0
|
I think any int/float casting can simply go to float64. Yes you can get creative, e.g. I think |
I updated the PR... but see below: OK, so for the panel case, it seems like we go into mixed types ( So in order to really fix it, one has to look at The function here: https://github.com/pydata/pandas/blob/4de83d25d751d8ca102867b2d46a5547c01d7248/pandas/core/internals.py#L4390-L4449 , takes in float and int blocks, and then falls through to the last line:
The above lead to this change in the PR:
I was also able to get rid of multiple ifs (the ones you wanted to move to a function), so the code is a bit more cleaner. The side effect is that an empty DataFrame's dtype is object, instead of float64 (in some cases). I am guessing that we should be consistent and perhaps always have a type
|
@jennolsen84 |
@@ -4443,6 +4444,8 @@ def _lcd_dtype(l): | |||
return np.dtype('int%s' % (lcd.itemsize * 8 * 2)) | |||
return lcd | |||
|
|||
elif have_int and have_float and not have_complex: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
side issue: may want to use np.common_type
here to return lcd of the itemsize for float.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.types.cast._find_common_type
I think will work for this now
can you rebase / update? |
I have hit this bug through Dask DataFrame repartition, so I would love to see this issue fixed. Updated: I have tried to install pandas from the fork and it worked for me! Thank you! |
@@ -43,7 +43,7 @@ Backwards incompatible API changes | |||
.. _whatsnew_0190.api: | |||
|
|||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.20.0
@jennolsen84 thanks for this. too many open PR's and haven't gone thru in a while. if you can update we can merge soon. |
@jennolsen84 can you rebase |
1 similar comment
@jennolsen84 can you rebase |
Hi, I rebased this code to latest commit on master, but there were merge conflicts. So, I rewrote them and I want to do a pr on this page. |
@jaehoonhwang what you do is branch off of current master. Then pull in these commits, fixing any conflicts. You then push to your branch and treat like any other fix. The only trick here is to cherry-pick the original commits off of this PR. |
@jaehoonhwang You cannot replace others PR with yours, so just create a new PR and reference this PR in it. |
closing in favor of #15594 (rebase of this PR). |
Original work done by @jennolsen84, in pandas-dev#13337 closes pandas-dev#13247 Author: Jaehoon Hwang <[email protected]> Author: Jae <[email protected]> Closes pandas-dev#15594 from jaehoonhwang/Bug13247 and squashes the following commits: 3cd1734 [Jaehoon Hwang] Pass the non-related tests in test_partial and test_reshape 1fa578b [Jaehoon Hwang] Applying request changes removing unnecessary test and renameing 6744636 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 5bb72c7 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 a1d5d40 [Jaehoon Hwang] Completed pytest 8122359 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 0e52b74 [Jaehoon Hwang] Working: Except for pytest 8fec07c [Jaehoon Hwang] Fix: test_concat.py and internals.py 4f6c03e [Jaehoon Hwang] Fix: is_float_dtypes and is_numeric_dtype wrong place d3476c0 [Jaehoon Hwang] Merge branch 'master' into Bug13247 b977615 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' 4b1e5c6 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 45f7ae9 [Jaehoon Hwang] Added pytest function 468baee [Jae] BUG: upcasting on reshaping ops pandas-dev#13247
Original work done by @jennolsen84, in pandas-dev#13337 closes pandas-dev#13247 Author: Jaehoon Hwang <[email protected]> Author: Jae <[email protected]> Closes pandas-dev#15594 from jaehoonhwang/Bug13247 and squashes the following commits: 3cd1734 [Jaehoon Hwang] Pass the non-related tests in test_partial and test_reshape 1fa578b [Jaehoon Hwang] Applying request changes removing unnecessary test and renameing 6744636 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 5bb72c7 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 a1d5d40 [Jaehoon Hwang] Completed pytest 8122359 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 0e52b74 [Jaehoon Hwang] Working: Except for pytest 8fec07c [Jaehoon Hwang] Fix: test_concat.py and internals.py 4f6c03e [Jaehoon Hwang] Fix: is_float_dtypes and is_numeric_dtype wrong place d3476c0 [Jaehoon Hwang] Merge branch 'master' into Bug13247 b977615 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' 4b1e5c6 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 45f7ae9 [Jaehoon Hwang] Added pytest function 468baee [Jae] BUG: upcasting on reshaping ops pandas-dev#13247
Original work done by @jennolsen84, in pandas-dev#13337 closes pandas-dev#13247 Author: Jaehoon Hwang <[email protected]> Author: Jae <[email protected]> Closes pandas-dev#15594 from jaehoonhwang/Bug13247 and squashes the following commits: 3cd1734 [Jaehoon Hwang] Pass the non-related tests in test_partial and test_reshape 1fa578b [Jaehoon Hwang] Applying request changes removing unnecessary test and renameing 6744636 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 5bb72c7 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 a1d5d40 [Jaehoon Hwang] Completed pytest 8122359 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 0e52b74 [Jaehoon Hwang] Working: Except for pytest 8fec07c [Jaehoon Hwang] Fix: test_concat.py and internals.py 4f6c03e [Jaehoon Hwang] Fix: is_float_dtypes and is_numeric_dtype wrong place d3476c0 [Jaehoon Hwang] Merge branch 'master' into Bug13247 b977615 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' 4b1e5c6 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247 45f7ae9 [Jaehoon Hwang] Added pytest function 468baee [Jae] BUG: upcasting on reshaping ops pandas-dev#13247
git diff upstream/master | flake8 --diff