
BUG: upcasting on reshaping ops #13247


Closed
jennolsen84 opened this issue May 21, 2016 · 6 comments
Labels: Bug, Dtype Conversions, Missing-data, Reshaping

@jennolsen84
Contributor

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

pn1 = pd.Panel(np.array([1.0], dtype=np.float32, ndmin=3)) # panel of 1.0
pn2 = pd.Panel(np.array([np.nan], dtype=np.float32, ndmin=3)) # panel of nan

print(pd.concat([pn1, pn1]).values.dtype)
print(pd.concat([pn2, pn2]).values.dtype)

Current Output

float32
float64

Expected Output

float32
float32

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.0.1
sphinx: 1.3.1
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.1.0dev
tables: 3.2.2
numexpr: 2.5.1
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.0.12
pymysql: 0.6.7.None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: 2.8
boto: None
pandas_datareader: None
sinhrks added the Bug, Reshaping, and Dtype Conversions labels May 21, 2016
@sinhrks
Member

sinhrks commented May 21, 2016

Thanks for the report. Although Panel is being replaced by xarray in the near future, a PR would be appreciated.

@jennolsen84
Contributor Author

@sinhrks same behavior with DataFrame, is that going away too?

@jennolsen84
Contributor Author

In [7]: pn1 = pd.DataFrame(np.array([1.0], dtype=np.float32, ndmin=2)) # df of 1.0

In [8]: pn2 = pd.DataFrame(np.array([np.nan], dtype=np.float32, ndmin=2)) # df of nan

In [9]: print(pd.concat([pn2, pn2]).values.dtype)
float64

In [10]: print(pd.concat([pn1, pn1]).values.dtype)
float32
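
The DataFrame check above can be wrapped into a small script to see what a given pandas installation does. This is only a sketch: the helper name `concat_dtype` is made up for illustration, and the dtype reported for the NaN case depends on the pandas version (this report was filed against 0.18.1, and the issue was later closed by a fix targeted at 0.20.0).

```python
import numpy as np
import pandas as pd


def concat_dtype(fill_value):
    """Concatenate two 1x1 float32 frames holding fill_value; return the result dtype.

    Hypothetical helper for illustration, not part of pandas.
    """
    df = pd.DataFrame(np.array([fill_value], dtype=np.float32, ndmin=2))
    return pd.concat([df, df]).values.dtype


dtype_plain = concat_dtype(1.0)    # frames of 1.0
dtype_nan = concat_dtype(np.nan)   # frames of NaN

# The bug described in this issue is present if the two dtypes differ.
print(pd.__version__, dtype_plain, dtype_nan)
```

If both printed dtypes are float32, the installed version does not show the upcast reported here.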

@jreback
Contributor

jreback commented May 21, 2016

Yeah, I suspect the null forces an immediate upcast to float64. The concat logic should be cognizant that floats can hold NaN at their native itemsize, so it should pick a float size that can accommodate the inputs rather than jumping straight to float64.

I don't think this is very hard to fix in a general way.

care to take a stab?
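
To illustrate the point about native itemsize: NaN is representable in every IEEE floating-point width, so nothing about the data itself requires float64. The NumPy-only sketch below shows this at the array level. It does not touch pandas internals, and the variable names are illustrative only.

```python
import numpy as np

# A float32 array that contains only NaN.
nan32 = np.array([np.nan], dtype=np.float32)

# Plain NumPy concatenation keeps the input dtype; no upcast is needed for NaN.
joined = np.concatenate([nan32, nan32])
print(joined.dtype)              # float32
print(np.isnan(joined).all())    # True

# Even half precision has a NaN value.
nan16 = np.float16(np.nan)
print(np.isnan(nan16))           # True
```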

jreback added the Missing-data and Difficulty Intermediate labels May 21, 2016
jreback added this to the Next Major Release milestone May 21, 2016
jreback changed the title from "Inconsistent pd.concat behavior" to "BUG: upcasting on reshaping ops" May 21, 2016
@jennolsen84
Contributor Author

@jreback how's this for starters? jennolsen84@3b3797a . I can add the tests, whatsnew, etc. if it looks good to you.

I did some tests and it seems to work for me for float types. I did a quick check with ints, and that seemed to work just fine as well (I am guessing it just uses np rules there).

I also checked concatenating int32 + float16; in that case the result is float64. One might actually want float64 there, so I left that alone.
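
For reference, NumPy's own promotion rules give the same answer for the int32 + float16 case, which is consistent with the guess above that NumPy rules are being used. The sketch below only queries `np.promote_types`; whether pandas consults it on this code path is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Pairs of dtypes to promote; the first is the case discussed above.
pairs = [
    (np.int32, np.float16),
    (np.int16, np.float16),
    (np.float32, np.float16),
]

promoted = {}
for left, right in pairs:
    # promote_types returns the smallest dtype both inputs can be safely cast to.
    result = np.promote_types(left, right)
    promoted[(left.__name__, right.__name__)] = result
    print(left.__name__, "+", right.__name__, "->", result)
```

float16 and float32 cannot represent every int32 value exactly, which is why that pair promotes all the way to float64.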

@jreback
Contributor

jreback commented May 31, 2016

@jennolsen84 some comments. Ideally make this as general as possible, but the type stuff is a bit all over the place right now. #13147 should clean this up a bit more (though that is independent of this).

jennolsen84 added a commit to jennolsen84/pandas that referenced this issue Jun 1, 2016
jaehoonhwang added a commit to jaehoonhwang/pandas that referenced this issue Mar 6, 2017
Only rebasing and fixing the merge conflicts
Original work done by: jennolsen84
Original branch: https://github.com/jennolsen84/pandas/tree/concatnan
jreback modified the milestones: 0.20.0, Next Major Release Mar 14, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
Original work done by @jennolsen84, in pandas-dev#13337

closes pandas-dev#13247

Author: Jaehoon Hwang <[email protected]>
Author: Jae <[email protected]>

Closes pandas-dev#15594 from jaehoonhwang/Bug13247 and squashes the following commits:

3cd1734 [Jaehoon Hwang] Pass the non-related tests in test_partial and test_reshape
1fa578b [Jaehoon Hwang] Applying request changes removing unnecessary test and renameing
6744636 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247
5bb72c7 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247
a1d5d40 [Jaehoon Hwang] Completed pytest
8122359 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247
0e52b74 [Jaehoon Hwang] Working: Except for pytest
8fec07c [Jaehoon Hwang] Fix: test_concat.py and internals.py
4f6c03e [Jaehoon Hwang] Fix: is_float_dtypes and is_numeric_dtype wrong place
d3476c0 [Jaehoon Hwang] Merge branch 'master' into Bug13247
b977615 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master'
4b1e5c6 [Jaehoon Hwang] Merge remote-tracking branch 'pandas-dev/master' into Bug13247
45f7ae9 [Jaehoon Hwang] Added pytest function
468baee [Jae] BUG: upcasting on reshaping ops pandas-dev#13247
mattip pushed a commit to mattip/pandas that referenced this issue Apr 3, 2017
3 participants