unstack() on DataFrame of ints with MultiIndex which was sliced casts to float #17845

Closed
toobaz opened this issue Oct 11, 2017 · 2 comments · Fixed by #18460
Labels
Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Comments

@toobaz
Member

toobaz commented Oct 11, 2017

Code Sample, a copy-pastable example if possible

In [2]: idx = pd.MultiIndex.from_product([['a'], ['A', 'B', 'C', 'D']])[:-1]

In [3]: df = pd.DataFrame([[1, 0]]*3, index=idx)

In [4]: df.unstack()
Out[4]: 
     0              1          
     A    B    C    A    B    C
a  1.0  1.0  1.0  0.0  0.0  0.0

Problem description

Since there is no missing value, no type casting should occur. Notice that even replacing index=idx with index=idx.copy(deep=True) doesn't help (something which would maybe require a fix on its own).
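One thing that may be worth checking here, as an assumption rather than a confirmed diagnosis: slicing a MultiIndex keeps the original level values, so 'D' is still listed in idx.levels[1] even though no row uses it. The sketch below inspects the levels and retries unstack() after MultiIndex.remove_unused_levels(). Whether the leftover level is what triggers the float cast on the affected version is not verified here.

```python
import pandas as pd

# Same sliced index as in the report: the ('a', 'D') row is dropped.
idx = pd.MultiIndex.from_product([["a"], ["A", "B", "C", "D"]])[:-1]

# The level values are not trimmed by slicing, so 'D' is still listed.
print(list(idx.levels[1]))

# remove_unused_levels() returns a new index without the unused 'D'.
trimmed = idx.remove_unused_levels()
print(list(trimmed.levels[1]))

# Retry the unstack on the trimmed index and inspect the dtypes.
df = pd.DataFrame([[1, 0]] * 3, index=trimmed)
result = df.unstack()
print(result.dtypes)
```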

Expected Output

In [5]: df.unstack(fill_value=3)
Out[5]: 
   0        1      
   A  B  C  A  B  C
a  1  1  1  0  0  0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.21.0.dev+572.g8e89cb3e1
pytest: 3.0.6
pip: 9.0.1
setuptools: None
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 5.1.0.dev
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

@gfyoung gfyoung added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Oct 12, 2017
@jreback
Contributor

jreback commented Oct 13, 2017

IIRC there are several related / similar issues to this; see if you can find them. Yes, in this case you have a full Cartesian product so you know there are no missing values, but generally this is not true. Note that passing fill_value works (though it is not as satisfying):

In [33]: df.unstack(fill_value=0)
Out[33]: 
   0        1      
   A  B  C  A  B  C
a  1  1  1  0  0  0
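To illustrate the general case mentioned above, here is a minimal sketch with a hypothetical index that is not a full Cartesian product (the frame and the column name "x" are made up for illustration). The missing ('b', 'B') combination has to be filled with something, so a plain unstack() introduces NaN, while fill_value=0 lets the integers survive.

```python
import pandas as pd

# Hypothetical index with a genuine hole: there is no ('b', 'B') row.
idx = pd.MultiIndex.from_tuples([("a", "A"), ("a", "B"), ("b", "A")])
df = pd.DataFrame({"x": [1, 2, 3]}, index=idx)

# Without fill_value the hole becomes NaN, which needs a float column.
with_nan = df.unstack()
print(with_nan)

# With an integer fill_value the hole is filled and no cast is needed.
filled = df.unstack(fill_value=0)
print(filled)
print(filled.dtypes)
```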

@jreback jreback added the Dtype Conversions Unexpected or buggy dtype conversions label Oct 13, 2017
@jreback jreback added this to the 0.22.0 milestone Nov 24, 2017
@saten25

saten25 commented Apr 20, 2020

Hi Everyone,

I think I am facing a similar kind of issue.
I have created a Django utility that uses pandas profiling.

The steps I am doing in Python are:
1- Uploading a file
df = pd.read_csv(path)

2- Changing the datatypes of columns from object to category
df = pd.concat([df.select_dtypes(exclude=['object']), df.select_dtypes(['object']).apply(pd.Series.astype, dtype='category')], axis=1).reindex(df.columns, axis=1)

3- Separating continuous and categorical variables from the dataframe and sending this information to a multiselect dropdown in HTML

4- As per the user's selection from the continuous and categorical dropdowns, storing the selections in the form of a list.
context['selectedContinousValues'] = request.POST.getlist('continous')
context['selectedCategoricalValues'] = request.POST.getlist('categorical')

5- Combining both lists and keeping only those columns from df.
Profiling_variables=context['selectedContinousValues']+context['selectedCategoricalValues']
print(Profiling_variables)
profile=df[Profiling_variables]

6- Using this selected df in pandas profiling
profile_final = ProfileReport(profile)

Error: The error occurs when profile_final contains all categorical variables. I am using pandas profiling version 2.6.
Please note that the error does not occur in the following cases:
1- Selecting categorical variables only.
2- If we do not change the datatypes of columns from object to category. I added this step because I wanted to include functionality where the user could convert selected continuous variables to categorical variables. I tried changing them from continuous to object, but as soon as such an object variable (e.g. a variable with values like 1,2,3,4,1,2,3,4,1,2,3,4 that I want to treat as object) is used by pandas profiling, it is treated as numeric in the output, whereas I want it treated as categories.

Error message:

Request Method: POST
Request URL: http://localhost:8000/selected-data/
Django Version: 3.0.5
Exception Type: AssertionError
Exception Value: Gaps in blk ref_locs
Exception Location: C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in _rebuild_blknos_and_blklocs, line 231
Python Executable: C:\Users\srana12\AppData\Local\Continuum\anaconda3\python.exe
Python Version: 3.7.4
Python Path: ['C:\Users\srana12\Desktop\EDA Pandas Profiling V4\file-upload', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\python37.zip', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\DLLs', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\django_crispy_forms-1.9.0-py3.7.egg', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\win32', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\win32\lib', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\Pythonwin', 'C:\Users\srana12\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\extensions']
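For the conversion step described in this comment (object columns to category, plus numeric columns the user wants treated as categories), a simpler formulation might look like the sketch below. The frame, the column names, and the user_selected list are hypothetical stand-ins for the uploaded CSV and the form input, and this is not claimed to avoid the AssertionError above; it only shows a way to do the dtype change with a single astype call instead of splitting and re-joining the frame.

```python
import pandas as pd

# Hypothetical stand-in for the uploaded CSV.
df = pd.DataFrame({
    "city": pd.Series(["x", "y", "x", "y"], dtype=object),
    "code": [1, 2, 3, 4],
    "amount": [1.5, 2.5, 3.5, 4.5],
})

# Columns currently stored as object.
object_cols = list(df.select_dtypes(include=["object"]).columns)

# Hypothetical user choice: numeric columns to be treated as categories.
user_selected = ["code"]

# One astype call converts both groups and keeps the column order.
to_category = object_cols + user_selected
df = df.astype({col: "category" for col in to_category})
print(df.dtypes)
```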


4 participants