pivot_table margins bottom-left total does not correspond to other content when dropna=False #14072

Tuzoff · 2016-08-23T15:07:00Z

Code Sample, a copy-pastable example if possible

In:
import pandas as pd

M = pd.DataFrame(
    [[1, 'a', 'A'],
     [1, 'b', 'B'],
     [1, 'c', None]],
    columns=['x', 'y', 'z'])
P = M.pivot_table(values='x', index='y', columns='z', aggfunc='sum',
                  fill_value=0, margins=True, dropna=False)
P

Out:

z	A	B	All
y
a	1.0	0.0	1.0
b	0.0	1.0	1.0
All	1.0	1.0	3.0

Expected Output

Either

z	A	B	All
y
a	1.0	0.0	1.0
b	0.0	1.0	1.0
All	1.0	1.0	2.0

or

z	A	B	None	All
y
a	1.0	0.0	0.0	1.0
b	0.0	1.0	0.0	1.0
c	0.0	0.0	1.0	1.0
All	1.0	1.0	1.0	3.0

depending on what dropna is meant to do in this case.

output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.2.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.1
pip: 8.1.2
setuptools: 22.0.5
Cython: 0.20.1post0
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.0
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

sinhrks · 2016-08-25T21:58:39Z

Thanks for the report. _compute_grand_margin does not seem to take dropna into account. A PR would be appreciated.

https://github.com/pydata/pandas/blob/master/pandas/tools/pivot.py#L239
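
To make the mismatch concrete, here is a minimal pure-Python sketch of the arithmetic. It is not pandas' code. It assumes, based on the comment above, that the grand total is aggregated over the unfiltered input while the cells (and the row and column margins derived from them) only see rows whose column key is not missing. All variable names are illustrative.

```python
# Rows from the report as (x, y, z); z is None in the last row.
rows = [(1, "a", "A"), (1, "b", "B"), (1, "c", None)]

# Cells of the pivot: a row whose column key z is missing never
# reaches the table (assumption matching the reported output).
cells = {}
for x, y, z in rows:
    if z is not None:
        cells[(y, z)] = cells.get((y, z), 0) + x

# Row and column margins derived from the visible cells only.
row_margin, col_margin = {}, {}
for (y, z), value in cells.items():
    row_margin[y] = row_margin.get(y, 0) + value
    col_margin[z] = col_margin.get(z, 0) + value

# Grand total taken from the raw data: includes the z=None row.
grand_from_raw = sum(x for x, _, _ in rows)
# Grand total consistent with the visible margins.
grand_from_cells = sum(cells.values())

print(row_margin, col_margin, grand_from_raw, grand_from_cells)
```

Under these assumptions the raw-data total is 3 while the margins add up to 2, which is the inconsistency shown in the report's bottom-right cell.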

OXPHOS · 2016-09-13T20:33:50Z

@sinhrks Hey, I am thinking about fixing this bug. But I think that with dropna=False the correct output should be the bottom one:

z	A	B	None	All
y
a	1.0	0.0	0.0	1.0
b	0.0	1.0	0.0	1.0
c	0.0	0.0	1.0	1.0
All	1.0	1.0	1.0	3.0

So the bug would be in groupby, as groupby drops the None column. Or do you think the None column should be dropped even with dropna=False?

jorisvandenbossche · 2016-09-13T22:09:51Z

@OXPHOS The dropna keyword controls whether columns whose values are all NaN are kept or dropped (it does not apply to a column name that is itself NaN). So I think this issue (the margin not being correct) should not depend on the value of the dropna kwarg.
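
The distinction drawn here, between a column whose values are all missing and a column whose label is missing, can be sketched in plain Python. The helper names and the column "C" are hypothetical, and this only illustrates the described meaning of dropna. It says nothing about how pandas implements it or how pandas treats a NaN label.

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def drop_all_missing_columns(table):
    """Keep only columns that hold at least one non-missing value."""
    return {label: values for label, values in table.items()
            if not all(is_missing(v) for v in values)}

nan = float("nan")
table = {
    "A": [1.0, nan, nan],
    "B": [nan, 1.0, nan],
    "C": [nan, nan, nan],   # hypothetical column: every value is missing
    None: [nan, nan, 1.0],  # missing label, but the column holds a real value
}

kept = drop_all_missing_columns(table)
print(list(kept))
```

In this sketch only "C" is removed, because the filter looks at values and never at labels. Whether a missing label should survive is the separate question raised in the preceding comment.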

sinhrks added the Bug and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Aug 25, 2016

jreback added this to the Next Major Release milestone Sep 13, 2016

OXPHOS mentioned this issue Sep 18, 2016

Pivot table drops column/index names=nan when dropna=false #14246

Closed

4 tasks

OXPHOS mentioned this issue Apr 26, 2017

BUG:Pivot table drops column/index names=nan when dropna=false #16142

Closed

4 tasks

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
