ERR: raise on missing values in pd.pivot_table #14965

Dr-Irv · 2016-12-22T20:15:28Z

- closes #14938 (ERR: No Error when values argument in pivot_table is not in df.columns)
- tests added for #14938
- passes git diff upstream/master | flake8 --diff
- whatsnew entry

max-sixty · 2016-12-22T20:30:07Z

pandas/tools/pivot.py

+        # GH14938 Make sure values are in data
+        for i in values:
+            if i not in data:
+                raise KeyError(i)


if set(values) - set(data):

@MaximilianR That won't tell the user which specific values were missing. I'm not sure we want to raise a KeyError whose text lists all of the missing values, since other pandas functions seem to raise a KeyError for the first missing key only.

Good point. Something like this would list them all:

missing_values = set(values) - set(data)
if missing_values:
    raise KeyError('{} missing in [better message]'.format(', '.join(missing_values)))

But if you have a strong view on looping, then OK.
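The two approaches under discussion can be sketched side by side in plain Python. This is only an illustration: the function names `raise_on_first_missing` and `raise_on_all_missing`, the `columns` argument, and the message wording are hypothetical and are not taken from the pandas source.

```python
def raise_on_first_missing(values, columns):
    """Loop-style check: stop at the first label that is not a column."""
    for label in values:
        if label not in columns:
            raise KeyError(label)


def raise_on_all_missing(values, columns):
    """Collect-style check: report every missing label in one error.

    A list comprehension is used instead of a set difference so the
    labels are reported in the order the caller passed them.
    """
    missing = [label for label in values if label not in columns]
    if missing:
        # str() guards against non-string labels such as integers
        raise KeyError('{} missing in columns'.format(
            ', '.join(str(label) for label in missing)))
```

With `values=['a', 'y', 'z']` and `columns=['a', 'b', 'c']`, the first function raises `KeyError('y')`, while the second raises a single `KeyError` whose message names both `y` and `z`.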

IMHO, it's better to be consistent with behavior on the other arguments. For example, with groupby, if you do this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [i % 3 for i in range(10)],
                   'b': [i % 2 for i in range(10)],
                   'c': np.random.randn(10)})
df.groupby(['y', 'z']).sum()

The KeyError is raised for 'y' only; 'z' is never reported. So I think the error raised on the values argument should behave the same way. Otherwise it raises the question of why we don't report all of the missing keys for the other arguments of many other functions.
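The consistency argument can be checked with a short sketch. It assumes a pandas version that already includes this change (0.20.0 or later); the helper `first_missing_key` and the small frame are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 0],
                   'b': [0, 1, 0, 1],
                   'c': [1.0, 2.0, 3.0, 4.0]})


def first_missing_key(func):
    """Call func and return the key carried by the KeyError, if any."""
    try:
        func()
    except KeyError as err:
        return err.args[0]
    return None


# Neither 'y' nor 'z' is a column of df.
groupby_key = first_missing_key(lambda: df.groupby(['y', 'z']).sum())
pivot_key = first_missing_key(
    lambda: pd.pivot_table(df, values=['y', 'z'], index='a'))

print(groupby_key, pivot_key)
```

In both calls only the first missing label is carried by the exception, which is the behavior the loop-based check keeps consistent.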

codecov-io · 2016-12-22T23:57:31Z

Current coverage is 84.66% (diff: 66.66%)

Merging #14965 into master will decrease coverage by <.01%

@@             master     #14965   diff @@
==========================================
  Files           144        144          
  Lines         51043      51046     +3   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43214      43216     +2   
- Misses         7829       7830     +1   
  Partials          0          0

Powered by Codecov. Last update 45910ae...ff4df5a

jreback · 2016-12-23T00:04:22Z

doc/source/whatsnew/v0.20.0.txt

@@ -300,3 +300,4 @@ Bug Fixes
 - Bug in ``Series.unique()`` in which unsigned 64-bit integers were causing overflow (:issue:`14721`)
 - Require at least 0.23 version of cython to avoid problems with character encodings (:issue:`14699`)
 - Bug in converting object elements of array-like objects to unsigned 64-bit integers (:issue:`4471`)
+- Bug in ``pivot_table()`` where no error was raised when values argument was not in df.columns (:issue:`14938`)


Write this as pd.pivot_table().

Say "not in the columns" rather than "not in df.columns".

jreback · 2016-12-23T00:06:12Z

pandas/tools/pivot.py

@@ -106,6 +106,10 @@ def pivot_table(data, values=None, index=None, columns=None, aggfunc='mean',
        else:
            values_multi = False
            values = [values]
+        # GH14938 Make sure values are in data


Add a blank line before the comment.

Say values indexer (or labels), to make clear that these are not actual values but the indexers of the labels.

jreback · 2016-12-23T00:07:26Z

very minor comments. ping on green.

Dr-Irv · 2016-12-23T19:48:13Z

@jreback Changes made as you requested and all green.

jreback · 2016-12-23T20:50:00Z

thanks!

jreback changed the title ~~BUG: Fix #14938~~ ERR: raise on missing values in pd.pivot_table Dec 23, 2016

jreback added the Error Reporting (Incorrect or improved errors from pandas) label Dec 23, 2016

jreback added this to the 0.20.0 milestone Dec 23, 2016

ERR: raise on missing values in pd.pivot_table pandas-dev#14938

ff4df5a

Dr-Irv force-pushed the error14938 branch from c42cd31 to ff4df5a Compare December 23, 2016 18:49

jreback merged commit 8f7ba1b into pandas-dev:master Dec 23, 2016

Dr-Irv deleted the error14938 branch December 23, 2016 20:52

ShaharBental pushed a commit to ShaharBental/pandas that referenced this pull request Dec 26, 2016

ERR: raise on missing values in pd.pivot_table pandas-dev#14938 (pand…

5dcdfbf

…as-dev#14965)
