FIX raise when groupby selecting cols not in frame #6578

Merged
jreback merged 1 commit into pandas-dev:master on May 2, 2014

Conversation

hayd
Contributor

@hayd hayd commented Mar 9, 2014

Addresses the KeyError part of #5264.

@jreback jreback added this to the 0.14.0 milestone Mar 9, 2014
@jreback
Contributor

jreback commented Mar 9, 2014

Rebase on master: setuptools had changed, so the build was failing.

The 2.6 failure is real, though...

@hayd
Contributor Author

hayd commented Mar 9, 2014

lol at the failure, similar to me just claiming it was an issue in 2.6. Silly.

not self.as_index):
if isinstance(key, (list, tuple, Series, np.ndarray)):
    if len(self.obj.columns.intersection(key)) != len(key):
        bad_keys = list(set(key).difference(self.obj.columns))
Contributor Author

@jreback Thinking about this, do you know if there is a helper function in loc already that might be better than this hack?

@jreback
Contributor

jreback commented Mar 9, 2014

Maybe slightly more intuitive, but yours is pretty self-explanatory:

diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index f0ee01d..fce8947 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -2579,8 +2579,8 @@ class DataFrameGroupBy(NDFrameGroupBy):
             raise Exception('Column(s) %s already selected' % self._selection)

         if isinstance(key, (list, tuple, Series, np.ndarray)):
-            if len(self.obj.columns.intersection(key)) != len(key):
-                bad_keys = list(set(key).difference(self.obj.columns))
+            if self.obj.columns.isin(key).sum() != len(key):
+                bad_keys = Index(key)-self.obj.columns
                 raise KeyError("Columns not found: %s"
                                % str(bad_keys)[1:-1])
             return DataFrameGroupBy(self.obj, self.grouper, selection=key,

@hayd
Contributor Author

hayd commented Mar 9, 2014

Subtle improvement is that yours doesn't allow for repeats... which are not supported... should they be? :s (rabbit hole?)

@jreback
Contributor

jreback commented Mar 9, 2014

hmm

I suppose that should technically be allowed
so go with yours

@hayd
Contributor Author

hayd commented Mar 10, 2014

No methods work at the moment (as you're reindexing by a dupe), will have a little think.

@jreback
Contributor

jreback commented Mar 10, 2014

The reindexing could be fixed, but I think you are right, it opens a big can of worms. So please add a check and a test for dups (just raise ValueError) for now.
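
A rough sketch of what such a check could look like, assuming plain Python lists stand in for the frame's columns and the selection. The function name `validate_selection`, the wording of the duplicate message, and the order of the two checks are all assumptions made for illustration; this is not the code that was merged.

```python
def validate_selection(columns, key):
    """Validate a column selection before building the grouped object.

    Hypothetical stand-in for the check discussed in this thread:
    missing columns raise KeyError, repeated columns raise ValueError.
    """
    known = set(columns)
    # Sort so the error message is deterministic (assumes comparable labels).
    bad_keys = sorted(set(key) - known)
    if bad_keys:
        raise KeyError("Columns not found: %s" % str(bad_keys)[1:-1])
    # A repeated label makes the selection longer than its set of labels.
    if len(set(key)) != len(key):
        raise ValueError("Duplicate columns selected: %r" % (list(key),))
    return list(key)


print(validate_selection(["A", "B", "C"], ["A", "C"]))  # ['A', 'C']
```

Raising on duplicates up front keeps the failure close to the user's call instead of surfacing later as a reindexing error.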

@hayd
Contributor Author

hayd commented Mar 10, 2014

Now that I think about it, I have not tested axis=1... at all (I'm guessing this won't work; I could just raise as NotImplemented) :S

@jreback
Contributor

jreback commented Apr 5, 2014

more tests on this for axis=1? release notes?

@jreback
Contributor

jreback commented Apr 10, 2014

ping!

@jreback
Contributor

jreback commented Apr 21, 2014

ping

@jreback
Contributor

jreback commented Apr 27, 2014

@hayd ping....need to get this in ASAP

@jreback
Contributor

jreback commented Apr 28, 2014

this looks fine. anything else on this? give ok and I'll merge it (I want to update the release note a bit)

@hayd
Contributor Author

hayd commented Apr 29, 2014

Will have some time for this late tomorrow (and Wed and Thurs). Sorry, I've been swamped recently; I need to pull my finger out on the many PRs lined up mañana.

I see the point on axis=1; I'm worried the machinery is different... and I have not played with this.

@jreback
Contributor

jreback commented May 1, 2014

ping!

jreback added a commit that referenced this pull request May 2, 2014
FIX raise when groupby selecting cols not in frame
@jreback jreback merged commit 2d876be into pandas-dev:master May 2, 2014
2 participants