-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: drop function now has errors keyword for non-existing column handling #6736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -6762,6 +6762,26 @@ def test_drop_names(self): | |||
self.assertEqual(obj.columns.name, 'second') | |||
self.assertEqual(list(df.columns), ['d', 'e', 'f']) | |||
|
|||
self.assertRaises(ValueError, df.drop, ['g']) | |||
self.assertRaises(ValueError, df.drop, ['g'], 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's give at least one of these an errors='raise'
argument (and add it to an existing test that doesn't raise) just for completeness.
This all looks fine to me - thanks for the patch. We might want to reduce the number of test cases (just because we don't necessarily need to duplicate), but aside from that, quite good. - somebody else want to take a look too? I may be a bit rusty :P |
Looks all good to me. But let's wait for #6599. :) |
@sinhrks this idea is ok, can you rebase |
@jreback Yes, rebased. |
@@ -2226,7 +2228,9 @@ def drop(self, labels): | |||
indexer = self.get_indexer(labels) | |||
mask = indexer == -1 | |||
if mask.any(): | |||
raise ValueError('labels %s not contained in axis' % labels[mask]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, shouldn't this raise a KeyError
? @jorisvandenbossche @shoyer to be consistent with say .loc
and other indexers? @hayd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a gray area to me. We're not doing actual indexing here. I would probably stick with ValueError
, especially to remain consistent with the existing implementation.
44421ca
to
58b4a4d
Compare
have to think about this |
fc6794d
to
3fdca30
Compare
@sinhrks can you rebase this. |
Sure, rebased. |
ENH: drop function now has errors keyword for non-existing column handling
@sinhrks thanks! |
Closes #5300.
Currently
drop
raisesValueError
when non-existing label is passed. I think it is useful ifdrop
has an option to suppress error and drop existing labels only.For example, I sometimes process lots of files which has slightly different columns, and want to
drop
if data has unnecessary columns. Previously, I have to prepare differentdrop
arguments checking columns existing each data.