Skip to content

ENH: drop function now has errors keyword for non-existing column handling #6736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2015

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Mar 29, 2014

Closes #5300.

Currently drop raises ValueError when non-existing label is passed. I think it is useful if drop has an option to suppress error and drop existing labels only.

For example, I sometimes process lots of files which has slightly different columns, and want to drop if data has unnecessary columns. Previously, I have to prepare different drop arguments checking columns existing each data.

@jreback
Copy link
Contributor

jreback commented Mar 29, 2014

this could be added on after #6599
@hayd

@@ -6762,6 +6762,26 @@ def test_drop_names(self):
self.assertEqual(obj.columns.name, 'second')
self.assertEqual(list(df.columns), ['d', 'e', 'f'])

self.assertRaises(ValueError, df.drop, ['g'])
self.assertRaises(ValueError, df.drop, ['g'], 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's give at least one of these an errors='raise' argument (and add it to an existing test that doesn't raise) just for completeness.

@jtratner
Copy link
Contributor

This all looks fine to me - thanks for the patch.

We might want to reduce the number of test cases (just because we don't necessarily need to duplicate), but aside from that, quite good. - somebody else want to take a look too? I may be a bit rusty :P

@hayd
Copy link
Contributor

hayd commented Mar 30, 2014

Looks all good to me. But let's wait for #6599. :)

@jreback jreback added this to the 0.14.0 milestone Mar 30, 2014
@jreback
Copy link
Contributor

jreback commented Mar 30, 2014

yep....so @hayd will revisit after that #6599 merge

@jreback
Copy link
Contributor

jreback commented Jan 25, 2015

@sinhrks this idea is ok, can you rebase

@sinhrks
Copy link
Member Author

sinhrks commented Jan 26, 2015

@jreback Yes, rebased.

@@ -2226,7 +2228,9 @@ def drop(self, labels):
indexer = self.get_indexer(labels)
mask = indexer == -1
if mask.any():
raise ValueError('labels %s not contained in axis' % labels[mask])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, shouldn't this raise a KeyError? @jorisvandenbossche @shoyer to be consistent with say .loc and other indexers? @hayd

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of a gray area to me. We're not doing actual indexing here. I would probably stick with ValueError, especially to remain consistent with the existing implementation.

@sinhrks sinhrks force-pushed the drop branch 2 times, most recently from 44421ca to 58b4a4d Compare February 15, 2015 03:29
@jreback jreback modified the milestones: 0.16.1, 0.16.0 Mar 5, 2015
@jreback
Copy link
Contributor

jreback commented Mar 5, 2015

have to think about this

@sinhrks sinhrks force-pushed the drop branch 2 times, most recently from fc6794d to 3fdca30 Compare April 1, 2015 12:05
@jreback
Copy link
Contributor

jreback commented Apr 4, 2015

@sinhrks can you rebase this.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 4, 2015

Sure, rebased.

jreback added a commit that referenced this pull request Apr 8, 2015
ENH: drop function now has errors keyword for non-existing column handling
@jreback jreback merged commit a4ae0cf into pandas-dev:master Apr 8, 2015
@jreback
Copy link
Contributor

jreback commented Apr 8, 2015

@sinhrks thanks!

@sinhrks sinhrks deleted the drop branch April 11, 2015 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More tolerant dataframe drop method for multiple columns deletion
5 participants