-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Better error message if usecols doesn't match columns #17310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
b0e102a
15d4786
841a6cc
dced1b7
5bf89a8
ba93833
8a06cee
1afb4c1
5a8a852
2209eae
93185f5
5dfccdb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1662,14 +1662,22 @@ def __init__(self, src, **kwds): | |
# GH 14671 | ||
if (self.usecols_dtype == 'string' and | ||
not set(usecols).issubset(self.orig_names)): | ||
raise ValueError("Usecols do not match names.") | ||
missing = [c for c in usecols if c not in self.orig_names] | ||
raise ValueError( | ||
"Usecols do not match columns, " | ||
"columns expected but not found: %s" % missing | ||
) | ||
|
||
if len(self.names) > len(usecols): | ||
self.names = [n for i, n in enumerate(self.names) | ||
if (i in usecols or n in usecols)] | ||
|
||
if len(self.names) < len(usecols): | ||
raise ValueError("Usecols do not match names.") | ||
missing = [c for c in usecols if c not in self.orig_names] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In this case, it should be checking for membership in |
||
raise ValueError( | ||
"Usecols do not match columns, " | ||
"columns expected but not found: %s" % missing | ||
) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same |
||
|
||
self._set_noconvert_columns() | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -481,7 +481,7 @@ def test_raise_on_usecols_names_mismatch(self): | |
data = 'a,b,c,d\n1,2,3,4\n5,6,7,8' | ||
|
||
if self.engine == 'c': | ||
msg = 'Usecols do not match names' | ||
msg = "do not match columns, columns expected but not found" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be good to have a stronger check for which columns are listed (both in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, is this non-helpful error an issue when you pass in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree on the test, it'll require a minor reformat of the structure of the test - if that's not an issue happy to go ahead (unless wildcarding the regex for the arguments is sufficient). Will see if it's also an issue on the Python engine and if so will implement a fix! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer that you explicitly put in the string which elements are missing. If that's what you're going for, then go for it! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gfyoung I've just tested with the Python engine, and the error is:
Would we want the error to be consistent between both engines? If yes, would we prefer the proposed or existing format? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Your error message is better. Let's use it for both engines. |
||
else: | ||
msg = 'is not in list' | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use
.format(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, should we be using .format everywhere?
If yes there's a lot of places where we don't currently do that, I'd be happy to start changing some of those over too (in a seperate PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#16130 (it's being worked on in chunks; feel free to pitch in)