-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: Unify _set_noconvert_dtype_columns for parsers #39365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/io/parsers/base_parser.py
Outdated
@@ -546,6 +548,65 @@ def _convert_to_ndarrays( | |||
print(f"Filled {na_count} NA values in column {c!s}") | |||
return result | |||
|
|||
def _set_noconvert_dtype_columns(self, col_indices, names): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type this in any way? (esp the return value) and add a doc-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pandas/io/parsers/base_parser.py
Outdated
if self.usecols_dtype == "integer": | ||
# A set of integers will be converted to a list in | ||
# the correct order every single time. | ||
usecols = list(self.usecols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> sorted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Safe-sort, because could be mixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no could not, sorry. Used sorted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. if you can try that suggestion and if it works ping (or comment that its a nogo)
pandas/io/parsers/base_parser.py
Outdated
|
||
# pandas\io\parsers.py:2030: error: Incompatible types in | ||
# assignment (expression has type "None", variable has type | ||
# "List[Any]") [assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think if you predeclare at the top
usecols: Optional[List[Any]] = []
you can remove the ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this works, typed it a bit more specific.
I've got #39342 for the mypy errors in there, will rebase when this is merged
parse_dates=[1], | ||
usecols=[1, 2], | ||
thousands="-", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bug yes? can you add a whatsnew note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, added
@jreback green |
thanks ! |
I unified the code which led to a bugfix. The test failed for the python parser case previously