-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF/CLN: pandas/io/parsers.py #36852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
big picture: whats the goal? |
Extract some methods to make any further refactoring easier. |
pandas/io/parsers.py
Outdated
if result.get(arg, depr_default) != depr_default: | ||
depr_warning += msg + "\n\n" | ||
msg = ( | ||
f"The {repr(arg)} argument has been deprecated and will be " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isnt repr
here redundant?
Is there further refactoring you have in mind? |
pandas/io/parsers.py
Outdated
@@ -951,6 +903,61 @@ def __init__(self, f, engine=None, **kwds): | |||
def close(self): | |||
self._engine.close() | |||
|
|||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use staticmethods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to module level
pandas/io/parsers.py
Outdated
kwds[param] = dialect_val | ||
return kwds | ||
|
||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
pandas/io/parsers.py
Outdated
return kwds | ||
|
||
@staticmethod | ||
def _validate_skipfooter(kwds): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pandas/io/parsers.py
Outdated
@@ -951,6 +903,61 @@ def __init__(self, f, engine=None, **kwds): | |||
def close(self): | |||
self._engine.close() | |||
|
|||
@staticmethod | |||
def _refresh_kwargs_based_on_dialect(kwds, dialect): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type and add a doc-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pandas/io/parsers.py
Outdated
self._engine_specified = kwds.get("engine_specified", engine_specified) | ||
|
||
if kwds.get("dialect") is not None: | ||
dialect = kwds["dialect"] | ||
if dialect in csv.list_dialects(): | ||
dialect = csv.get_dialect(dialect) | ||
kwds = self._refresh_kwargs_based_on_dialect(kwds, dialect) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge master as some of this updated; and rename this like check_defaults_dialect (or similar)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I renamed the function to _merge_with_dialect_properties
as it actually merges kwargs with those provided by dialect (and resolve some conflicts).
I am still trying to figure out what the code does. By making small PRs on refactoring/cleanup I get to understand it better. Probably later I will have some suggestion on the refactoring. |
@@ -3791,3 +3735,93 @@ def _check_defaults_read( | |||
kwds["engine_specified"] = False | |||
|
|||
return kwds | |||
|
|||
|
|||
def _merge_with_dialect_properties( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverse these (always put kwds things last).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename kwds -> defaults
so similar to the _check_defaults_read
also not averse to renaming _check_defaults_read to something better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done it.
Also, I found the issue with the ValueError message, when dialect is incorrect.
Originally, kwds['dialect']
, but I decided to use dialect.__name__
for better readability, which is not accepted by mypy (throws error "Dialect" has no attribute "__name__" [attr-defined]
).
So, now I use string representation of dialect instance, however I can still pass one extra parameter (dialect_name) to _merge_with_dialect_properties
and use it in the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. cc @gfyoung if any comments, pls merge when ready.
cc @twoertwein if any comments. |
LGTM, outsourcing the dialect validation makes the constructor of |
Thanks @ivanovmg! |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Refactor/cleanup
pandas/io/parsers.py