-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Conform Series.to_csv to DataFrame.to_csv #19745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #19745 +/- ##
==========================================
+ Coverage 91.95% 91.95% +<.01%
==========================================
Files 166 166
Lines 50274 50290 +16
==========================================
+ Hits 46231 46246 +15
- Misses 4043 4044 +1
Continue to review full report at Codecov.
|
I realised that there's a decorator for deprecating keyword arguments and did my best to use that instead. I also tried to add versionadded and deprecated directives to the docstring by mimicking the way they are used in |
pandas/core/series.py
Outdated
float_format=None, header=False, index_label=None, | ||
mode='w', encoding=None, compression=None, date_format=None, | ||
decimal='.'): | ||
decimal='.', **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't pass kwargs. rather list all of the args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not passing kwargs
? Otherwise each time we add something to to_csv
, we will forget to add it here. We can just clearly say in the docstring here: "if you want to see all params, check DataFrame.to_csv"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a good pattern
much better to have a consistent signature
defined
very unlikely to add new keywords anyhow
shared docs are the way to do this
and follow our pattern
kwargs are not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @dahlbaek said, and it might be a pity, but the kwarg order is not equal, so we can't just share. I agree with you in general, but I don't think it is worth to put much effort in this case to make it work (deprecate positional keywords etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let’s just fix the args and conform to DataFrame.to_csv
otherwise this will go on forever
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just changing the order is a backwards incompatible change, and needs to be deprecated first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course but it’s better to have a 1 time break
than let this go on forever
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Is there a standard way to deprecate ordering of positional arguments?
- Is there a standard way to deprecate default value of an argument?
- Would it be ok to go for a one argument per line approach, as is done in pandas.read_excel? I find this style much more readable, and changes are easier to track too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a similar example currently in the codebase. But I think doing a *args, **kwargs
as you proposed in the top post, is the easiest way to go. You can then introspect the args
, assign them manually to the correct keywords, and raise a DeprecationWarning if needed.
pandas/core/series.py
Outdated
@@ -2878,16 +2878,22 @@ def from_csv(cls, path, sep=',', parse_dates=True, header=None, | |||
|
|||
return result | |||
|
|||
def to_csv(self, path=None, index=True, sep=",", na_rep='', | |||
@deprecate_kwarg(old_arg_name='path', new_arg_name='path_or_buf') | |||
def to_csv(self, path_or_buf=None, index=True, sep=",", na_rep='', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally we should have a shared doc-string defined in generic.py (parmeterized by klass) to avoid repeating all of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a best-practices example for this that I could use for guidance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are lots of examples virtually every function in generic, e.g. .reindex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be easier to wait with moving to generic.py until deprecation of positional args is done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deprecating positional parameters is pretty silly here and will be way too messy. just conform the signature, write a nice message in the whatsnew and let's call it a day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is perfectly possible to do this (deprecating positional arguments), so why not do that? IMO there is no reason in possibly breaking user's code if we can easily prevent it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go for it then. this is going to be a big mess.
xref #18958 |
ideally we would consolidate the tests with DataFrame.to_csv to ensure coverage of all of the kwargs. could be in this PR or followon |
@dahlbaek can you or @pandas-dev/pandas-core have a look here |
@jreback @jorisvandenbossche : Maybe a silly question, but couldn't we just implement |
yes |
Alrighty then. I'll create a new PR to supersede this one. |
... but wouldn't the deprecation message look like " (and in that case, wouldn't we better immediately apply the solution above, with deprecation warnings for the three problematic keywords?) |
I've been meaning to give this a second go for a while, but haven't had time. Clearly, I'm not as experienced as others in this thread regarding the pandas codebase, but maybe someone will find some of my thoughts interesting anyway:
If a decision is made as to how to proceed, I would be happy to lend a hand. Unfortunately, I do not currently have time to go through all the steps for a "clean" solution on my own. EDIT: Reading through my "clean" solution, I have to admit it does not look too clean. Here is an alternative: Move everything to |
@dahlbaek : Your point regarding |
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
Warns about aligning Series.to_csv's signature with that of DataFrame.to_csv's. In anticipation, we have moved DataFrame.to_csv to generic.py so that we can later delete the Series.to_csv implementation, and allow it to automatically adopt DataFrame's to_csv due to inheritance. Closes pandas-devgh-19745.
e1fb1a8
to
59ccebd
Compare
Hello @dahlbaek! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 13, 2018 at 22:24 Hours UTC |
59ccebd
to
9bac25c
Compare
So, I figured I would do a proof of concept with a decorator-based approach. If anyone finds this appropriate, I will fill in tests based on the work by @gfyoung in #21868. Pros:
Cons:
(EDIT: Changed pros and cons according to comments by @toobaz) |
As correctly pointed out by @gfyoung , this is already true for |
This is a consequence of your specific implementation, not of the fact that it's contained in a decorator. |
No, that comment has to do with what should happen at the end of the deprecation period. The way I implemented things here you would not be able to simply remove the decorator, because I want to keep the signature of
That is correct. |
9bac25c
to
26f7eea
Compare
- Deprecate signature of Series.to_csv. - Move DataFrame.to_csv to generic.py
26f7eea
to
22a3dca
Compare
Among the "Cons" we should add "users who use positional arguments [and who are OK with the warning] cannot switch to the new ordering until the end of the deprecation period - and then they must immediately when upgrading", right? I consider this more relevant than the warning. (Compare with #21896 ) |
Agreed and added. This solution is not friendly towards users that want to pass positional arguments (other than One possibility would be to combine the decorator approach with the approach of #21896. That way, one could have both the intact signature and a solution which works for users that like to use positional arguments. |
We're pushing to merge #21896 over this one. |
(EDIT: Most of this comment is related to an old attempt. See this comment and below for the content of the current PR)
I tried my best to keep everything backwards compatible. A simpler solution
would have been simply
Unfortunately, such a solution breaks backwards compatibility for the following
reasons:
path_or_buf
ofDataFrame.to_csv
corresponds to the differently named argument
path
ofSeries.to_csv
.DataFrame.to_csv
does not agreefully with the ordering of positional arguments for
Series.to_csv
header
is True forDataFrame.to_csv
but False forSeries.to_csv
Maybe this is something that could be implemented in the long run?
git diff upstream/master -u -- "*.py" | flake8 --diff