-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: update the to_excel docstring #20185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: update the to_excel docstring #20185
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the docstring validtion script output? There are many things to fix there (No extended summary, no examples...)
@cmdelatorre I'm aware this is an incomplete PR and I'm sorry about that but we ran out of time on the sprint. I've added an extended summary covering the basics. |
pandas/core/generic.py
Outdated
.. versionadded:: 0.20.0 | ||
.. versionadded:: 0.20.0. | ||
|
||
Returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the function doesn't return anything I think it's better to omit this section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
pandas/core/generic.py
Outdated
``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and | ||
``io.excel.xlsm.writer``. | ||
merge_cells : boolean, default True | ||
Write MultiIndex and Hierarchical Rows as merged cells. | ||
encoding: string, default None | ||
encoding of the resulting excel file. Only necessary for xlwt, | ||
encoding : string, default 'ascii' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you figure this out? Reading the code of DataFrame.to_excel
it looks like this parameter is ignored!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct
pandas/core/generic.py
Outdated
float_format : string, default None | ||
Format string for floating point numbers | ||
Format string for floating point numbers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mention some examples and/or link to where the format of this parameter is documented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/core/generic.py
Outdated
float_format : string, default None | ||
Format string for floating point numbers | ||
Format string for floating point numbers. | ||
columns : sequence, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sequence of what? I think this should be "list of str or list of int" to select columns.
Also, add "If not specified, all columns will be written".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I think this parameter probably makes no sense for Series.to_excel
and the docstring is shared. We should find a way to not show this parameter in the docs in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added list of strings
pandas/core/generic.py
Outdated
|
||
See Also | ||
-------- | ||
pd.read_excel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.read_excel
pandas/core/generic.py
Outdated
engine : string, default None | ||
write engine to use - you can also set this via the options | ||
Write engine to use - you can also set this via the options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which options? I understand the string is the fully qualified import part for the write engine class. Are the ones shown there examples? Can we specify that in the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears the options are 'openpyxl', 'xlsxwriter', 'xlwt'
, although in my testing only 'openpyxl', 'xlsxwriter',
work
Codecov Report
@@ Coverage Diff @@
## master #20185 +/- ##
==========================================
- Coverage 92% 91.82% -0.18%
==========================================
Files 170 153 -17
Lines 50609 49256 -1353
==========================================
- Hits 46561 45229 -1332
+ Misses 4048 4027 -21
Continue to review full report at Codecov.
|
|
||
Parameters | ||
---------- | ||
excel_writer : string or ExcelWriter object | ||
File path or existing ExcelWriter | ||
File path or existing ExcelWriter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we link to the class? I believe
:class:`pandas.ExcelWriter`
should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in the extended description we say that an ExcelWriter
object is necessary to be able to use .to_excel
, but in the parameter description we say that we also accept a "file path". So the ExcelWriter
is not really needed? What happens if we specify a file path? I guess that a new ExcelWriter
gets automatically created with default options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is allowed df1.to_excel("example.xlsx", sheet_name='Sheet3')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the ExcelWriter is not required
pandas/core/generic.py
Outdated
Write %(klass)s to an excel sheet. | ||
|
||
To write a %(klass)s to an excel .xlsx file it is necessary to first create | ||
an ExcelWriter object with a target file name, and specify a sheet in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ExcelWriter is a class I believe it should be between backticks to make it itallic like
`ExcelWriter`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
pandas/core/generic.py
Outdated
... columns=['col 1', 'col 2']) | ||
>>> writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter') | ||
>>> df1.to_excel(writer, sheet_name='Sheet1') | ||
>>> writer.save() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've observed this is only needed if we are creating an ExcelWriter
explicitly. If one specifies a file path, like df.to_excel('my_file.xml')
it is not needed.
Perhaps it makes sense to show first an example with a file path, which is simpler, and then an example with an ExcelWriter
where we can specify more custom options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
pandas/core/generic.py
Outdated
>>> writer2.save() | ||
|
||
Limit floats to a fixed precision using float_format. For example | ||
float_format="%.2f" will format 0.1234 to 0.12. | ||
|
||
For compatibility with to_csv, to_excel serializes lists and dicts to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a link to the method pandas.DataFrame.to_csv
here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
pandas/core/generic.py
Outdated
To write a %(klass)s to an excel .xlsx file it is necessary to first create | ||
an ExcelWriter object with a target file name, and specify a sheet in the | ||
file to write to. Multiple sheets may be written to by | ||
specifying unique sheet_name. With all data written to the file it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sheet name between backquotes since it's a variable name, see http://numpydoc.readthedocs.io/en/latest/format.html#sections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
I've completely wiped by env to try and fix this error but still it persists. Means I am not able to run the validation script:
Hopefully I can fix this |
Hello @robmarkcole! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 24, 2018 at 16:02 Hours UTC |
pandas/core/generic.py
Outdated
Write %(klass)s to an excel sheet. | ||
|
||
To write a %(klass)s to an excel .xlsx file it is necessary to first create | ||
an `ExcelWriter` object with a target file name, and specify a sheet in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, this is not required.
But I think it is needed if you want to write multiple sheets, so that would be good for in the examples.
See http://pandas-docs.github.io/pandas-docs-travis/io.html#writing-excel-files-to-disk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
pandas/core/generic.py
Outdated
See Also | ||
-------- | ||
pandas.read_excel | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add pandas.ExcelWriter ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
pandas/core/generic.py
Outdated
Limit floats to a fixed precision using float_format. For example | ||
float_format="%.2f" will format 0.1234 to 0.12. | ||
|
||
For compatibility with `to_csv <https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html>`__ , |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can make this :meth:`~DataFrame.to_csv`
to auto-generate the link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
pandas/core/generic.py
Outdated
data without rewriting the whole workbook. | ||
|
||
Limit floats to a fixed precision using float_format. For example | ||
float_format="%.2f" will format 0.1234 to 0.12. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think this one is generating the problem you have, because due to the %
the templating / shared_docs system thinks it has to fill in something here, which it can't ...
You can fix it by doing "%%.2f"
(and in the actual docstring it will look like a single %)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
@robmarkcole were you able to sort out your issues? That error means that you're failing to pass a keyword to some other place that's using this shared docstring. Do you have time to finish this off? |
@TomAugspurger I want able to solve and an not sure how I will. I am on holiday this week without computer so will have to take a look next week. Sorry for the delay |
See my comment here #20185 (comment) for a solution.
No problem! |
Address comments of @jorisvandenbossche
@TomAugspurger @jorisvandenbossche sorry this fell off my radar after my holiday, have addressed your comments now. I'm still unable to run the validation script but hopefully all changes are OK |
%(versionadded_to_excel)s | ||
Write %(klass)s to an excel sheet. | ||
|
||
To write a single %(klass)s to an excel .xlsx file it is only necessary to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be simplified if someone addresses #21835 with a See Also link to ExcelWriter instead of all of this detail (one or two lines may still suffice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a little verbose, but I wouldn't cut this down that drastically IMO.
…cel_docstring_fixes
Thanks @robmarkcole! |
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>