-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Support more styles for xlsxwriter #16149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a3b2a87
to
3f687c4
Compare
There is a consistent test failure, but not one I've managed to replicate locally. |
Codecov Report
@@ Coverage Diff @@
## master #16149 +/- ##
==========================================
- Coverage 90.83% 90.83% -0.01%
==========================================
Files 159 159
Lines 50796 50809 +13
==========================================
+ Hits 46143 46153 +10
- Misses 4653 4656 +3
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #16149 +/- ##
==========================================
- Coverage 91.24% 91.21% -0.03%
==========================================
Files 163 163
Lines 50091 50106 +15
==========================================
+ Hits 45704 45706 +2
- Misses 4387 4400 +13
Continue to review full report at Codecov.
|
Finally passing tests after identifying a recently fixed openpyxl bug that got in the way |
fd4050e
to
96ab259
Compare
pandas/io/excel.py
Outdated
|
||
if num_format_str is not None: | ||
xl_format.set_num_format(num_format_str) | ||
props['num_format'] = num_format_str | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move more of this logic out of here an into the formats dir somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean explicitly moving the number format logic out? Yes, perhaps that's a worthwhile refactoring. I think we should also be calculating the number format from the display.precision
config. For which reason, I believe all of those changes belong in a different PR.
Or are you talking about moving this mapping logic out? Well currently we assume nested style dicts as an interchange format, which are well-suited to openpyxl but need conversion for all writers. The stuff in formats/ should remain relatively writer-agnostic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally all of this logic would just be a single call here and the logic elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I've grokked your vision, given that this is writer specific. Do you mean that there should be more refactoring across writers? Except for this number formatting, it's already quite factored, as they each have different syntaxes for creating and formatting cells.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes i think think excel should be refactored into a subdir of writer code and style things should live there
maybe make an issue about this
it's a bit of work to split it then adding things like style should be easy
I'm happy to make an issue aiming to refactor excel writing code. But how do you feel about this PR? |
can you rebase. |
I've moved the what's new to 0.20. |
pandas/io/excel.py
Outdated
style_dict = style_dict.copy() | ||
style_dict['border'] = style_dict.pop('borders') | ||
|
||
for src, dst in self.STYLE_MAPPING: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this only is triggered if there is styling (IOW this won't cause a perf issue for 'regular' excel)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few lines above we return if style_dict is None
; a few lines above that we return if num_format_str is None and style_dict is None
. I think that is sufficient.
Btw, I think even the default to_excel
has some styling of headers, so this function will always be called, but will be returned early where possible.
There are ways to make this faster, though:
- store
STYLE_MAPPING
as a trie and descend recursively only where a prefix is matched. - flatten
style_dict
and storeSTYLE_MAPPING
as a dict so that their keys match. But to be deterministic in case of multiple competing styles,STYLE_MAPPING
would need to store the matched index, and the results would need to be sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pushing a faster variant.
pandas/io/excel.py
Outdated
@@ -1609,6 +1609,68 @@ def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0, | |||
startcol + cell.col, | |||
val, style) | |||
|
|||
# Map from openpyxl-oriented styles to flatter xlsxwriter representation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code would be simpler to make this style formatting into a separate class (rather than have it live in functions sitting in the main excel code). can you refactor to make this cleaner.
this looks reasonable, can you rebase |
Yes, sorry I've not managed to do the refactoring you'd like to see. I've been unsure what you would like, and have had my attentions elsewhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AppVeyor failure looks like someone else's problem.
pandas/io/excel.py
Outdated
style_dict = style_dict.copy() | ||
style_dict['border'] = style_dict.pop('borders') | ||
|
||
for src, dst in self.STYLE_MAPPING: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few lines above we return if style_dict is None
; a few lines above that we return if num_format_str is None and style_dict is None
. I think that is sufficient.
Btw, I think even the default to_excel
has some styling of headers, so this function will always be called, but will be returned early where possible.
There are ways to make this faster, though:
- store
STYLE_MAPPING
as a trie and descend recursively only where a prefix is matched. - flatten
style_dict
and storeSTYLE_MAPPING
as a dict so that their keys match. But to be deterministic in case of multiple competing styles,STYLE_MAPPING
would need to store the matched index, and the results would need to be sorted.
pandas/io/excel.py
Outdated
style_dict = style_dict.copy() | ||
style_dict['border'] = style_dict.pop('borders') | ||
|
||
for src, dst in self.STYLE_MAPPING: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pushing a faster variant.
Merge into 0.21 and avoid future what's new conflicts? |
you need to rebase and some comments to respond |
Well, your comments suggested a refactor, and I've previously commented that I do not understand your vision of a refactor in this space, and certainly have not found capacity within my other FOSS commitments to to do a general code quality improvement here. Since those comments, you announced "this looks reasonable". I'll have another very brief look at refactoring. |
I made the requested change as I interpret it. I do not find it improves code quality. |
I haven't understood the cause of the travis failure, apparently in lint.sh but without error message that I can see. I've merged in master and moved what's new to the next version. |
you can check this locally via:
|
Okay. I tried to run ci/lint.sh but got not flags. Thanks.
…On 30 October 2017 at 10:25, Jeff Reback ***@***.***> wrote:
Linting *.py
pandas/io/excel.py:1794:1: E305 expected 2 blank lines after class or
function definition, found 1
you can check this locally via:
make lint-diff (or just directly run flake8 on that file)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#16149 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz6yq4SKK71rbuRtXNVmBX6vxySOpjks5sxQllgaJpZM4NJIfp>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clipboard failure on travis looks unrelated.
All good @jreback?
@@ -22,7 +22,7 @@ New features | |||
Other Enhancements | |||
^^^^^^^^^^^^^^^^^^ | |||
|
|||
- | |||
- Better support for ``Dataframe.style.to_excel()`` output with the ``xlsxwriter`` engine. (:issue:`16149`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok for now. it might make sense to enhance the excel docs with what is better now? (or maybe a listing of styles that work). but for a followup.
thanks @jnothman |
I was surprised to find that despite the interchangeable representation of Excel styles, xlsxwriter did not have good style support.
I've not added direct tests for this functionality, but test some of it through
test_styler_to_excel
.closes #xxxxgit diff upstream/master --name-only -- '*.py' | flake8 --diff