Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
ENH: Add check for character limmit #57103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add check for character limmit #57103
Changes from 6 commits
20e357a
74e302f
74f4f81
d864912
babdf29
4768089
65284bb
41cad2b
5470464
2e79c61
91a4bee
62b8b54
bd6a975
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can potentially come from other data types, "String" may end up being confusing if that's the case. I think a more generic "Cell contents too long" is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a quick question regarding this: should we also explore validating other data types, like float or int? The current code seems to focus solely on validating the length for specific data types. Do we need to extend these checks to other types mentioned above? Alternatively, have we conducted similar checks elsewhere, or is there a best practice that should be adhered to in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Excel's limitation is specifically for strings.
When you say elsewhere, are you thinking of outside the excel code? I do not know of any formal best practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I grasp your perspective. I want to ensure we deliberate on whether we should also account for the length of other data types. For now, let's concentrate on checking the length specifically for strings in this pull request; that should suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What extra warning is raised here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From commit babdf29,
the check https://github.com/pandas-dev/pandas/actions/runs/7678882560/job/20929191559 rise
pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_to_excel_raising_warning_when_cell_character_exceed_limit[openpyxl-.xlsx] - AssertionError: Caused unexpected warning(s): [('EncodingWarning', EncodingWarning("'encoding' argument not specified"), '/home/runner/micromamba/envs/test/lib/python3.10/site-packages/openpyxl/writer/excel.py', 199)]
and check https://github.com/pandas-dev/pandas/actions/runs/7678882560/job/20929192274 rasie
pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_to_excel_raising_warning_when_cell_character_exceed_limit[openpyxl-.xlsm] - AssertionError: Caused unexpected warning(s): [('DeprecationWarning', DeprecationWarning('datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).'), '/home/runner/micromamba/envs/test/lib/python3.12/site-packages/openpyxl/packaging/core.py', 99), ('DeprecationWarning', DeprecationWarning('datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).'), '/home/runner/micromamba/envs/test/lib/python3.12/site-packages/openpyxl/writer/excel.py', 292)]
pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_to_excel_raising_warning_when_cell_character_exceed_limit
is what we added in this pr. I just wrote the test following the other's in same file. If there is a better way to avoid these extra warnings, I am very happy to change codes!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Yeah since these are external warnings this is fine for now