Not tested: Period.strftime and PeriodIndex.strftime with non-ascii char present in the formatting string
#46468
Comments
This issue was extracted from #46319 for clarity.
smarie pushed a commit to smarie/pandas that referenced this issue Mar 22, 2022
I just realized that this bug was actually introduced in my PR #46405.
Title changed from "Period.strftime and PeriodIndex.strftime output incorrect results when non-ascii char is present in the formatting string" to "Period.strftime and PeriodIndex.strftime with non-ascii char present in the formatting string"
mroeschke pushed a commit that referenced this issue Sep 8, 2022
…specific directive is used (#46405)
* Added test representative of #46319. Should fail on CI
* Added a gha worker with non utf 8 zh_CN encoding
* Attempt to fix the encoding so that locale works
* Added the fix, but not using it for now, until CI is able to reproduce the issue.
* Crazy idea: maybe simply removing the .utf8 modifier will use the right encoding !
* Hopefully fixing the locale not available error
* Now simply generating the locale, not updating the ubuntu one
* Trying to install the locale without enabling it
* Stupid mistake
* Testing the optional locale generator condition
* Put back all runners
* Added whatsnew
* Now using the fix
* As per code review: moved locale-switching fixture `overridden_locale` to conftest
* Flake8
* Added comments on the runner
* Added a non-utf8 locale in the `it_IT` runner. Added the zh_CN.utf8 locale in the tests
* Improved readability of fixture `overridden_locale` as per code review
* Added two comments on default encoding
* Fixed #46319 by adding a new `char_to_string_locale` function in the `tslibs.util` module, able to decode char* using the current locale.
* As per code review: modified the test to contain non-utf8 chars. Fixed the resulting issue.
* Split the test in two for clarity
* Fixed test and flake8 error.
* Updated whatsnew to ref #46468 . Updated test name
* Removing wrong whatsnew bullet
* Nitpick on whatsnew as per code review
* Fixed build error rst directive
* Names incorrectly reverted in last merge commit
* Fixed test_localization so that #46595 can be demonstrated on windows targets (even if today these do not run on windows targets, see #46597)
* Fixed `tm.set_locale` context manager, it could error and leak when category LC_ALL was used. Fixed #46595
* Removed the fixture as per code review, and added corresponding parametrization in tests.
* Dummy mod to trigger CI again
* reverted dummy mod
* Attempt to fix the remaining error on the numpy worker
* Fixed issue in `_from_ordinal`
* Added asserts to try to understand
* Reverted debugging asserts and applied fix for numpy repeat from #47670.
* Fixed the last issue on numpy dev: a TypeError message had changed
* Code review: Removed `EXTRA_LOC`
* Code review: removed commented line
* Code review: reverted out of scope change
* Code review: reverted out of scope change
* Fixed unused import
* Fixed revert mistake
* Moved whatsnew to 1.6.0
* Update pandas/tests/io/parser/test_quoting.py
Co-authored-by: Sylvain MARIE <[email protected]>
noatamir pushed a commit to noatamir/pandas that referenced this issue Nov 9, 2022
EDIT: as I found out, this was not a bug in the main branch; I introduced it accidentally in #46405.
I have therefore renamed the ticket to "Not tested", meaning that the current test suite does not cover this case.
Pandas version checks
Reproducible Example
Issue Description
When there is a non-ascii character in the formatting string passed to Period.strftime, the result may be corrupted. This is most probably related to an encoding error.
Expected Behavior
The assert should pass without error.
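For readers who want to see the kind of encoding error suspected above, here is a minimal sketch that uses only the Python standard library. It is not the pandas code path and not the original reproducible example: the sample string, the `gb2312` encoding (standing in for a non-UTF-8 zh_CN locale) and the helper `decode_with` are all assumptions made for illustration. It shows that bytes produced under one encoding come back corrupted when decoded as UTF-8, and intact when decoded with the encoding that produced them.

```python
# Hypothetical illustration only: this does not call pandas or the C strftime.
# It mimics bytes coming back from a C-level call made under a non-UTF-8 locale.

formatted = "2022年03月"           # text the caller expects to get back
raw = formatted.encode("gb2312")  # assumed locale encoding (non-UTF-8 zh_CN)


def decode_with(raw_bytes: bytes, encoding: str) -> str:
    """Decode raw bytes, replacing anything that cannot be decoded."""
    return raw_bytes.decode(encoding, errors="replace")


corrupted = decode_with(raw, "utf-8")   # assumes UTF-8 regardless of the locale
correct = decode_with(raw, "gb2312")    # uses the encoding that produced the bytes

print(corrupted == formatted)  # False: the non-ascii characters were lost
print(correct == formatted)    # True
```

The commit message above mentions a `char_to_string_locale` function that decodes char* using the current locale; the second call in this sketch is only an analogy for that idea, not a description of how that function is written.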