Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
BUG: Fixed Unicode decoding error in
Period.strftime
when a locale-specific directive is used #46405New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fixed Unicode decoding error in
Period.strftime
when a locale-specific directive is used #46405Changes from 17 commits
d177852
c09b76d
9dff153
2fa0736
28a0921
6b85fb2
409d196
4d09db9
3c1b80c
4c687a8
e01e051
a535249
b4ad6dd
0eafd80
b721238
cebed78
a25f3a9
d175c36
184e480
e5f3062
22b0ae4
bbaa0c1
f6857d6
60c9e69
db0e3cc
06a9567
411ccb7
e5ab44e
fe35aac
0790e72
d6914f4
22c11e7
236a2a0
befa0a0
0105ac9
a1f3773
0fcea6c
7e18375
a1c6d83
78ac13d
3fc0e25
9ad0ee1
406d304
bc1b222
fcd2ce2
b69b074
65c3a1d
e5bca9f
fbfefec
b64548b
4d026cf
cf3be80
8f65bd9
7b414a9
cb43fd5
59b4fe3
4950e06
14f854b
90c31cb
5d467e2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean the
Locale: it_IT.utf8
isnt working either?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends what you mean by "working". Do you refer to the bug #46319 or to something more general ?
UnicodeError
when usingPeriod.strftime
with non-utf8 locale #46319 happens only when the current locale uses a non-unicode encoding (no utf-8). So it does not happen in any of the workers before this PRzh_CN
locale, that has a non-utf-8 encoding.locale.setlocale
was raising an error (locale not available)sudo locale-gen <locale>
Does that answer your question ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was assumed setting the environment variables
would ensure the locale for the test run would not be the default (en_US.UTF-8) for the other locale builds.
So IIUC, this is necessary if setting a non-utf8 locale?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this is what I thought too.
Unfortunately when I tried setting
LANG
andLC_ALL
tozh_CN
(non-utf8), then there was a strange segmentation fault appearing (see #46405 (comment))So this is why I now do not set it through these env variables but I only install it.
Maybe the problem is that when the env variables are set, many side effects happen in the build chain, that I do not quite understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're setting the locale this way, can we remove setting the
LANG
andLC_ALL
environment variables here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I have no idea :) At least for the tests in this PR, yes these env variables are not used.
However these env vars are probably related to other locale-related tests, including being able to compile pandas on alternate locale, etc. It seems that in the CI those env variables are used to even add a specific line at the beginning of the init file of pandas ! (see
ci/setup_env.sh
)This seems a bit beyond the scope of this PR. However it is maybe worth keeping track of this, in a dedicated ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to add
zh_CH.utf8
andzh_CH.gb2312
explicitlyThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll add
zh_CH.utf8
.Strangely enough,
zh_CN.gb2312
did not seem to be working, butzh_CN
does involve gb2312 .There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will also add
it_IT
by the way, so that both locales have their utf8 AND non-utf8 encoding activated in the testThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done