
BUG: Need 'windows-1252' encoding for locale names. #27368


Merged
4 commits merged into pandas-dev:master on Oct 12, 2019


timcera
Contributor

@timcera timcera commented Jul 12, 2019

#24760
#23638
Some special characters in the lists created by 'locale -a' are encoded
with 'windows-1252'. The two known locales with this problem are
Norwegian 'bokmål' and 'français'.

@WillAyd
Member

WillAyd commented Jul 12, 2019

Is there any chance that using subprocess.run instead of subprocess.check_output helps here at all? We only support >=Py35 so could switch from the older API if it helps simplify

@WillAyd WillAyd added the Linux (Linux OS) and Unicode (Unicode strings) labels Jul 12, 2019
@timcera
Contributor Author

timcera commented Jul 13, 2019

> Is there any chance that using subprocess.run instead of subprocess.check_output helps here at all? We only support >=Py35 so could switch from the older API if it helps simplify

Neither subprocess.run nor subprocess.check_output should change or set the output encoding. The encoding should only depend on what locale -a prints. Why 'windows-1252' instead of 'utf-8'? I have no idea.
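
To illustrate the point above, here is a minimal sketch showing that both subprocess APIs hand back the same raw bytes, and that decoding is a separate step. It is not the code from this PR. The child process is a stand-in for `locale -a` (an assumption, so the example runs on machines without the affected locales), and `decode_locale_name` is a hypothetical helper that tries UTF-8 first and falls back to windows-1252.

```python
import subprocess
import sys

# Stand-in for `locale -a`: a child process that writes one locale name
# encoded as windows-1252 (0xE7 is "ç" in that encoding).
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'fran\\xe7ais\\n')",
]

# Both APIs return the child's output as undecoded bytes.
raw_from_check_output = subprocess.check_output(CHILD)
raw_from_run = subprocess.run(CHILD, stdout=subprocess.PIPE, check=True).stdout


def decode_locale_name(raw_name):
    """Hypothetical helper: decode as UTF-8, else fall back to windows-1252."""
    try:
        return raw_name.decode("utf-8")
    except UnicodeDecodeError:
        # A lone 0xE7 or 0xE5 byte is not valid UTF-8, so names such as
        # b"fran\xe7ais" or b"bokm\xe5l" end up here.
        return raw_name.decode("windows-1252")


names = [decode_locale_name(name) for name in raw_from_run.split(b"\n") if name]
```

Decoding line by line means one badly encoded name does not prevent the remaining names from being read.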

Member

@WillAyd WillAyd left a comment


Can you mock out a test for this?
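
One way such a test could be mocked out is sketched below. This is only an illustration, not the test (if any) that went into pandas: `list_locale_names` is a hypothetical helper standing in for the code under test, and the fake byte string is an assumed example of what `locale -a` might print on an affected system.

```python
import subprocess
from unittest import mock


def list_locale_names():
    """Hypothetical code under test: run `locale -a` and decode each name."""
    raw = subprocess.check_output(["locale", "-a"])
    names = []
    for line in raw.split(b"\n"):
        if not line:
            continue
        try:
            names.append(line.decode("utf-8"))
        except UnicodeDecodeError:
            # Fall back for names printed in windows-1252.
            names.append(line.decode("windows-1252"))
    return names


def test_windows_1252_locale_names():
    # Assumed sample output containing the two problematic names.
    fake_output = b"C\nbokm\xe5l\nfran\xe7ais\n"
    with mock.patch("subprocess.check_output", return_value=fake_output) as fake:
        names = list_locale_names()
    fake.assert_called_once_with(["locale", "-a"])
    assert names == ["C", "bokmål", "français"]
```

Patching `subprocess.check_output` keeps the test independent of which locales are installed on the CI machine.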

@WillAyd
Member

WillAyd commented Jul 15, 2019

Also be sure to run black pandas on your local branch before pushing (that is the current CI failure)

@WillAyd
Member

WillAyd commented Aug 26, 2019

@timcera can you add a test and merge master?

@jreback
Contributor

jreback commented Sep 8, 2019

this looks ok actually, @timcera can you add a note to 1.0; if you can do a test great, but not a big deal otherwise.

@jbrockmendel
Member

@timcera can you rebase? I think the CI failure may be unrelated

@jreback
Contributor

jreback commented Oct 6, 2019

this is prob ok, can you merge master and add a note to 1.0.0, ping on green.

@jreback jreback added this to the 1.0 milestone Oct 6, 2019
@timcera
Contributor Author

timcera commented Oct 11, 2019 via email

@WillAyd
Member

WillAyd commented Oct 11, 2019

Can you add a note to doc/source/whatsnew/v1.0.0.rst describing the change and then notify when CI is green after pushing up the change?

@timcera
Contributor Author

timcera commented Oct 12, 2019

All checks passed.

@jreback jreback merged commit 25059ee into pandas-dev:master Oct 12, 2019
@jreback
Contributor

jreback commented Oct 12, 2019

thanks @timcera

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020
Labels
Linux (Linux OS), Unicode (Unicode strings)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pandas Test Scripts always fail tried 0.24.0rc1, 0.23.4, 0.23.3 and 0.23.2
pandas 0.23.4 fails unit tests
4 participants