Commit 1a77b43

BUG: Need 'windows-1252' encoding for locale names.
#24760 #23638 Some locale names in the list produced by 'locale -a' contain special characters encoded with 'windows-1252'. The two known locales with this problem are Norwegian 'bokmål' and 'français'.
1 parent bca39a7 commit 1a77b43

1 file changed: +11 -2 lines changed

pandas/_config/localization.py

@@ -146,10 +146,19 @@ def get_locales(prefix=None, normalize=True, locale_getter=_default_locale_gette
         # raw_locales is "\n" separated list of locales
         # it may contain non-decodable parts, so split
         # extract what we can and then rejoin.
-        raw_locales = raw_locales.split(b"\n")
+        raw_locales = raw_locales.split(b'\n')
         out_locales = []
         for x in raw_locales:
-            out_locales.append(str(x, encoding=options.display.encoding))
+            try:
+                out_locales.append(str(
+                    x, encoding=options.display.encoding))
+            except UnicodeError:
+                # 'locale -a' is used to populated 'raw_locales' and on
+                # Redhat 7 Linux (and maybe others) prints locale names
+                # using windows-1252 encoding. Bug only triggered by
+                # a few special characters and when there is an
+                # extensive list of installed locales.
+                out_locales.append(str(x, encoding='windows-1252'))
 
     except TypeError:
         pass
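
The fallback in this change can be tried in isolation with a small standalone sketch. It is not the pandas code: `decode_locale_names` is a hypothetical helper, and the primary codec is assumed to be UTF-8 here, whereas the commit reads it from `options.display.encoding`. The sample bytes are built by hand to mimic the two locale names mentioned in the commit message.

```python
from typing import List


def decode_locale_names(
    raw: bytes,
    primary: str = "utf-8",           # assumption: stands in for options.display.encoding
    fallback: str = "windows-1252",   # the fallback codec named in the commit
) -> List[str]:
    """Decode a newline-separated byte string of locale names.

    Each name is decoded with the primary codec first; if that raises
    UnicodeError, the same bytes are decoded again with the fallback codec.
    """
    names = []
    for chunk in raw.split(b"\n"):
        try:
            names.append(chunk.decode(primary))
        except UnicodeError:
            names.append(chunk.decode(fallback))
    return names


# Hand-built sample: one plain ASCII name plus two names whose special
# characters are encoded as windows-1252, which is not valid UTF-8.
sample = b"en_US.utf8\n" + "bokmål\nfrançais".encode("windows-1252")
print(decode_locale_names(sample))
```

Decoding line by line means that one badly encoded name only affects its own entry. The fallback decode can itself raise for the few byte values that windows-1252 leaves undefined, which this sketch does not handle.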
