Commit 707c3ef

BUG: Need 'windows-1252' encoding for locale names.
#24760 #23638 Some special characters in the list produced by 'locale -a' are encoded with 'windows-1252'. The two known locales with this problem are Norwegian 'bokmål' and 'français'.
1 parent 2b28454 commit 707c3ef

1 file changed: +11 -2 lines changed

pandas/_config/localization.py

@@ -142,10 +142,19 @@ def get_locales(prefix=None, normalize=True, locale_getter=_default_locale_gette
         # raw_locales is "\n" separated list of locales
         # it may contain non-decodable parts, so split
         # extract what we can and then rejoin.
-        raw_locales = raw_locales.split(b"\n")
+        raw_locales = raw_locales.split(b'\n')
         out_locales = []
         for x in raw_locales:
-            out_locales.append(str(x, encoding=options.display.encoding))
+            try:
+                out_locales.append(str(
+                    x, encoding=options.display.encoding))
+            except UnicodeError:
+                # 'locale -a' is used to populated 'raw_locales' and on
+                # Redhat 7 Linux (and maybe others) prints locale names
+                # using windows-1252 encoding. Bug only triggered by
+                # a few special characters and when there is an
+                # extensive list of installed locales.
+                out_locales.append(str(x, encoding='windows-1252'))
 
     except TypeError:
         pass
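
The per-line fallback in this change can be tried outside pandas with a minimal standalone sketch. The function name `decode_locale_names`, its parameters, and the sample bytes are hypothetical and not part of the commit. The sketch defaults the primary encoding to UTF-8, whereas the patched code uses `options.display.encoding`.

```python
from typing import List


def decode_locale_names(
    raw_locales: bytes,
    primary: str = "utf-8",           # assumption: the commit uses options.display.encoding
    fallback: str = "windows-1252",   # fallback encoding named in the commit
) -> List[str]:
    """Decode newline-separated locale names, retrying each failing line."""
    names = []
    for line in raw_locales.split(b"\n"):
        try:
            names.append(line.decode(primary))
        except UnicodeError:
            # Only the line that failed is decoded with the fallback;
            # lines that decode cleanly keep the primary encoding.
            names.append(line.decode(fallback))
    return names


# Hypothetical sample: 0xE5 and 0xE7 are 'å' and 'ç' in windows-1252
# and are not valid UTF-8 in these positions.
sample = b"en_US.utf8\nbokm\xe5l\nfran\xe7ais"
print(decode_locale_names(sample))
```

Decoding line by line means one badly encoded name does not affect the rest of the list. Note that Python's windows-1252 codec leaves a few byte values undefined (for example 0x81), so the fallback can itself raise `UnicodeDecodeError` on input containing those bytes.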
