Saturday, October 07, 2023

(Un)safe English locales

I recently spent a frustrating amount of time troubleshooting a Java application which had stopped working properly after an upgrade. Certain features of the product were broken because the application was running on a Linux server which had been configured with the "en_DK" locale.

"en_DK" had been chosen in the expectation of getting a system with messages in English, but with currency symbols etc. suitable for Denmark. This makes sense, because it's not uncommon for IT systems to provide more precise (error) messages in English than in a small language like Danish.

Unfortunately, there is no universal agreement about locales. Even within Linux distributions, there is not 100% consistency.

The GNU C Library (glibc) seems to have the longest list of recognized locales, including 19 English ones. Java's list of supported locales is somewhat shorter and includes only 11 English locales. Interestingly, Java has an English locale for Malta (en_MT) which glibc does not have.
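To see what a particular JVM reports, the English locales can be listed directly. The following is only a minimal sketch: the class name EnglishLocales is made up for illustration, and the result depends on the Java version and on which locale data provider is active, so the count may well differ from the figure quoted above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
final class EnglishLocales {
    // Returns the English locales that have a country or region part,
    // formatted glibc-style (e.g. "en_US") and sorted alphabetically.
    static List<String> list() {
        return Arrays.stream(Locale.getAvailableLocales())
                .filter(l -> "en".equals(l.getLanguage()))
                .filter(l -> !l.getCountry().isEmpty())
                .map(l -> l.getLanguage() + "_" + l.getCountry())
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        list().forEach(System.out::println);
    }
}
```

On Linux, the comparable glibc list can be obtained with `locale -a`, which prints the locales installed on that machine.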

I haven't been able to find macOS's list of locales, but forum posts suggest it has only 6 English locales: en_AU, en_CA, en_GB, en_IE, en_NZ, en_US.

Ignoring Mac for a moment, these are unsafe English locales, i.e. locales that glibc and Java do not both support:

Locale  Country
en_AG   Antigua and Barbuda
en_BW   Botswana
en_DK   Denmark
en_HK   Hong Kong
en_IL   Israel
en_MT   Malta
en_NG   Nigeria
en_SC   Seychelles
en_ZM   Zambia
en_ZW   Zimbabwe
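Whether a given JVM knows a locale name taken from the environment can be probed at runtime. This is a rough sketch under some assumptions: the class and method names are invented for the example, and the conversion from a POSIX-style name to a language tag simply drops any encoding or modifier suffix, which is enough for names like "en_DK.UTF-8" but is not a complete mapping.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
final class LocaleCheck {
    // Converts a POSIX-style name such as "en_DK.UTF-8" to a language tag such as "en-DK".
    static String toLanguageTag(String posixName) {
        // Drop ".UTF-8" and "@euro" style suffixes, keep "language_COUNTRY".
        String base = posixName.split("[.@]", 2)[0];
        return base.replace('_', '-');
    }

    // True if the running JVM lists the locale among its available locales.
    static boolean isKnownToJvm(String posixName) {
        Locale wanted = Locale.forLanguageTag(toLanguageTag(posixName));
        return Arrays.asList(Locale.getAvailableLocales()).contains(wanted);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"en_US.UTF-8", "en_IE", "en_DK.UTF-8"}) {
            System.out.println(name + " -> " + isKnownToJvm(name));
        }
    }
}
```

The answer for a name like "en_DK.UTF-8" can differ between Java versions, which is exactly the kind of inconsistency that makes such a locale risky.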

On the other hand, the following locales should be safe:

Locale  Country        Safe even on Mac
en_AU   Australia      🍎
en_CA   Canada         🍎
en_GB   Great Britain  🍎
en_IE   Ireland        🍎
en_IN   India
en_NZ   New Zealand    🍎
en_PH   Philippines
en_SG   Singapore
en_US   USA            🍎
en_ZA   South Africa
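An application could warn at startup when it finds itself running under a locale outside this list. The sketch below hard-codes the ten locales from the table above; the class name SafeEnglishLocales and the warning text are illustrative assumptions, and Set.of requires Java 9 or later.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of any library.
final class SafeEnglishLocales {
    // The ten locales from the safe table above, written as language_COUNTRY.
    static final Set<String> SAFE = Set.of(
            "en_AU", "en_CA", "en_GB", "en_IE", "en_IN",
            "en_NZ", "en_PH", "en_SG", "en_US", "en_ZA");

    // True if the locale's language and country match an entry in the safe list.
    static boolean isSafe(Locale locale) {
        return SAFE.contains(locale.getLanguage() + "_" + locale.getCountry());
    }

    public static void main(String[] args) {
        Locale current = Locale.getDefault();
        if (!isSafe(current)) {
            System.err.println("Warning: default locale " + current
                    + " is not on the safe list");
        }
    }
}
```

A non-English locale such as da_DK would also trigger this warning, so the check is only meaningful for systems that are meant to run in English.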

Looking beyond Unix/POSIX-like systems: Windows recognizes more than 100 English locale identifiers. Windows' list does not invalidate the above safe-list.

For Danes, I suggest using one of the following locales:

  • da_DK (and sometimes accept poor error messages)
  • en_IE, as the Irish are sane enough to use a 24-hour time format, etc.
  • C, which is the fall-back "POSIX system default" locale