You might get unusual errors about Unicode and inability to convert to ASCII. Programs might just crash at random. Those are often simple to fix: all you need is correct locale configuration.
Has this ever happened to you?
Traceback (most recent call last):
  File "aogonek.py", line 1, in <module>
    print(u'\u0105')
UnicodeEncodeError: 'ascii' codec can't encode character '\u0105' in position 0: ordinal not in range(128)

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
[...]
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
All those errors have the same root cause: incorrect locale configuration. To fix them all, you need to generate the missing locales and set them.
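The Python error above can be reproduced without touching any locale settings. Here is a minimal sketch, assuming Python 3: encoding ą with the ASCII codec (which older Python versions end up using for output under the C locale) fails, while UTF-8 handles it fine. The helper name `can_encode` is made up for this illustration.

```python
def can_encode(text, encoding):
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


a_ogonek = "\u0105"  # LATIN SMALL LETTER A WITH OGONEK

# ASCII only covers code points 0-127, so this is the failing case.
print("ascii:", can_encode(a_ogonek, "ascii"))
# UTF-8 can represent every Unicode code point.
print("utf-8:", can_encode(a_ogonek, "utf-8"))
```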
Check currently used locale
The `locale` command (without arguments) should tell you which locales you're currently using. (The list might be shorter on your end.)
$ locale
LANG="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
If any of those is set to `C` or `POSIX`, has a different encoding than `UTF-8` (sometimes spelled `utf8`), is empty (with the exception of `LC_ALL`), or if you see any errors, you need to reconfigure your locale.
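Those checks are easy to automate. The sketch below is one possible way to do it, not an official tool: it takes the text printed by `locale` and flags every variable that matches one of the problem cases listed above. The function name `find_locale_problems` and the reason strings are invented for this example.

```python
def find_locale_problems(locale_output):
    """Return (variable, value, reason) tuples for suspicious settings."""
    problems = []
    for line in locale_output.splitlines():
        if "=" not in line:
            continue  # skip error messages and blank lines
        name, _, value = line.partition("=")
        name = name.strip()
        value = value.strip().strip('"')
        if name == "LC_ALL" and value == "":
            continue  # an empty LC_ALL is fine
        # Drop an optional "@modifier" suffix before looking at the encoding.
        encoding_part = value.split("@", 1)[0].lower().replace("-", "")
        if value == "":
            problems.append((name, value, "empty"))
        elif value in ("C", "POSIX"):
            problems.append((name, value, "ASCII-only locale"))
        elif not encoding_part.endswith("utf8"):
            problems.append((name, value, "not UTF-8"))
    return problems


# Hand-written sample input, not captured from a real system.
sample = 'LANG="en_US.UTF-8"\nLC_CTYPE="C"\nLC_TIME=\nLC_ALL='
for name, value, reason in find_locale_problems(sample):
    print(f"{name}={value!r}: {reason}")
```

To check a real system, pass in the captured output of the `locale` command instead of the sample string.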
Check locale availability and install missing locales
The first thing you need to do is check locale availability. To do this, run `locale -a`. This will produce a list of all installed locales. You can use `grep` to get a more reasonable list.
$ locale -a | grep -i utf
<lists all UTF-8 locales>
$ locale -a | grep -i utf | grep -i en_US
en_US.UTF-8
The best locale to use is the one for your language, with the UTF-8 encoding. The locale will be used by some console apps for output. I'm going to use `en_US.UTF-8` in this guide.
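If you would rather do the filtering in a script, the following sketch roughly mirrors the `grep` pipeline above. It assumes you pass it the text printed by `locale -a`; the function name `utf8_locales` and the sample list are hypothetical.

```python
def utf8_locales(locale_a_output, language=None):
    """Pick UTF-8 locales from `locale -a` output, optionally for one language."""
    matches = []
    for name in locale_a_output.split():
        # Accept both spellings: "UTF-8" and "utf8", in any letter case.
        if "utf8" not in name.lower().replace("-", ""):
            continue
        if language and not name.lower().startswith(language.lower()):
            continue
        matches.append(name)
    return matches


# Hand-written sample input, not captured from a real system.
installed = "C\nC.UTF-8\nPOSIX\nen_US.utf8\npl_PL.UTF-8\nen_US.ISO-8859-1\n"
print(utf8_locales(installed))
print(utf8_locales(installed, "en_US"))
```

An empty result for your language means the locale still has to be generated.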
If you can't see any UTF-8 locales, or no appropriate locale setting for your language of choice, you might need to generate those. The required actions depend on your distro/OS.
- Debian, Ubuntu, and derivatives: install `language-pack-en-base`, run `sudo dpkg-reconfigure locales`
- RHEL, CentOS, Fedora: install `glibc-langpack-en`
- Arch Linux: uncomment relevant entries in `/etc/locale.gen` and run `sudo locale-gen` (see the Arch wiki)
- For other OSes, refer to the documentation.
You need a UTF-8 locale to ensure compatibility with software. Avoid the `C` and `POSIX` locales (they are ASCII-only) and locales with other encodings (almost no one uses those these days).
On some systems, you may be able to configure locale system-wide. Check your system documentation for details. If your system has systemd, run `sudo localectl set-locale LANG=en_US.UTF-8` (substituting the locale you chose).
Configure for a single user
If your environment does not allow system-wide locale configuration (macOS, shared server with generated but unconfigured locales), or if you want to ensure it's always configured independently of system settings, you can configure the locale for your own user account.
To do this, you need to edit the configuration file for your shell. If you're using bash, it's `.bashrc` (or `.bash_profile` on macOS). For zsh users, `.zshrc`. Add this line (or equivalent in your shell):

export LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
That should be enough. Note that those settings don't apply to programs not launched through a shell.
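Because a program only sees the environment it inherited, it can be useful to check from inside the program which variable actually decides the character encoding. The lookup order is `LC_ALL`, then `LC_CTYPE`, then `LANG`, with empty values treated as unset. Below is a minimal sketch of that lookup; the function name `effective_ctype` is made up for this example.

```python
import os


def effective_ctype(environ=None):
    """Return (variable, value) for the setting that decides the encoding."""
    env = os.environ if environ is None else environ
    # The first non-empty variable in this order wins.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return name, value
    # Nothing set: programs fall back to the C locale.
    return None, "C"


variable, value = effective_ctype()
print(f"encoding is decided by {variable}: {value}")
```

Running this from a cron job or a desktop launcher and comparing it with the result in your terminal shows whether the shell configuration reached that program.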
Python/Windows corner: Python 3.7 and later work around this on Unix by assuming UTF-8 when they encounter the C locale. On Windows, Python 3.6 and later use UTF-8 for the interactive console, but not when output is redirected to files or pipes.
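To see what a given Python interpreter decided, you can ask it directly. This is a small diagnostic sketch, assuming Python 3; `encoding_report` is a hypothetical helper name, and the values it prints depend entirely on your system.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings this interpreter is currently using."""
    return {
        # 1 if UTF-8 mode is on (the flag exists since Python 3.7).
        "utf8_mode": getattr(sys.flags, "utf8_mode", 0),
        # Encoding used for print(); may differ when output is redirected.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Encoding Python derives from the locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


for key, value in encoding_report().items():
    print(f"{key}: {value}")
```

Running it once normally and once with output redirected to a file shows whether the two cases differ on your platform.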
This post was brought to you by ą (U+0105 LATIN SMALL LETTER A WITH OGONEK).