Or, why does sort(1) order differently on macOS and Linux?
Zhiming Wang
2020-06-03
Today I noticed something interesting while working with a sorted list of package names: sort(1) orders them differently on macOS and Linux (Ubuntu 20.04). A very simple example, with locale set explicitly:
(macOS) $ LC_ALL=en_US.UTF-8 sort <<<$'python-dev\npython3-dev'
python-dev
python3-dev

(Linux) $ LC_ALL=en_US.UTF-8 sort <<<$'python-dev\npython3-dev'
python3-dev
python-dev
What the hell? Same locale, different order (or technically, collation). This is not even a difference between GNU and BSD userland; coreutils sort on macOS produces the same output as /usr/bin/sort. (Of course, when LC_ALL=C is used, the results are the same, matching the macOS result above, since “-” as 0x2D on the ASCII table comes before “3” as 0x33.) Therefore, the locale itself becomes the prime suspect.
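To make the LC_ALL=C baseline concrete, here is a minimal sketch that should behave the same on both systems. The variable name `sorted` is only for illustration. The input is deliberately fed in the "wrong" order so the output reflects the sort, not the input.

```shell
# In the C locale, sort compares raw bytes, so the result
# does not depend on any locale's collation tables.
sorted=$(printf 'python3-dev\npython-dev\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"

# Print the byte values of the two characters that differ.
# A leading quote makes printf emit the character's numeric value.
printf '%d %d\n' "'-" "'3"
```

The first command should list python-dev before python3-dev, and the second should print 45 51 (that is, 0x2D and 0x33), which is exactly the byte-order argument above.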
macOS
LC_COLLATE for any locale on macOS is very easy to find: just look under /usr/share/locale/