I'm using Kubuntu 14.04.2 LTS. I set the locale environment with the following command:

export LANG=ru_RU.utf8 LANGUAGE=ru_RU.utf8 LC_ALL=ru_RU.utf8

so that locale now reports ru_RU.utf8 for every entry. But when I run date, I get the following:

Чт. мая 14 12:55:36 MSK 2015

While it's normal to say "May 14" in English, we never say "мая 14" in Russian (it would mean "of May 14th" instead of "14th of May"). It should rather be "14 мая" or, at worst, "май, 14".

According to info coreutils 'date invocation', in the C locale the default format string is '+%a %b %e %H:%M:%S %Z %Y', and that appears to be exactly what I get with the ru_RU.utf8 locale. But that same info page says that the format string depends on the LC_TIME locale category. So I conclude that there's a bug somewhere in the system locale database.

Is it indeed a bug, or am I missing something?

Ruslan
  • in German it's Do 14. Mai 12:36:08 CEST 2015 – frostschutz May 14 '15 at 10:37
  • @ruslan, I deleted my answer since, as you pointed out, my guess was wrong. What's worse, I had actually checked in Greek and failed to notice that the order changes there too: LC_ALL=el_GR.utf8 date returns Πεμ 14 Μάι. So yes, it does look like a bug in the Russian localization. – terdon May 14 '15 at 10:54

1 Answer

This isn't a bug in date; it's caused by the definitions in LC_TIME. As per the info page:

Invoking date with no format argument is equivalent to invoking it
with a default format that depends on the LC_TIME locale category.

Now, if you open /usr/share/i18n/locales/ru_RU, under LC_TIME you will see that date_fmt (the date/time format) is defined as:

date_fmt       "<U0025><U0061><U0020><U0025><U0062><U0020><U0025><U0065>/
<U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
<U0025><U005A><U0020><U0025><U0059>"

which translates to %a %b %e %H:%M:%S %Z %Y, that is:
%a - locale's abbreviated weekday name (e.g. Чт)
%b - locale's abbreviated month name (e.g. май)
%e - day of month, space padded; same as %_d (e.g. 14)
etc...
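
To check that translation by hand, the <UXXXX> tokens can be decoded in the shell. This is only a minimal sketch: decode_ucs is a hypothetical helper name, and it assumes every code point is ASCII (which holds for date_fmt here) and a POSIX-style sh with sed and tr available.

```shell
# decode_ucs is a hypothetical helper, not part of glibc or coreutils.
# Assumption: every code point is ASCII (below U+0080), as in date_fmt.
decode_ucs() {
    # Drop the "/" continuation characters and newlines, then turn each
    # <UXXXX> token into a separate 0xXXXX word.
    for cp in $(printf '%s' "$1" | tr -d '/\n' | sed 's/<U\([0-9A-Fa-f]*\)>/0x\1 /g'); do
        # The inner printf converts the hex constant to octal; the outer
        # printf emits the byte named by that octal escape.
        printf "\\$(printf '%o' "$cp")"
    done
    printf '\n'
}

decode_ucs '<U0025><U0061><U0020><U0025><U0062><U0020><U0025><U0065>'
# prints: %a %b %e
```

Running locale date_fmt under the locale in question should show the same string without any decoding.
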
So, if you edit the file and swap the two conversion specifiers %b and %e (i.e. swap <U0062> and <U0065>):

date_fmt       "<U0025><U0061><U0020><U0025><U0065><U0020><U0025><U0062>/
<U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
<U0025><U005A><U0020><U0025><U0059>"

and then run locale-gen, you will get the right date format:

LC_TIME=ru_RU.utf8 date
Чт 14 май 13:27:14 MSK 2015
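
If editing the system locale file is not an option, passing an explicit format string to date gives the same day-before-month order for a single invocation. The sketch below assumes GNU date (for -d) and uses a hypothetical wrapper name, day_first_date; the C locale and a fixed UTC timestamp are used only to keep the output reproducible.

```shell
# day_first_date is a hypothetical wrapper: same fields as the default
# format, but with %e (day of month) placed before %b (month name).
day_first_date() {
    date "$@" +'%a %e %b %H:%M:%S %Z %Y'
}

# Fixed timestamp in UTC under the C locale, run in a subshell so the
# exported LC_ALL does not leak into the calling shell.
# -d is a GNU date option.
(export LC_ALL=C; day_first_date -u -d '2015-05-14 12:55:36')
# prints: Thu 14 May 12:55:36 UTC 2015
```

Called with LC_TIME=ru_RU.utf8 in the environment instead, the same wrapper should print the weekday and month names from the Russian locale, provided that locale is installed.
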

The last revision of the above-mentioned file (as of 2015-05-14) appears to be dated 2013-11-14, so feel free to report a bug to bug-glibc-locales@gnu.org

don_crissti
  • Yes, I explicitly suggested in the question that the bug is in the "system locale database", not in date. And indeed, it works correctly after swapping these chars. It seems strange, though, that you get "май" while on my system it's "мая". – Ruslan May 14 '15 at 11:57
  • You can query that date_fmt with locale date_fmt. See also locale -k LC_TIME – Stéphane Chazelas May 14 '15 at 11:59
  • @StéphaneChazelas thanks, that's much easier than typing the values into hex editor to see what they represent. – Ruslan May 14 '15 at 12:02
  • @Ruslan, see that commitdiff for the fix on abbreviated month names and the history of that file. – Stéphane Chazelas May 14 '15 at 12:14
  • Looks like the date and revision fields of that locale file have not been updated since 2000-06-29 even though there have been many changes since. I've updated your last revision date so it reflects the time it was really revised. – Stéphane Chazelas May 14 '15 at 12:25
  • @StéphaneChazelas - No problem, thanks for the edits and for the heads-up (locale date_fmt). I just went through the links you added... weird that they don't update the date/revision fields when they change locale files... – don_crissti May 14 '15 at 12:35
  • Yes, sounds like an oversight. There are even two revision dates and numbers in there (4.3 1996, 1.0 2000)... – Stéphane Chazelas May 14 '15 at 12:52