I tried to use rgrep to search Cyrillic files in UTF-8 encoding (M-x rgrep with the search query бонус). The resulting command was:

/usr/bin/find . -type d ( -path */SCCS -o -path */RCS -o -path */CVS -o -path */MCVS -o -path */.src -o -path */.svn -o -path */.git -o -path */.hg -o -path */.bzr -o -path */_MTN -o -path */_darcs -o -path */{arch} ) -prune -o ! -type d ( -name .#* -o -name *.o -o -name *\~ -o -name *.bin -o -name *.lbin -o -name *.so -o -name *.a -o -name *.ln -o -name *.blg -o -name *.bbl -o -name *.elc -o -name *.lof -o -name *.glo -o -name *.idx -o -name *.lot -o -name *.fmt -o -name *.tfm -o -name *.class -o -name *.fas -o -name *.lib -o -name *.mem -o -name *.x86f -o -name *.sparcf -o -name *.dfsl -o -name *.pfsl -o -name *.d64fsl -o -name *.p64fsl -o -name *.lx64fsl -o -name *.lx32fsl -o -name *.dx64fsl -o -name *.dx32fsl -o -name *.fx64fsl -o -name *.fx32fsl -o -name *.sx64fsl -o -name *.sx32fsl -o -name *.wx64fsl -o -name *.wx32fsl -o -name *.fasl -o -name *.ufsl -o -name *.fsl -o -name *.dxl -o -name *.lo -o -name *.la -o -name *.gmo -o -name *.mo -o -name *.toc -o -name *.aux -o -name *.cp -o -name *.fn -o -name *.ky -o -name *.pg -o -name *.tp -o -name *.vr -o -name *.cps -o -name *.fns -o -name *.kys -o -name *.pgs -o -name *.tps -o -name *.vrs -o -name *.pyc -o -name *.pyo ) -prune -o -type f ( -iname *.org ) -exec grep --color -i -nH -e \б\о\н\у\с {} +

As you can see, the search term was escaped with backslashes (\б\о\н\у\с), and in spite of the -i option to grep the search was case-sensitive: lines with бонус were found, but lines with Бонус or БОНУС were not. Of course I could use a regexp in the search query, but I'd like to avoid that for convenience.

How can I make such a search truly case-insensitive?

I use GNU Emacs 25.3.1 (x86_64-unknown-cygwin) of 2017-09-12 (emacs-w32.exe from the Cygwin installation) on Windows 7. BTW, I get the same case-sensitive results when I run grep -i бонус *.org in a shell inside Emacs, while in the Cygwin shell the result is correct.

UPD: Running the locale command in the Emacs shell gives the following:

sh-4.4$ locale
LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

It seems my locale settings are broken...

zeliboba
  • While the question is useful and I do not know the answer, this is not an Emacs question. Do you see the same behaviour when running your grep directly on the command line? Maybe you can try the Cygwin command line. Cygwin, Windows, or grep experts would be better placed to answer it. – Jeeves Aug 14 '18 at 13:20
  • @Jeeves, I believe it is an Emacs question: Emacs quotes the query. – zeliboba Aug 14 '18 at 14:30
  • @zeliboba It could be an Emacs question. If you tell us whether it works fine in the Cygwin shell, we would know. – Jeeves Aug 14 '18 at 17:53

1 Answer

The C locale is a backwards-compatible special case for legacy 8-bit systems and looks suspicious here; try setting a proper locale like en_US.UTF-8. (I don't particularly like or recommend the US part, but if that fixes it, we can figure out a better one for you. Maybe try ru_RU.UTF-8 too.)
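
One way to check whether the locale is the culprit is to run the same grep under two locales and compare the match counts. The following is only a sketch: the sample file and the locale name en_US.UTF-8 are assumptions for illustration, and the UTF-8 run falls back to C-locale behaviour if that locale is not installed (check with `locale -a`).

```shell
# Reproduction sketch: compare grep -i under the plain C locale and a UTF-8 locale.
# The sample file and the locale name en_US.UTF-8 are assumptions, not from the question.
tmpdir=$(mktemp -d)
printf '%s\n' 'бонус' 'Бонус' 'БОНУС' > "$tmpdir/sample.org"

# Plain C locale: bytes above 127 have no case mapping,
# so only the line with the exact same bytes matches.
c_count=$(LC_ALL=C grep -c -i -e 'бонус' "$tmpdir/sample.org")

# UTF-8 locale: if it is installed, grep can fold Cyrillic case
# and should match more lines than the C-locale run.
utf8_count=$(LC_ALL=en_US.UTF-8 grep -c -i -e 'бонус' "$tmpdir/sample.org")

echo "C locale: $c_count matching lines, UTF-8 locale: $utf8_count matching lines"
rm -rf "$tmpdir"
```

If both counts come out the same, the UTF-8 locale is probably not available to the process, which would match the symptom described in the question.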

tripleee
  • A brief explanation of my locale settings: the system locale in Windows is set to English (UK), and the locale environment variables `LC_*` for Cygwin were not set system-wide; they are set in .bash_profile. Emacs was started from a shortcut with the command line `C:\cygwin\bin\run.exe /usr/bin/emacs`, so it did not pick up these variables and the locale for Emacs was not correct. Changing the command line to `C:\cygwin\bin\run.exe bash -l -c /usr/bin/emacs` solved the problem. – zeliboba Aug 15 '18 at 08:28
  • One may need to add the `--wait` option to `run.exe` in the command above, but this is purely Cygwin stuff. – zeliboba Aug 15 '18 at 08:53
  • I also encountered a similar locale issue on macOS; for reference, it is discussed here: https://emacs.stackexchange.com/questions/10822/locale-when-launching-emacs-app-on-os-x – zeliboba May 12 '19 at 10:14
  • The broader question of how to properly set your locale for GUI applications in your OS of choice (or, in the case of Windows, probably unfortunate necessity more than preference) is not Emacs-specific. – tripleee May 12 '19 at 11:04
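
As an alternative to starting Emacs through a login shell, the locale can be passed explicitly to the process being launched. This is a minimal sketch, not something from the thread: `run_with_utf8_locale` is a hypothetical helper, and en_US.UTF-8 is an assumed locale name that should be replaced with one listed by `locale -a`.

```shell
# Hypothetical helper: run a command with an explicit UTF-8 locale in its environment.
# The locale name en_US.UTF-8 is an assumption; use one that `locale -a` reports.
run_with_utf8_locale() {
    LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 "$@"
}

# Example: print the locale variables that the child process actually sees.
run_with_utf8_locale sh -c 'echo "LANG=$LANG LC_ALL=$LC_ALL"'
```

Under these assumptions, `run_with_utf8_locale /usr/bin/emacs` would start Emacs with the same variables, so that grep processes spawned by rgrep inherit them.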