
I want to convert every letter of an all-uppercase word to lowercase, except the first letter, so that a word written in UPPERCASE (in Cyrillic) ends up looking like "Uppercase". Words that are not all-uppercase should be left unchanged.

Sorry that the example is in Cyrillic: АБРАЗИЯ should become Абразия.

I issued what I thought was the correct general command:

:%s/\<\u\zs\u*/\L&/g

But it didn't work.

My Linux distribution is Gentoo, and my locale (echo $LANG) is en_US.UTF-8.

I also tried:

 %s/\<[А-Я]\zs\[А-Я][а-я]*...

I don't know how to use this syntax properly, but I guess it might work.

I don't get it: even after turning off ignorecase, a search for [[:upper:]] doesn't work:

:se noic
/[[:upper:]]

I wonder whether it is a locale thing.

sed -n '322p' geod.txt | cut -f 1 -d " " 
АВГИТИТ—
sed -n '322p' geod.txt | cut -f 1 -d " " | xxd
0000000: d090 d092 d093 d098 d0a2 d098 d0a2 e280

All the letters are in the same range of Unicode code points, though.
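As a cross-check outside vim, the bytes from the xxd dump can be decoded by hand. This is only a minimal sketch in Python (not part of the original question), and the variable names are made up for illustration. It shows that the bytes are valid UTF-8 and that every letter falls in the uppercase Cyrillic block U+0410..U+042F:

```python
# Bytes copied from the xxd output above, without the trailing "e2 80",
# which is the start of a three-byte sequence that is cut off in the listing.
raw = bytes.fromhex("d090d092d093d098d0a2d098d0a2")

word = raw.decode("utf-8")
codepoints = [f"U+{ord(ch):04X}" for ch in word]

# Basic uppercase Cyrillic letters А..Я occupy U+0410..U+042F.
all_upper_cyrillic = all(0x0410 <= ord(ch) <= 0x042F for ch in word)

print(word, codepoints, all_upper_cyrillic)
```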

I've rechecked again:

file -bi geod.txt
text/plain; charset=utf-8

So the file is all right as UTF-8 (though "file" could be wrong).

Here's my source file: http://bpaste.net/show/140967/

Xsi
  • For what it's worth: I'm accustomed to vim and would like to use it here, simply because I can see the results immediately. That's why the vim tag is in its correct place. – Xsi Oct 16 '13 at 12:24

2 Answers


EDIT: Since there is some confusion about whether vim or sed should be used, I provide solutions for both:

Vim

The following substitution converts every word to lowercase except for its first letter, which is converted to uppercase. Single-letter words are converted to uppercase.

:%s/\<\(\k\)\(\k*\)\>/\u\1\L\2/g

\k matches keyword characters, as defined by the 'iskeyword' option (by default, alphanumeric characters and _). The widely used \w is equivalent to [A-Za-z0-9_] and will fail on Cyrillic letters.

The \< and \> match the word boundaries, and the parentheses split the match into two groups, the first letter and the rest, which are retrieved using \1 and \2, respectively.
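The same two-group idea can be tried outside vim. The following is a minimal sketch in Python, not the vim command itself; the function name capitalize_words is made up for illustration. It relies on the fact that Python 3's \w is Unicode-aware for str patterns, so it matches Cyrillic letters in much the same way \k does here:

```python
import re

def capitalize_words(text):
    # Group 1 is the first character of a word, group 2 is the rest;
    # \b marks the word boundaries, like \< and \> in vim.
    return re.sub(
        r"\b(\w)(\w*)\b",
        lambda m: m.group(1).upper() + m.group(2).lower(),
        text,
    )

print(capitalize_words("АБРАЗИЯ и АВГИТИТ"))  # prints: Абразия И Авгитит
```

As with the vim substitution, the single-letter word is converted to uppercase.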

For this pattern to work you need to set up vim to use UTF-8.

set encoding=utf-8

Sed

sed 's/\b\([[:alpha:]]\)\([[:alpha:]]*\)\b/\u\1\L\2/g' <inputfile>

\b matches word boundaries in sed; the rest is the same as the vim version. (Tested on GNU sed; the character classes might not be supported in all sed versions.)
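Note that both substitutions above capitalize every word, including words that were not written in uppercase to begin with. If only fully uppercase words should be changed and everything else left alone, the match has to be checked before it is rewritten. A minimal sketch of that check in Python follows; it is an illustration only, not a vim or sed command, and the function name is hypothetical:

```python
import re

def capitalize_shouting_words(text):
    def fix(match):
        word = match.group(0)
        # Only touch words of two or more letters that are entirely uppercase.
        if len(word) > 1 and word.isupper():
            return word[0] + word[1:].lower()
        return word

    # [^\W\d_]+ matches runs of letters only (Unicode-aware for str patterns).
    return re.sub(r"[^\W\d_]+", fix, text)

print(capitalize_shouting_words("АБРАЗИЯ истирание Порода"))
```

Words that are already lowercase or already capitalized pass through unchanged, and single uppercase letters are left alone.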

Marco
  • So it's just the description why mine won't work in principle... Ok. Thx. – Xsi Oct 16 '13 at 12:05
  • Well, it doesn't work for me either: echo "АБРАЗИЯ" | sed ':%s/(\k)(\k*)/\u\1\L\2/' still prints АБРАЗИЯ – Xsi Oct 16 '13 at 12:14
  • For sed: not only the targeted АБВГД .. words but all words in the file became uppercased. – Xsi Oct 17 '13 at 12:39

This can be done with regular expressions, and the existing answer covers that method perfectly well, but there is another approach.

For a single word, just move to the first letter of the word and use:

lgue

To do more than one word, you'll want to use a macro:

qqlguewq

I'll break this down:

  • qq -- start recording a macro called q
  • l (that's a lowercase L) -- move one character to the right
  • gue -- turn every character lowercase (that's the gu) to the end of the current word (e)
  • w -- go to the first character of the next word
  • q -- stop recording the macro

You can call the macro with @q. You can call it nine times with 9@q, or forty-two times with 42@q. With this particular macro it's safe to call it an arbitrary number of times -- so you could use 9999@q.

Another route is a recursive macro:

qqqqqlguew@qq
  • qqq -- starts recording the q macro, then immediately stops recording, effectively blanking that register
  • @q -- calls the q macro, which is blank now, but will not be once you stop recording the macro
  • The rest of it behaves as above

When the macro hits the end of the final word in the document, it will exit (as it will for any error of that kind -- otherwise it would continue forever).

evilsoup
  • What's gu? I know that "l" is a movement. Does "gu" make a mini-loop, repeating l plus an edit from l to e? There is surely some problem. What should I do if I want to invoke my macro a bazillion times, say 99999999999^9999999? – Xsi Oct 16 '13 at 14:33
  • "@q -- calls the q macro, which is blank now, but will not be once you stop recording the macro" Recursion has always been a hard nut for me to crack (ever since I tried to master Knuth's up-arrow notation && Conway's). – Xsi Oct 16 '13 at 14:41
  • qqqqqlguew@qq - nothing happened (except once); it hung my vim for a while, and Ctrl-C saved the day. – Xsi Oct 16 '13 at 14:52