
According to the tr(1) manual, -C means:

Complement the set of characters in string1, that is ``-C ab'' includes every character except for `a' and `b'.

..and -c means:

Same as -C but complement the set of values in string1.

Now if I use -c in the above command, it works as I expect:

$ echo $(dd if=/dev/urandom count=1 2>/dev/null | tr -dc 'A-Za-z0-9')
BAP0EctPYxpGgJmWYclqHj2eBWfZvVJs7nL6Y6YQiguGoZgziCceLe3TcyeV4uUi1R1yPW98s8LgiC8iNS1F60tEE2nXAHNi6L6IVS3CXBn94oPLGppxAgp
$ 

..but -C doesn't:

$ echo $(dd if=/dev/urandom count=1 2>/dev/null | tr -dC 'A-Za-z0-9')
���hA����W���t�W��eu�C���W��o��A��xz�����M��p���x��2q����10O���������������p�R���t��I���c�8Z��Rq�9�L�Z��u����ot�n�T��n�nI��3i�yj�CuK��v�Ny�0�������i1�W�Lo�do�����TckL����i�rn��Wc��T���3����X��Z�M�e���I��J��I���A�5Y�����h���K���������ai������S����aZ�G���oab8��������4�g���G��g��0����I���H2�XGo���1�7���Ls�9H��7�b���Sf���E��Tv����mE�����3���l���S�88z��nl�p�f����w�E���Y�q�p���B�
$ 

How should I understand the difference between a set of characters and a set of values?

Martin
  • What version of whose tr are you using? In GNU coreutils v8.20 -c and -C are literally synonyms and both of your examples behave as example 1 does. – msw May 06 '13 at 11:32

1 Answer


In a POSIX locale, characters can take values 0 to 127.

tr -dc 'A-Za-z0-9'

Would take the complement of that set over all byte values, 0 to 255. While

tr -dC 'A-Za-z0-9'

Would take the complement of that set over the set of valid characters (so values 0 to 127).

So the first one would be like:

tr -d '\0-\57\72-\100\133-\140\173-\377'

While the second would be like:

tr -d '\0-\57\72-\100\133-\140\173-\177'
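
To check the two explicit sets above without relying on how a particular tr treats -c and -C (as the comment notes, GNU coreutils treats them as synonyms), here is a rough sketch that feeds every byte value through each command and counts what survives. The helper names all_bytes and count_bytes are made up for this illustration, and LC_ALL=C is an assumption so that tr works on single bytes.

```shell
#!/bin/sh
# Emit every byte value from 0 to 255 exactly once.
all_bytes() {
    i=0
    while [ "$i" -le 255 ]; do
        # The inner printf builds a three-digit octal escape such as \101.
        printf "\\$(printf '%03o' "$i")"
        i=$((i + 1))
    done
}

# Count bytes on stdin, stripping any padding that wc may add.
count_bytes() {
    wc -c | tr -d ' '
}

# Explicit form of the "values" complement: the last range runs up to \377,
# so only the 62 bytes A-Z, a-z and 0-9 survive.
all_bytes | LC_ALL=C tr -d '\0-\57\72-\100\133-\140\173-\377' | count_bytes   # 62

# Explicit form of the "characters" complement: the last range stops at \177,
# so the 128 bytes above ASCII pass through as well (62 + 128).
all_bytes | LC_ALL=C tr -d '\0-\57\72-\100\133-\140\173-\177' | count_bytes   # 190
```

The difference of 128 surviving bytes is exactly the range \200-\377, which would account for the unprintable output in the question's second example.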