Bash regular expression fails to compare correctly for $'\x01' when the End-Of-String char $ is used. All other byte values (seem to) compare correctly.
Using GNU bash 4.1.5(1). Is this a bug, or is there another way to represent bytes in hex notation, other than $'\...'? ...But it doesn't seem to be the notation, because even a literal char to literal char comparison fails.
This 'fail' only happens when the $'\x01' immediately precedes the End-Of-String $.
Here are some examples:
echo 'non \x01 with ^ and $'
[[ 3 =~ ^$'\x33'$ ]]; echo $? # 0
[[ 3 =~ ^$'\063'$ ]]; echo $? # 0
[[ $'\x12' =~ ^$'\x12'$ ]]; echo $? # 0
[[ $'\002' =~ ^$'\x02'$ ]]; echo $? # 0
echo '\x01 with no ^ or $'
[[ $'\x01' =~ $'\x01' ]]; echo $? # 0
[[ $'\x01' =~ $'\001' ]]; echo $? # 0
[[ =~ $'\001' ]]; echo $? # 0 nb. Literal char does not render
[[ =~ ]]; echo $? # 0 nb. Literal char does not render
echo '\x01 with ^ only'
[[ $'\x01' =~ ^$'\x01' ]]; echo $? # 0
[[ $'\x01' =~ ^$'\001' ]]; echo $? # 0
[[ =~ ^$'\001' ]]; echo $? # 0 nb. Literal char does not render
[[ =~ ^ ]]; echo $? # 0 nb. Literal char does not render
echo '\x01 with ^ and $'
[[ $'\x01' =~ ^$'\x01'$ ]]; echo $? # 1
[[ $'\x01' =~ ^$'\001'$ ]]; echo $? # 1
[[ =~ ^$'\001'$ ]]; echo $? # 1 nb. Literal char does not render
[[ =~ ^$ ]]; echo $? # 1 nb. Literal char does not render
echo '\x01 with $ only'
[[ $'\x01' =~ $'\x01'$ ]]; echo $? # 1
[[ $'\x01' =~ $'\001'$ ]]; echo $? # 1
[[ =~ $'\001'$ ]]; echo $? # 1 nb. Literal char does not render
[[ =~ $ ]]; echo $? # 1 nb. Literal char does not render
echo '\x01 with $ only, but not adjacent to \x01'
[[ $'\x01'c =~ $'\x01'c$ ]]; echo $? # 0
[[ $'\x01'c =~ $'\001'c$ ]]; echo $? # 0
[[ c =~ $'\001'c$ ]]; echo $? # 0 nb. Literal char does not render
[[ c =~ c$ ]]; echo $? # 0 nb. Literal char does not render
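The anchored cases above can be condensed into a small loop, which makes it easier to compare bash versions. This is only a sketch: the function name anchored_match and the choice of bytes are illustrative assumptions, not part of the original tests. It builds each byte with printf -v (another way to produce a byte from an escape, besides $'\...') and quotes the pattern so the byte is matched literally.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name is an assumption): succeeds if the single byte
# with the given 3-digit octal code matches itself when anchored with ^ and $.
anchored_match() {
    local byte
    # printf -v expands the \ooo octal escape in the format string and
    # stores the resulting raw byte in the variable "byte".
    printf -v byte "\\$1"
    # The quoted "$byte" on the right-hand side is matched literally.
    [[ $byte =~ ^"$byte"$ ]]
}

# \001 is the problem byte; \002, \022 (\x12) and \063 ("3") are controls.
for oct in 001 002 022 063; do
    if anchored_match "$oct"; then
        printf '\\%s: match\n' "$oct"
    else
        printf '\\%s: NO match\n' "$oct"
    fi
done
```

On the bash 4.1.5 described here, the \001 line would be expected to report no match while the others match; the result for \001 may differ on other versions.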
cat -v? It apparently uses the literal character ^A, which disappears on StackOverflow. When I replaced the empty places with ^A, I got 0's everywhere (4.3.33(1)). – choroba Apr 08 '15 at 16:35
GNU bash, version 4.3.30(1) – cuonglm Apr 08 '15 at 16:35
man ascii calls \001 start of heading - I wonder if that is relevant? Like - maybe bash is interpreting \001\000 to mean no heading or something? I dunno how bash encodes that stuff, but also there is a very interesting discussion on the gmane austin group lists about how different locales affect sort and collation orders - it's just that I can't pull it up because gmane has been pretty broken lately. – mikeserv Apr 08 '15 at 16:38
\001 should never be a multibyte char) - but it is about the general subject anyway. – mikeserv Apr 08 '15 at 16:43
$'\x02' instead of my old favourite temp char $'\x01' – Peter.O Apr 08 '15 at 17:27