Output of process substitution in bash duplicates ^A characters

Question

I recently wrote a script where I wanted to modify a file with sed before passing it as a parameter to another command:

$ some-command <(sed $'s\x01foo\x01bar\x01g' some-file)

This failed with the error:

sed: -e expression #1, char 8: unknown option to `s'

After some experimentation, I found that bash was duplicating the ^A (\x01) character before calling sed:

$ cat -v <(echo $'\x01')
^A^A

This does not happen with ^B (or other) characters.

$ cat -v <(echo $'\x02')
^B

Where is this behaviour documented? Is it a result of some default setting where ^A is used for obscure functionality?

I'm seeing this in four different versions of bash that I have access to: 4.1.2, 4.2.25 and 4.2.46 (Linux), and 4.3.42 (Cygwin).

I found this answer to a similar question: https://unix.stackexchange.com/questions/195081/bash-regular-expression-comparison-fail-for-hex-byte-x01 Hope it helps. — D'Arcy Nader, Jan 08 '18 at 20:59
That's one of several bugs that involved the 0x1 character in bash (used internally to encode special data). That one was fixed in 4.4. — Stéphane Chazelas, Jan 08 '18 at 21:03
@isaac, OK sorry, I was testing with an earlier build of the development version not 4.4. The point is that it has been reported and fixed already — Stéphane Chazelas, Jan 08 '18 at 21:56
@StéphaneChazelas: Your first comment is the answer I would accept. — Adrian Pronk, Jan 08 '18 at 22:27

score 2 · Answer 1 · answered Jan 08 '18 at 21:27

Yes, the \x01 is duplicated when $'...' is used inside process substitution:

$ cat -v <(echo $'\x01')
^A^A

It happens in every bash build tested from 3.0 onward, but not in 2.05 or earlier, nor in the other shells:

$ ./script
zsh/sh          : ^A
b203sh          : ^A
b204sh          : ^A
b205sh          : ^A
b30sh           : ^A^A
b32sh           : ^A^A
b41sh           : ^A^A
b42sh           : ^A^A
b43sh           : ^A^A
b44sh           : ^A^A
ksh93           : ^A
attsh           : ^A
zsh/ksh         : ^A
zsh             : ^A

That doesn't happen in a pipe:

$ echo $'\x01' | cat -v
^A
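The comparison above can be turned into a small check for whichever bash is at hand. This is only a sketch: the function names are made up for illustration, and it assumes `od` and `tr` are available. It sends one \x01 byte through a process substitution and through a pipe, then prints each result as hex.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send a single \x01 byte through a process
# substitution and print what arrives as hex digits.
bytes_via_procsub() {
    od -An -tx1 <(printf '%s' $'\x01') | tr -d ' \n'
}

# Hypothetical helper: send the same byte through an ordinary pipe.
bytes_via_pipe() {
    printf '%s' $'\x01' | od -An -tx1 | tr -d ' \n'
}

procsub=$(bytes_via_procsub)
pipe=$(bytes_via_pipe)
printf 'process substitution: %s\n' "$procsub"
printf 'pipe:                 %s\n' "$pipe"

if [ "$procsub" = "$pipe" ]; then
    printf '%s\n' 'no duplication seen in this bash'
else
    printf '%s\n' 'this bash duplicates the byte inside process substitution'
fi
```

Going by the results listed above, an affected bash should report 0101 for the process substitution and 01 for the pipe.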

Workaround:

So you may be able to rewrite your code to use a pipe:

$ echo $'\x01' | some-command

Or use a different delimiter character, such as ^B:

$ some-command <(sed $'s\x02foo\x02bar\x02g' some-file)

I ended up just using / as the delimiter and ensuring that it would be escaped correctly in the pattern and replacement strings — Adrian Pronk, Jan 08 '18 at 21:57
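
The approach described in the comment above could look something like the following. It is a sketch, not the commenter's actual code: the helper names `escape_pattern` and `escape_replacement` are hypothetical, it assumes the pattern should be matched as a literal string (basic regular expressions), and it does not handle newlines in either string.

```shell
# Hypothetical helper: escape a literal string for the pattern side of
# s/pattern/replacement/ by backslash-escaping ] [ \ / . * ^ $
escape_pattern() {
    printf '%s' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Hypothetical helper: escape a literal string for the replacement side
# by backslash-escaping \ / and &
escape_replacement() {
    printf '%s' "$1" | sed 's|[\\/&]|\\&|g'
}

pattern='a/b.c'
replacement='x/y&z'
printf '%s\n' 'a/b.c axb.c' |
    sed "s/$(escape_pattern "$pattern")/$(escape_replacement "$replacement")/g"
```

This prints `x/y&z axb.c`: the escaped dot no longer matches the `x` in `axb.c`, and the `/` and `&` in the replacement are taken literally.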

ilkkachu · Accepted Answer · 2018-01-08T22:26:11.407

1

This was reported as a bug last February and again last September (both in 2017). The latter discussion includes a note about a fix in the Bash git tree.

It happens with both ^A/\001 and with DEL/^?/\177, but seems to require $'...' inside the process substitution, so you could work around it by using "$(printf "...")" instead:

Not ok:

$ od -c  <( echo -n  $'\x01_\x7f' ) 
0000000 001 001   _ 001 177
0000005

Ok:

$ od -c  <( echo -n  "$(printf '\x01_\x7f')" )
0000000 001   _ 177
0000003
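
Applied to the sed command from the question, the same workaround might look like this. It is a sketch under stated assumptions: a temporary file stands in for some-file, `cat` stands in for some-command, and the octal escape \001 is used in place of \x01.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: a temporary file plays the role of some-file,
# and cat plays the role of some-command.
input=$(mktemp)
printf 'foo and foo\n' > "$input"

# The sed script is produced by printf inside a command substitution,
# so no $'...' quoting appears inside the process substitution.
# \001 is the octal escape for ^A, used here as the s command delimiter.
output=$(cat <(sed "$(printf 's\001foo\001bar\001g')" "$input"))

printf '%s\n' "$output"
rm -f "$input"
```

The delimiter is still ^A, but it reaches sed through printf rather than through $'...'.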

answered Jan 08 '18 at 22:02 by ilkkachu · edited Jan 08 '18 at 22:26

You may omit the echo, just do od -c <( printf '\x01_\x7f' ). – Jan 08 '18 at 22:38
@isaac, well, the echo is there just for demonstration. The point is that command substitution + printf works as a replacement. – ilkkachu Jan 08 '18 at 22:44
Interesting that DEL also causes this behaviour. I didn't bother to check all the possibilities. It looks like \1 is being used to escape \? internally for some reason and the escapes are leaking out of the abstraction. – Adrian Pronk Jan 08 '18 at 22:57
@AdrianPronk, yep, something like that. There was another one regarding $* some months ago: https://lists.gnu.org/archive/html/bug-bash/2017-11/msg00107.html – ilkkachu Jan 08 '18 at 23:40
