On a GNU system, to substitute one character (other than newline) at random, you could do:
file=myfile.txt
offset=$(grep -bo . < "$file" | cut -d: -f1 | shuf -n1)
[ -z "$offset" ] || # file doesn't have non-newline characters
printf c | dd bs=1 seek="$offset" of="$file" conv=notrunc status=none
(with old versions of GNU dd (prior to 8.20), replace status=none with 2> /dev/null).
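To see the in-place overwrite at work, here is a minimal sketch on a throwaway file. The temporary file and its ASCII-only contents are made up for the illustration, and 2> /dev/null is used instead of status=none so that it also runs with older versions of dd:

```shell
# Hypothetical scratch file; ASCII only, so every character is one byte,
# and it contains no "c" to begin with.
file=$(mktemp)
printf 'ab\nde\n' > "$file"

# Same steps as above: list byte offsets, pick one, overwrite that byte.
offset=$(grep -bo . < "$file" | cut -d: -f1 | shuf -n1)
[ -z "$offset" ] || # file doesn't have non-newline characters
  printf c | dd bs=1 seek="$offset" of="$file" conv=notrunc 2> /dev/null

# One of a, b, d, e is now "c"; both newlines are untouched.
cat -- "$file"
```

The file keeps its size (6 bytes) since dd with conv=notrunc only overwrites the one byte at the chosen offset.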
grep -bo . < "$file" would give you the offset in number of bytes in the file of each non-newline character. For instance, with a file encoded in UTF-8 that contains:
$3
£1
€2
That gives us:
$ grep -bo . < "$file"
0:$
1:3
3:£
5:1
7:€
10:2
With cut -d: -f1, we retain the part before the first colon. Then, we pick one of those offsets at random with shuf -n1.
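The gaps between those offsets follow from the UTF-8 size of each character, which a quick check can confirm. This is only an illustrative sketch: bytelen is a helper name made up here, and it assumes the script itself is saved as UTF-8:

```shell
# bytelen: hypothetical helper printing the number of bytes in its argument.
bytelen() {
  # $(( ... )) strips the padding some wc implementations add.
  echo "$(( $(printf %s "$1" | wc -c) ))"
}

for ch in '$' '£' '€'; do
  printf '%s takes %s byte(s)\n' "$ch" "$(bytelen "$ch")"
done
```

That reports 1, 2 and 3 bytes respectively, which is why £ at offset 3 is followed by 1 at offset 5, and € at offset 7 by 2 at offset 10.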
That assumes the replacement character has the same size as the replaced one. For instance, replacing that £ above (2 bytes) with c (1 byte) would leave the file with c followed by an invalid character (the orphaned second byte of £).
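A hex dump makes the problem visible. The sketch below rebuilds the sample file byte by byte (with octal escapes, so it doesn't depend on the locale; the temporary file is arbitrary) and overwrites the first byte of £ at offset 3:

```shell
file=$(mktemp)
# "$3", "£1", "€2" in UTF-8: £ is \302\243 (c2 a3), € is \342\202\254 (e2 82 ac).
printf '$3\n\302\2431\n\342\202\2542\n' > "$file"

# Overwrite the single byte at offset 3 with "c", as the dd command above does.
printf c | dd bs=1 seek=3 of="$file" conv=notrunc 2> /dev/null

# In the dump, 63 ("c") is followed by a3, the leftover second byte of £.
od -An -tx1 < "$file"
```

The file is still 12 bytes long, but the a3 byte no longer forms a valid UTF-8 sequence on its own.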
To work around that, we can't overwrite the file in-place anymore as we'd need to shift data around.
We'd need something like:
perl -C -0777 -pi -e "substr \$_, $offset, 1, 'c'" -- "$file"
instead. With -C, perl honours the locale for what constitutes a character. -0777 -p turns on the slurp mode where the content of $file is slurped into $_ (see Security implications of running perl -ne '…' * though for security considerations with that construct). -pi gives you in-place editing: $_ is written back to the file after the code is run. Then we call substr to substitute the 1 character at the given offset with c. Note that substr then counts in characters, so for a file with multi-byte characters $offset has to be a character offset rather than the byte offset that grep -b reports.
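As a sketch of that character-wise substitution, the variant below decodes and re-encodes UTF-8 explicitly with utf8::decode and utf8::encode instead of relying on -C and the locale, and passes the offset through the environment rather than interpolating it into the code. Both are choices made for this illustration, not what the command above does; the CHAR_OFFSET variable name and the scratch file are likewise made up. The offset is counted in characters, newlines included, so € is character 6 of the sample even though it starts at byte 7:

```shell
file=$(mktemp)
printf '$3\n\302\2431\n\342\202\2542\n' > "$file"   # $3 / £1 / €2 in UTF-8

# Replace the character at index 6 (the €) with "c", assuming UTF-8 content.
CHAR_OFFSET=6 perl -0777 -pi -e '
  utf8::decode($_);                      # bytes -> characters
  substr($_, $ENV{CHAR_OFFSET}, 1, "c"); # 4-argument substr replaces in place
  utf8::encode($_);                      # characters -> bytes
' -- "$file"

# The three bytes of € (e2 82 ac) have become the single byte 63 ("c").
od -An -tx1 < "$file"
```

The file shrinks from 12 to 10 bytes, which is the data shifting that an in-place dd overwrite cannot do.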
$rand1 or $tot_len variable. Run that with set -x and quote your variables. The sed error mentions char 25 but that expression has fewer than 25 characters. – Stéphane Chazelas Aug 11 '15 at 12:29
$rand1 is too large for sed. – user3891532 Aug 11 '15 at 12:33