Writing a character N times using the printf command

Question

I found the following command to repeat a character in Linux:

printf 'H%.0s' {1..5000} > H.txt

I want, for example, H to repeat 5000 times. What does %.0s mean here?

With tcsh or zsh, repeat 5000 printf H is easier to understand. With perl: print "H" x 5000 (note that {1..5000} is a zsh operator inspired by perl's 1..5000 one and later copied by ksh93 and bash) - Stéphane Chazelas, Mar 06 '15 at 21:13
Yes, it works, but it uses a lot of resources for larger repeat counts; follow the suggestions by Stéphane Chazelas. - Skaperen, Mar 07 '15 at 13:51

mikeserv · Answer 1 · 2015-03-07T19:00:12.587

That command depends on the shell generating 5000 arguments and passing them to printf, which then ignores them. It may seem pretty quick - and it is, relative to some things - but the shell must still generate all of those strings as arguments (and delimit them) and so on.

Besides the fact that the generated Hs can't be printed until the shell first iterates to 5000, that command also costs in memory all that it takes to store and delimit the numeric string arguments to printf plus the Hs. Just as simply you can do:

printf %05000s|tr \  H

...which generates a string of 5000 spaces - which, at least, are usually only a single byte per and cost nothing to delimit because they are not delimited. A few tests indicate that even for as few as 5000 bytes the cost of the fork and the pipe required for tr is worth it even in this case, and it almost always is when the numbers get higher.
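As a minimal sketch of the same pad-then-translate idea with a variable count, the pipeline can be wrapped in a function. The name repeat_char and its argument order are made up for illustration. The sketch uses a plain %Ns width rather than %0Ns, because POSIX leaves the 0 flag undefined for the s conversion, and it assumes a single-byte character that tr does not treat specially.

```shell
# repeat_char CHAR COUNT  (hypothetical helper, not part of the answer)
# Pads an empty string to COUNT spaces, then turns every space into CHAR.
repeat_char() {
    printf "%${2}s" '' | tr ' ' "$1"
}

line=$(repeat_char H 12)
echo "$line"
```

Redirecting the function's output, as in repeat_char H 5000 > H.txt, would produce the file from the question.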

I ran...

time bash -c 'printf H%.0s {1..5000}' >/dev/null

...and...

time bash -c 'printf %05000s|tr \  H' >/dev/null

...about 5 times each (nothing scientific here - only anecdotal). The brace expansion version averaged a little over .02 seconds in total processing time, while the tr version came in at around .012 seconds on average - and the tr version beat it every time. I can't say I'm surprised - {brace expansion} is a useful interactive shell shorthand, but it is usually rather wasteful where any kind of scripting is concerned. The common form:

for i in {[num]..[num]}; do ...

...when you think about it, is really two for loops - the first is internal and implied in that the shell must loop in some way to generate those iterators before saving them all and iterating them again for your for loop. Such things are usually better done like:

iterator=$start
until [ "$((iterator+=interval))" -gt "$end" ]; do ...

...because you store only a few values, overwrite them as you go, and do the iteration while you generate the iterables.
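A runnable sketch of such a counting loop might look like the following. The values of start, end and interval and the visited variable are illustrative assumptions. Note that this variant tests before incrementing, so the body also sees the start value; the until form above increments first, so its body begins at start plus interval.

```shell
# Illustrative bounds; none of these values come from the answer.
start=1
end=10
interval=3

i=$start
visited=
while [ "$i" -le "$end" ]; do
    visited="$visited $i"     # loop body: record the current value
    i=$((i + interval))       # overwrite the single counter in place
done
echo "visited:$visited"
```

Only one counter is kept in memory at any time, whatever the size of the range.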

Anyway, like the space padding mentioned before, you can also use printf to zeropad an arbitrary number of digits, of course, like:

printf %05000d

I do both without arguments because, for every conversion in printf's format string that has no corresponding argument, the null string is used - which is interpreted as a zero for a numeric conversion or as an empty string for a string conversion.
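A small-scale check of that behavior, using a width of 5 instead of 5000 so the result is easy to read (the variable names are illustrative):

```shell
# No arguments are given, so %d sees a zero and %s sees an empty string;
# the field width alone determines how much output is produced.
zeros=$(printf '%05d')
spaces=$(printf '%5s')

echo "$zeros"
echo "[$spaces]"
```

The brackets in the second echo are only there to make the five spaces visible.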

This is the other (and - in my opinion - more efficient) side of the coin when compared with the command in the question - while it is possible to get nothing from something as you do when you printf %.0 length strings for each argument, so also is it possible to get something from nothing.

Quicker still for large amounts of generated bytes you can use dd like:

printf \\0| dd bs=64k conv=sync

...and w/ regular files dd's seek=[num] argument can be used to greater advantage. You can get 64k newlines rather than nulls if you add ,unblock cbs=1 to the above and from there could inject arbitrary strings per line with paste and /dev/null - but in that case, if it is available to you, you might as well use:

yes 'output string forever'
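Since yes never stops on its own, it needs something downstream to cut it off. One possible sketch, assuming head and tr are available, takes a fixed number of lines and strips the newlines:

```shell
# yes prints "H" followed by a newline forever; head keeps 5 lines,
# and tr removes the newlines so the characters end up on one line.
row=$(yes H | head -n 5 | tr -d '\n')
echo "$row"
```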

Here are some more dd examples anyway:

dd bs=5000 seek=1 if=/dev/null of=./H.txt

...which creates (or truncates) a \0NUL filled file in the current directory named H.txt of size 5000 bytes. dd seeks straight to the offset and NUL-fills all behind it.
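To check that result without touching the current directory, a sketch like this runs the same dd command in a scratch directory (created with mktemp -d, which is assumed to be available) and inspects the file:

```shell
# Work in a throwaway directory; the path is illustrative.
dir=$(mktemp -d)
dd bs=5000 seek=1 if=/dev/null of="$dir/H.txt" 2>/dev/null

size=$(wc -c < "$dir/H.txt")                      # total bytes in the file
nonnul=$(tr -d '\0' < "$dir/H.txt" | wc -c)       # bytes left after dropping NULs
echo "size=$size non-NUL=$nonnul"

rm -r "$dir"
```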

<&1 dd bs=5000 conv=sync,noerror count=1 | tr \\0 H >./H.txt

...which creates a file of same name and size but filled w/ H chars. It takes advantage of dd's spec'd behavior of writing out at least one full null-block in case of a read error when noerror and sync conversions are specified (and - without count= - would likely go on longer than you could want), and intentionally redirects a writeonly file descriptor at dd's stdin.
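Because that trick depends on what dd's inherited standard output happens to be, a plainer variant may be easier to reason about. This is a swapped-in technique, not the answer's method: it reads the NUL bytes from /dev/zero, which is assumed to exist (it does on Linux, but POSIX does not require it).

```shell
# Read one 5000-byte block of NULs from /dev/zero and translate each to H.
hs=$(dd if=/dev/zero bs=5000 count=1 2>/dev/null | tr '\0' H)
echo "${#hs}"
```

Redirecting the pipeline to ./H.txt instead of capturing it would give the same file as above.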

score 9 · Answer 2 · answered Mar 06 '15 at 20:49

The %.0s means to convert the argument as a string, with a precision of zero. According to man 3 printf, the precision value in such a case gives

   [ ... ] the maximum number of characters to be printed from a
   string for s and S conversions.

Hence, when the precision is zero, the string argument is not printed at all. However, the H (which is part of the format string, not of the conversion specification) gets printed as many times as there are arguments, since according to the printf section of man bash

   The format is reused as necessary to consume all of the arguments.
   If the format requires more arguments than are supplied, the extra
   format specifications behave as if a zero value or null string, as
   appropriate, had been supplied.
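Both rules can be seen together in a short sketch (the variable names and sample arguments are illustrative):

```shell
# Precision 0 keeps zero characters of each argument, so only the
# literal H in the format string is printed, once per argument.
hs=$(printf 'H%.0s' a b c)
echo "$hs"

# For contrast, precision 2 keeps at most two characters of each argument.
clipped=$(printf '[%.2s]' alpha be c)
echo "$clipped"
```

In both calls the format is reused three times, once for each of the three arguments.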

score 8 · Answer 3 · edited Feb 27 '23 at 14:32

In this case, %.0s itself prints nothing; what gets printed, once per argument, is the literal text preceding it in the format string, H here. When you use {1..5000}, the shell expands it and the command becomes:

printf 'H%.0s' 1 2 3 4 ... 5000 > H.txt

i.e., the printf command now has 5000 arguments, and for each argument, you will get one H. These don't have to be sequential or numeric:

printf 'H%.0s' a bc fg 12 34

prints HHHHH -- i.e., one H per argument, 5 in this case.

Note, the ellipses in the 1st example above aren't inserted literally, they're there to indicate a sequence or range.
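When the count is held in a variable, bash's brace expansion does not help, because it is performed before variable expansion and {1..$n} is therefore left unexpanded. One possible workaround, assuming the seq utility is installed (it is common on Linux but not required by POSIX), generates the arguments with a command substitution instead:

```shell
# n is an illustrative count; seq prints 1..n, one number per line,
# and the unquoted substitution splits them into separate arguments.
n=7
hs=$(printf 'H%.0s' $(seq 1 "$n"))
echo "$hs"
```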
