5

How can I create a data file with one column in which there will be 1000 rows with zero values?

something like:

output:

0
0
0
0
0
.
.
.

zara

5 Answers

11

You might use yes(1) for that (piped into head(1)...):

yes 0 | head -n 1000 > data_file_with_a_thousand_0s.txt

If you need a million zeros, replace the 1000 with 1000000.

PS. In the old days, head -1000 was enough; it is equivalent to today's head -n 1000.
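
A quick sanity check on the result can be sketched as follows. The file name zeros.txt is an assumption chosen for brevity; wc -l counts the lines and sort -u collapses the file to its distinct lines.

```shell
# Write 1000 lines, each containing a single 0 (zeros.txt is an illustrative name).
yes 0 | head -n 1000 > zeros.txt

# Count the lines in the file.
wc -l < zeros.txt

# List the distinct lines; a correct file has exactly one distinct line, "0".
sort -u zeros.txt
```

If sort -u prints anything other than a single 0, the file contains a line that is not a bare zero.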

  • This definitely feels like "the Unix way" - glue some simple elements together into a pipe. It's worth noting though that the bare "-1000" argument is deprecated and the standard form (as listed on the man page you linked) is now "-n 1000". See e.g. http://www.unix.com/man-page/posix/1p/head/ and https://www.gnu.org/software/coreutils/manual/html_node/head-invocation for confirmation of this history. – IMSoP Aug 05 '16 at 15:17
  • Brilliant. Didn't know yes existed. – Tulains Córdova Aug 05 '16 at 18:57
8

Simply,

printf '0\n%.0s' {1..1000}

or using a for loop,

for i in {1..1000}; do echo "0"; done

or using awk,

awk 'BEGIN{for(c=0;c<1000;c++) print "0"}'

As @StéphaneChazelas pointed out, using {1..1000} requires zsh or a recent version of bash, yash or ksh93, and it also means storing the whole range in memory (possibly several times). For large ranges like {1..10000000} it becomes a lot slower than awk or yes 0 | head ... (if it doesn't crash with an out-of-memory error). In other words, it doesn't scale well. A possible workaround is an arithmetic for loop:

for ((i=0; i<10000000; i++)); do echo 0; done

This form (ksh93/zsh/bash) doesn't have the memory issue, but it is still orders of magnitude slower than a dedicated tool or a real programming language.
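
For shells that have neither brace expansion nor the (( )) arithmetic loop, a counter-based while loop is one portable alternative. This is only a sketch: the function name zeros is hypothetical, and the loop is still far slower than yes 0 | head for large counts.

```shell
# zeros N: print N lines containing 0, using only POSIX sh features.
# (zeros is a hypothetical helper name, not a standard command.)
zeros() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo 0
        i=$((i + 1))   # POSIX arithmetic expansion, no bash-only syntax
    done
}

zeros 1000 > file.txt
```

Because nothing is expanded up front, memory use stays constant no matter how large the count is.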

Rahul
  • The printf command might fail when 1000 gets replaced with 1000000 because the shell would fail execve(2) with E2BIG – Basile Starynkevitch Aug 05 '16 at 05:45
  • Using {1..1000} requires zsh or recent versions of bash, yash or ksh93 and also means storing the whole range in memory (possibly several times). You'll find it becomes a lot slower (if it doesn't crash for OOM) than using awk or yes|head for large ranges like {1..10000000}. Or in other words it doesn't scale well. – Stéphane Chazelas Aug 05 '16 at 10:39
  • @BasileStarynkevitch, the shells that support {x..y} (zsh, ksh93, bash and yash) all have printf builtin, so the E2BIG doesn't apply. – Stéphane Chazelas Aug 05 '16 at 10:41
  • @StéphaneChazelas I completely agree with you. Is there any other workaround in this case? – Rahul Aug 05 '16 at 10:44
  • for ((i=0; i<=10000000;i++)); do echo 0; done (ksh93/zsh/bash) wouldn't have the memory issue but would still be orders of magnitude slower than a dedicated tool or real programming language approach. – Stéphane Chazelas Aug 05 '16 at 10:58
  • (1) Trivial note: for ((i=0; i<=10000000;i++)) iterates 10,000,001 times.  (2) There are little tricks that might optimize this a little: for ((i=0; i<400; i++)); do echo -e "0\n0\n0\n0\n0"; done > 2K; for ((i=0; i<125; i++)); do echo "2K 2K 2K 2K"; done | xargs cat > million; rm 2K. – G-Man Says 'Reinstate Monica' Aug 05 '16 at 20:20
  • Note that you can drop the 0 from printf '0\n%.0s' {1..1000} to get printf '0\n%.s' {1..1000}, with the same result. – kirkpatt Aug 05 '16 at 21:37
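
To see why the printf form in this answer works: when printf receives more arguments than its format consumes, it reuses the format for the remaining arguments, and %.0s prints each argument truncated to zero characters. A small sketch, with the letters a, b and c as arbitrary placeholder arguments instead of a brace expansion:

```shell
# The format '0\n%.0s' consumes one argument per pass and prints only "0\n",
# so three arguments yield three lines, each containing 0.
printf '0\n%.0s' a b c
```

The values of the arguments never appear in the output; only their count matters, which is why {1..1000} yields 1000 lines.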
6
perl -e 'print "0\n" x 1000' > file.txt


As @Stéphane Chazelas notes, this is fast for large numbers but can run into memory issues (use the yes | head approach in that case).

Performance comparison, best of 3 consecutive runs:

$ time perl -e 'print "0\n" x 100000000' > /dev/null
real    0m0.117s

$ time python -c 'import sys; sys.stdout.write("0\n" * 100000000)' > /dev/null
real    0m0.184s

$ time yes 0 | head -n 100000000 > /dev/null
real    0m0.979s

$ time awk 'BEGIN{for(c=0;c<100000000;c++) print "0"}' > /dev/null
real    0m12.933s

$ time seq 0 0 0 | head -n 100000000 > /dev/null
real    0m19.040s
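
Before comparing timings, it can be worth confirming that the candidates produce identical output. A minimal sketch, using a small count and two of the commands above; the file names a.txt and b.txt and the variable n are assumptions made for illustration:

```shell
n=1000

# Generate the same data two ways (a.txt and b.txt are illustrative names).
yes 0 | head -n "$n" > a.txt
awk -v n="$n" 'BEGIN { for (c = 0; c < n; c++) print "0" }' > b.txt

# cmp is silent and exits 0 when the files are byte-for-byte identical.
cmp a.txt b.txt && echo "outputs match"
```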
Sundeep
2
python2 -c 'print "0\n" * 1000' > file.txt
heemayl
  • That will print an extra blank line; you would have to put a trailing comma (,) at the end of the print statement to prevent this. – Rahul Aug 05 '16 at 09:56
  • This only works in the old python2; in Python 3, print is no longer a statement. – Anthon Aug 05 '16 at 10:17
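
Following up on the two comments above, a Python 3 variant can be sketched with sys.stdout.write, the same call used in the timing comparison elsewhere in this thread. It assumes a python3 executable is on the PATH, and zeros_py3 is a hypothetical helper name:

```shell
# zeros_py3 N: print N lines containing 0 using Python 3.
# sys.stdout.write adds no trailing newline, so there is no extra blank line.
# (zeros_py3 is a hypothetical helper name; python3 on PATH is an assumption.)
zeros_py3() {
    python3 -c 'import sys; sys.stdout.write("0\n" * int(sys.argv[1]))' "$1"
}

zeros_py3 1000 > file.txt
```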
0

seq could be used:

seq 0 0 0 | head -n 1000
phk
aph