I was curious to see how some of these (+ some alternatives) work speed-wise with a rather large file (163MiB, one IP per line, ~ 13 million lines):
wc -l < iplist
13144256
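The origin of the test file isn't important for the comparison. To try the commands on a file of the same shape, something like the sketch below could generate one. The helper name gen_iplist, the fixed seed and the file name iplist.sample are all made up for illustration, and the addresses are pseudo-random rather than real data.

```shell
# Hypothetical helper: write N pseudo-random IPv4 addresses, one per line.
# The seed is fixed so repeated runs produce the same file.
gen_iplist() {
    n=$1 out=$2
    awk -v n="$n" 'BEGIN {
        srand(42)
        for (i = 0; i < n; i++)
            printf "%d.%d.%d.%d\n", int(rand()*256), int(rand()*256), int(rand()*256), int(rand()*256)
    }' > "$out"
}

# Small file for a quick check; raise the count to approach the size used here.
gen_iplist 1000 iplist.sample
wc -l < iplist.sample
```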
Results (with sync; echo 3 > /proc/sys/vm/drop_caches after each command; I repeated the tests - in reverse order - after a couple of hours but the differences were negligible; also note that I am using GNU sed):
steeldriver:
Very slow. Aborted after two minutes of waiting... so no result for this one.
cuonglm:
awk 'FNR!=1{print l}{l=$0};END{ORS="";print l}' ORS=' | ' iplist
real 0m3.672s
perl -pe 's/\n/ | / unless eof' iplist
real 0m12.444s
mikeserv:
paste -d' ' /dev/null iplist /dev/null | paste -sd'|' -
real 0m0.983s
jthill:
sed 'H;1h;$!d;x;s/\n/ | /g' iplist
real 0m4.903s
Avinash Raj:
time python2.7 -c'
import sys
with open(sys.argv[1]) as f:
    print " | ".join(line.strip() for line in f)' iplist
real 0m3.434s
and
val0x00ff:
while read -r ip; do printf '%s | ' "$ip"; done < iplist
real 3m4.321s
which is 184.321s. Unsurprisingly, this is almost 190 times slower than mikeserv's solution (0.983s).
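The conversion and the slowdown factor are easy to check with a little arithmetic. This is only a sketch; to_seconds is a hypothetical helper name, and it assumes the NmS.SSSs format that time prints for "real".

```shell
# Convert a time-style duration such as 3m4.321s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

slow=$(to_seconds 3m4.321s)   # val0x00ff's while-read loop
fast=$(to_seconds 0m0.983s)   # mikeserv's paste pipeline

echo "$slow"                                                      # 184.321
awk -v s="$slow" -v f="$fast" 'BEGIN { printf "%.1f\n", s / f }'  # 187.5
```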
Here are some other ways with
awk:
awk '$1=$1' RS= OFS=' | ' iplist
real 0m4.543s
awk '{printf "%s%s",sep,$0,sep=" | "} END {print ""}' iplist
real 0m5.511s
perl:
perl -ple '$\=eof()?"\n":" | "' iplist
real 0m9.646s
xargs:
xargs <iplist printf ' | %s' | cut -c4-
real 0m6.326s
a combination of head+paste+tr+cat:
{ head -n -1 | paste -d' |' - /dev/null /dev/null | tr \\n \ ; cat ; } <iplist
real 0m0.991s
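For anyone who wants to repeat the measurements, the procedure described above (flush and drop the page cache, then time one command) could be wrapped roughly like this. It is a sketch, not the exact script I used: bench and DROP_CACHES are made-up names, bash is assumed for the time keyword, and dropping caches is Linux-specific and needs root, so it is off by default.

```shell
# Hypothetical wrapper: optionally drop the page cache, then time one command.
bench() {
    if [ "${DROP_CACHES:-0}" = 1 ]; then
        sync
        echo 3 > /proc/sys/vm/drop_caches   # requires root
    fi
    # Discard the command's output so terminal speed does not affect the timing.
    time "$@" > /dev/null
}

# Tiny stand-in file so the example runs anywhere; use the real list instead.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.3 > sample.txt
bench awk '$1=$1' RS= OFS=' | ' sample.txt
```

A pipeline can't be passed as separate arguments this way; it would have to be wrapped, e.g. bench sh -c '...'.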
If you have GNU coreutils and if your list of IPs isn't really huge (let's say up to 50000 IPs) you could also do this with pr:
pr -$(wc -l <infile) -tJS' | ' -W1000000 infile >outfile
where
-$(wc -l <infile) # no. of columns (= no. of lines in your file)
-t # omit page headers and trailers
-J # merge lines
-S' | ' # separate columns by STRING
-W1000000 # set page width
e.g. for a 6-lines file:
134.28.128.0
111.245.28.0
109.245.24.0
128.27.88.0
122.245.48.0
103.44.204.0
the command:
pr -$(wc -l <infile) -tJS' | ' -W1000 infile
outputs:
134.28.128.0 | 111.245.28.0 | 109.245.24.0 | 128.27.88.0 | 122.245.48.0 | 103.44.204.0
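As a sanity check, the six-line sample above can also be fed to two of the commands timed earlier to confirm they produce the same line. This is just a sketch: the variable names are arbitrary and it overwrites a file called infile in the current directory.

```shell
# Recreate the six-line sample file from the example above.
printf '%s\n' 134.28.128.0 111.245.28.0 109.245.24.0 \
              128.27.88.0 122.245.48.0 103.44.204.0 > infile

expected='134.28.128.0 | 111.245.28.0 | 109.245.24.0 | 128.27.88.0 | 122.245.48.0 | 103.44.204.0'

# The paragraph-mode awk one-liner and jthill's sed command.
from_awk=$(awk '$1=$1' RS= OFS=' | ' infile)
from_sed=$(sed 'H;1h;$!d;x;s/\n/ | /g' infile)

[ "$from_awk" = "$expected" ] && echo "awk: match"
[ "$from_sed" = "$expected" ] && echo "sed: match"
```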
translate newlines into | pipes? Like <ipfile tr \\n \| >outfile ? – mikeserv Apr 01 '15 at 17:26
| required? – cuonglm Apr 01 '15 at 17:28
< . So <mydoc tr \\n \| >mydoc2 . But that won't get you the spaces. For those, probably the quickest solution is paste -d' | ' mydoc /dev/null /dev/null >mydoc2 – mikeserv Apr 01 '15 at 17:55
paste won't work in this case. – cuonglm Apr 01 '15 at 18:15
paste writes lines corresponding from each file. Without -s , you will get back number of lines you have in file. – cuonglm Apr 01 '15 at 18:27
column ip | tr -s '\t' '|' produces 134.27.128.0|111.245.48.0|109.21.244.0 . – Ivan Chau Apr 02 '15 at 07:50
while read -r ip; do printf '%s | ' "$ip"; done < file – Valentin Bajrami Apr 02 '15 at 10:43