13

I have a text file containing one file name in each line:

111_c4l5r120.png
123_c4l4r60.png
135_c4l4r180.png
147_c4l3r60.png
15_c4l1r120.png
...

I want to convert it to this form:

111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15
...

using this code:

#!/bin/bash
while IFS='' read -r line || [[  -n "$line"  ]]; do
   echo "$line" >> output.txt   
   echo "$line" | cut -d'_' -f 1 >> output.txt
done < "$1"

but the result is:

111_c4l5r120.png 
111
123_c4l4r60.png 
123
135_c4l4r180.png 
135
147_c4l3r60.png 
147
15_c4l1r120.png 
15
...

How should I change my script to get the desired output?

Stephen Kitt
  • 434,908
Ali
  • 265

3 Answers

23

Don't do this sort of thing in the shell! It is far more complex than necessary, prone to errors, and far, far slower. There are many tools designed for such text manipulation. For example, in sed (here assuming recent GNU or BSD implementations for -E):

$ sed -E 's/([^_]*).*/& \1/' file
111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15

Or, for any sed:

$ sed 's/\([^_]*\).*/& \1/' file
111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15

Perl:

$ perl -pe 's/(.+?)_.*/$& $1/' file
111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15

awk:

$ awk -F_ '{print $0,$1}' file
111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15
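
To check that these one-liners agree with each other, a small sketch along the following lines can be used. The temporary sample file and its three entries are assumptions made for illustration, not part of the question. Prefixing each command with `time` on a much larger file would give a rough speed comparison against the shell loop.

```shell
#!/bin/sh
# Hypothetical sample input, written to a temporary file.
sample=$(mktemp)
printf '%s\n' '111_c4l5r120.png' '123_c4l4r60.png' '15_c4l1r120.png' > "$sample"

# Run the portable sed variant and the awk variant on the same input.
sed_out=$(sed 's/\([^_]*\).*/& \1/' "$sample")
awk_out=$(awk -F_ '{print $0,$1}' "$sample")

# Report whether both tools produced identical text.
if [ "$sed_out" = "$awk_out" ]; then
    echo "sed and awk agree:"
    printf '%s\n' "$sed_out"
else
    echo "outputs differ" >&2
fi

rm -f "$sample"
```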
terdon
  • 242,166
  • 1
    External utilities are not much better, though. – EKons Jun 01 '16 at 12:26
  • 6
    @ΈρικΚωνσταντόπουλος yes they are. Several orders of magnitude faster, actually. The shell is just not very good at this sort of thing. A shell's main job is launching external utilities, after all. Compare the time taken by the OP's approach to that taken by any of the solutions here. Shell loops are very, very slow. If you need more convincing, read this. – terdon Jun 01 '16 at 13:17
  • In terms of portability, nope. In terms of speed, yes. Also, is @StéphaneChazelas your alias? – EKons Jun 01 '16 at 13:21
  • 4
    @ΈρικΚωνσταντόπουλος Θα 'θελα (I wish) :) No, he just happens to have written 2 great answers that were relevant to the two comment threads. As for portability, with the (minor) exception of the perl approach, which will only work on something like ~90% of *nix machines, all three solutions are portable and shell agnostic. Or, OK, you could always make the sed one into `sed 's/\([^_]*\).*/& \1/' file` for extra portability. Point is, you can count on `awk` and `sed` being there more than you can count on pretty much anything else. – terdon Jun 01 '16 at 13:24
17

Unless you have a specific need to use the shell for this, terdon's answer provides better alternatives.

Since you're using bash (as indicated in the script's shebang), you can use the -n option to echo:

echo -n "${line} " >> output.txt
echo "$line" | cut -d'_' -f 1 >> output.txt

Or you can use shell features to process the line without using cut:

echo "${line} ${line%%_*}" >> output.txt

(replacing both echo lines).
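
As a quick illustration of what the `%%` expansion does, here is a minimal sketch. The variable `name` and its value are made up for the example; only `line` comes from the question's data.

```shell
#!/bin/sh
line='111_c4l5r120.png'
# %% removes the longest suffix matching "_*", i.e. everything
# from the first underscore onwards.
echo "${line%%_*}"

# A hypothetical name with two underscores shows how %% differs from %.
name='12_a_b.png'
echo "${name%%_*}"   # longest match removed: cut at the first underscore
echo "${name%_*}"    # shortest match removed: cut at the last underscore
```

For the file names in the question there is only one underscore, so both forms would give the same result; `%%` is the one that matches what `cut -d'_' -f 1` does.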

Alternatively, printf does the trick too; it works in any POSIX shell and is generally better than echo (see Why is printf better than echo? for details):

printf "%s " "${line}" >> output.txt
echo "$line" | cut -d'_' -f 1 >> output.txt

or

printf "%s %s\n" "${line}" "${line%%_*}" >> output.txt

(Strictly speaking, in plain /bin/sh, echo -n isn't portable. Since you're explicitly using bash it's OK here.)
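
Put together, the whole loop might look like the sketch below. It assumes the input arrives on standard input and the result goes to standard output, so the caller decides where to redirect it; `add_prefix` is a hypothetical function name, not something from the question.

```shell
#!/bin/sh
# Sketch: print each input line followed by the part before its first "_".
add_prefix() {
    # The "|| [ -n ... ]" part handles a final line with no trailing newline.
    while IFS='' read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$line" "${line%%_*}"
    done
}

# Example run on two sample names.
printf '%s\n' '111_c4l5r120.png' '15_c4l1r120.png' | add_prefix
```

With a real file this would be called as `add_prefix < input.txt > output.txt`. Redirecting once around the whole loop, rather than appending inside it, means the output file is opened a single time and is not extended again if the script is rerun.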

Stephen Kitt
  • 434,908
2

Here you are:

#!/bin/bash

while IFS='' read -r line || [[  -n "$line"  ]]; do
   echo "$line" "$(echo "$line" | cut -d'_' -f 1)" >> output.txt
#   echo "$line" | cut -d'_' -f 1 >> output.txt
done < "$1"

Output:

$ rm -rf output.txt
$ ./test.sh 1.1; cat output.txt
111_c4l5r120.png 111
123_c4l4r60.png 123
135_c4l4r180.png 135
147_c4l3r60.png 147
15_c4l1r120.png 15
Putnik
  • 886