
I'm learning shell scripting and I'm having difficulty understanding the main command structures needed to write a script, especially the while, do and done keywords.

I know that when redirecting the standard input of the while instruction from a file, it will stop once the whole file has been read (in this case, the file is read line by line and the current line is placed in a variable used by read):

while read lig fich
do
...
done < fich

Does this mean that the entire current line ends up in fich?

For instance, this script takes a username and a string as arguments and searches for files that are owned by that user and whose names contain the string:

#usage nblign nom-utilisateur chaine
find . -user $1 | grep $2 >temp
while read lig temp
do
echo $lig "nombre de ligne" `wc -l < $lig`$
done < temp
rm temp

Here, my teacher omitted the $ before lig in the while line. Is that a typo? I thought that to get the result of a command or the value of a variable, one must use $.

terdon
  • 242,166

2 Answers


You have a few misconceptions. First of all, the format of the while read ... do construct you are trying to use is:

while read var; do ...; done < file

And not

while read var file; do ...; done < file

Basically, while read var; do ...; done < file will read each line of file and save it as var. Anything between read and do is taken as a variable name. If you give more than one variable, then the line will be split on whitespace (well, on the characters in the $IFS variable, which by default are space, tab (\t) and newline (\n)) and saved into the variables given. As explained in help read:

Reads a single line from the standard input, or from file descriptor FD if the -u option is supplied. The line is split into fields as with word splitting, and the first word is assigned to the first NAME, the second word to the second NAME, and so on, with any leftover words assigned to the last NAME.

So, for example:

$ echo "foo bar baz zab" | while read v1 rest; do echo "v1:$v1, rest:$rest"; done
v1:foo, rest:bar baz zab
$ echo "foo bar baz zab" | while read v1 v2 rest; do echo "v1:$v1, v2:$v2, rest:$rest"; done
v1:foo, v2:bar, rest:baz zab
$ echo "foo bar baz zab" | while read v1 v2 v3 rest; do echo "v1:$v1, v2:$v2, v3:$v3, rest:$rest"; done
v1:foo, v2:bar, v3:baz, rest:zab

As you can see above, the input line is split into as many variables as you give. When there are fewer variable names than "words" in the input, the last variable gets the rest of the line. This is exactly the same when reading from a file.
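To illustrate the "same when reading from a file" point, here is a minimal sketch. The sample data and the variable names (tmpfile, name, rest, seen, eof_status) are made up for the example and are not part of your script. It also shows why the loop stops: read returns a non-zero status once there is no more input.

```shell
#!/bin/sh
# Hypothetical sample data written to a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 'alice 30 Paris France' 'bob 25 Lyon' > "$tmpfile"

# Two names: "name" gets the first word, "rest" gets everything else.
seen=""
while read name rest
do
    echo "name:$name, rest:$rest"
    seen="${seen}[$name|$rest]"
done < "$tmpfile"

# read fails (non-zero status) when it hits end of input;
# that failing status is what ends the while loop above.
eof_status=0
read leftover < /dev/null || eof_status=$?
echo "read at end of input returned a non-zero status: $eof_status"

rm -f "$tmpfile"
```

Because the file is attached with done < file rather than through a pipe, the loop runs in the current shell, so seen is still available after done.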

Then, variables are set using var="foo" and are read using $var. So no, your teacher was right, you don't want the $ when the variable is being defined. Therefore while read var is correct and while read $var is wrong.
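A small sketch of the difference, using made-up variable names (ptr and target are hypothetical, only lig comes from your script). With the $, the shell expands the variable before read ever runs, so read receives the variable's current value as the name to assign to:

```shell
#!/bin/sh
lig="old"

# read takes the bare NAME of the variable to fill.
read lig <<EOF
first line
EOF
echo "lig is now: $lig"      # $lig is used only when reading the value

# With a $, the shell expands $ptr first, so this really runs
# "read target" and assigns to a variable called "target".
ptr="target"
read $ptr <<EOF
second line
EOF
echo "ptr is still: $ptr"
echo "target is: $target"
```

So read $lig would only work by accident, if lig already happened to contain a valid variable name.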

So, a working version of your script, using the same logic, would be:

find . -user $1 | grep $2 >temp
while read lig 
do
    echo $lig "nombre de ligne" `wc -l < $lig`
done < temp
rm temp

Note that I removed the temp from the read and the $ from the end of the echo line. I have no idea why you put that there.

A better version of your script, with your variables correctly quoted, using find itself to select the relevant files instead of filtering its output with grep, and without needless temp files, would be:

find . -user "$1" -name "*$2*" |
## No need for a temp file, just pipe the output directly
## to the while loop
while read lig 
do
    echo "$lig nombre de ligne: $(wc -l < "$lig")"
done 

Finally, a more robust approach which, unlike the above, can deal with arbitrary file names, including those with whitespace or other strange characters (it relies on find's -print0 and the shell's read -d, which are widely supported but not standard):

find . -user "$1" -name "*$2*" -print0 |
## No need for a temp file, just pipe the output directly
## to the while loop
while IFS= read -r -d '' lig 
do
    echo "$lig nombre de ligne: $(wc -l < "$lig")"
done 
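To see what IFS= and -r buy you, here is a rough sketch using a made-up file name (sample, plain and exact are hypothetical names). It sticks to features available in a plain POSIX sh, so it does not cover file names containing newlines; that case is what -print0 and -d '' are for.

```shell
#!/bin/sh
# A hypothetical file name with leading spaces and a backslash.
sample='  my\file.txt'

# Plain read: leading whitespace is stripped and the backslash
# is treated as an escape character and removed.
plain=$(printf '%s\n' "$sample" | { read lig; printf '%s' "$lig"; })

# IFS= read -r: no field splitting or trimming, backslashes kept.
exact=$(printf '%s\n' "$sample" | { IFS= read -r lig; printf '%s' "$lig"; })

printf 'plain: [%s]\n' "$plain"
printf 'exact: [%s]\n' "$exact"
```

With plain read the name comes back as myfile.txt, which is not the file that find reported, so the wc call in the loop would fail on it.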
terdon
  • 242,166
  • the input is split on $IFS into as many variables as there are fields separated by $IFS. $IFS whitespace and $IFS otherwise behave differently in that $IFS whitespace separates fields by contiguous sequence and can never generate null-length fields, but any other $IFS chars account for a field separator per - and contiguous sequences result in null-length vars assignments. Also backslashes are interpreted as escapes by default - even to escape newlines into nothing, for example - and leading/trailing $IFS chars are stripped entirely. – mikeserv Nov 10 '15 at 11:36
  • @blissini what do you mean? How fast? The question was closed while I was writing the answer. Since it was, in my opinion, wrongfully closed (I can understand exactly what the OP wanted and have edited to make it clear), I reopened it. – terdon Nov 10 '15 at 11:38
  • @mikeserv Yes, that's what I said. Well, except for the backslashes, I think the OP is confused enough already without going into that much detail. – terdon Nov 10 '15 at 11:39
  • it isn't very robust at all if the shell you use or the find you use doesn't support the non-standard options you specify. in those cases it will get about as far as ERROR. truly robust without the needless pipe construct is find . ... -exec sh -c 'for f do printf %s:\ "$f nombre de ligne"; wc -l <"$f"; done' ligecho {} + where the null delimiting is handled in the standard way by argument in the kernel. – mikeserv Nov 10 '15 at 11:46
  • @mikeserv true but not remotely relevant to a question about how while works. And the only non-standard option is -print0 which is supported by the vast majority of find implementations including GNU, BSD and even busybox. The only one I could find that doesn't seem to support it is AIX and I very much doubt the OP is using AIX. – terdon Nov 10 '15 at 12:02
  • i could name a few more that dont support it if you like, but theres also the read -d "" which is not a standard option. its the other end of the non-standard -print0. so you compound them, and wind up with a slower and less reliable method than the standard one. and anyway, -print0 and read -d dont have anything to do with while either. and your answer doesnt address while. while works by repeating the same compound command until it returns other than 0. read returns other than 0 when it reads an EOF. thats how it works. thats why i piped up in the first place, really. – mikeserv Nov 10 '15 at 12:10

From help read: "Read a line from the standard input and split it into fields.". In other words, it reads an entire line, breaks it up into fields, and stores those fields in the variables you named. Note that if the line has fewer fields than there are variables, the remaining variables are set to empty strings. You don't use the $ there because you do not want to pass read the current value of the variable, but rather the name of the variable, so that it knows which variable to set.
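A quick way to check what happens to the extra variables, as a sketch with made-up names (a, b, c are hypothetical). In bash and in POSIX sh, names that receive no field are assigned empty strings, which overwrites whatever they held before:

```shell
#!/bin/sh
a="old-a"; b="old-b"; c="old-c"

# Only one word on the line, but three names given to read.
read a b c <<EOF
only-one-word
EOF

echo "a=[$a] b=[$b] c=[$c]"
# ${b+set} expands to "set" when b is set, even if it is empty.
echo "b is ${b+set}, c is ${c+set}"
```

The old values of b and c are gone after the read, and both variables still exist with empty contents.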

psusi
  • 17,303
  • Okay, so three variables are stored in temp: nblign nom-utilisateur chaine. If I had done while read $lig then I would not have read lig but I would have passed a value, which is not what I want. but I use echo $lig because here I want to print the value in lig which is a number of line given by the forced execution wc -l < $lig. Am I right? I don't understand done < temp then. – Revolucion for Monica Nov 10 '15 at 09:53
  • @Marine1, yes... the < temp simply redirects stdin to read from a file named "temp". – psusi Nov 10 '15 at 23:12