In bash coding, line3 is a path taken from xyz/symlinks_paths.txt.

while read -r line3
do
    if [[ $(ls -lh $line3 | grep zzz.exe | grep '[8-9][0-9][0-9][MG]') -ne 0 ]] 
    then 
        echo $line3 >> xyz/size_list.txt
        exit 1
    fi
done < xyz/symlinks_paths.txt

The script throws me the following error. (h.sh is the script name.)

h.sh: line 20: [[: -r--r--r-- 1 syncmgr GX 838M Dec  1 21:55 zzz.txt: syntax error in expression (error token is "r--r-- 1 syncmgr GX 838M Dec  1 21:55 zzz.txt")
apaderno
Pompy
  • Does your line3 by any chance come from having done an ls on a directory? Is the overall purpose to get files that are larger than 800M? If so, find dir -type f -size +800M does that (in dir). No need to parse ls or call grep or loop, or explicitly test variables or anything. – Kusalananda Feb 20 '18 at 06:29
  • BTW, why the exit 1? do you only want to print the name of the first file in xyz/symlinks_paths.txt that's greater than 800MB? – cas Feb 20 '18 at 07:00
  • no @cas, i want to name all the files more than 800M. What should i do for that? Thats what i wonder why am i getting only 1 result. Can you help me on bypassing the loop, i dont have an else condition, so what should i write? – Pompy Feb 20 '18 at 09:22
  • uhhhh....don't exit 1 inside the loop. i.e. you're telling the script to exit after printing the first match. if you don't want it to do that, then don't tell it to. BTW, you don't need an else condition. Another way of looking at that is "the default else condition is to do nothing". – cas Feb 20 '18 at 09:23
  • that means i will remove the exit 1 part and i dont want to write the else part. Just if [ true ] then ... fi ? will if work without else? i think yes. I will experiment. Thanks. :) – Pompy Feb 20 '18 at 09:28
  • yep. i'll edit my answer to show it. – cas Feb 20 '18 at 09:28
  • @Kusalananda using find was my first thought too - but the files to examine are listed in a file called xyz/symlinks_paths.txt. Admittedly, if the only thing that file is used for is this while loop then the OP could just use find as you suggest. Also, I suspect that the list is a list of symlinks, not regular files, so use -type l rather than -type f. – cas Feb 20 '18 at 09:32
  • @cas I was thinking in the lines of bypassing that text file altogether. – Kusalananda Feb 20 '18 at 09:34

2 Answers

The problem here is that you're trying to parse the output of ls. This is always a bad idea. See Why *not* parse `ls`? for an explanation of why that is the case.
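As for the error message itself: inside `[[ ... ]]`, `-ne` evaluates both operands as arithmetic expressions, so the text printed by the `ls | grep` pipeline cannot be compared to 0. A minimal sketch of that behaviour follows. The `listing` value is a made-up sample line, and the string test at the end only shows what the original condition was probably meant to express; it is not a recommendation to keep parsing ls.

```bash
# Hypothetical sample of what the ls | grep pipeline prints for one file.
listing='-r--r--r-- 1 syncmgr GX 838M Dec  1 21:55 zzz.txt'

# -ne treats both operands as arithmetic expressions, so bash rejects
# the text and the test fails (the error message is discarded here).
if [[ $listing -ne 0 ]] 2>/dev/null; then
    numeric_test="passed"
else
    numeric_test="failed"
fi

# What the condition presumably meant: "did grep print anything?"
if [ -n "$listing" ]; then
    string_test="non-empty"
else
    string_test="empty"
fi

echo "numeric test: $numeric_test, string test: $string_test"
```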

If you want the size of a file, then use stat. e.g.

minsize=$(( 800 * 1024 * 1024 ))

# alternatively, if you have `numfmt` from GNU coreutils, delete the line above
# and uncomment the following line:
#minsize=$(echo 800M | numfmt --from=iec)

while read -r line3 ; do
  if [ "$(stat -L --printf '%s' "$line3")" -gt "$minsize" ]; then
    echo "$line3" >>  xyz/size_list.txt
  fi
done < xyz/symlinks_paths.txt

Note: I've used stat's -L (aka --dereference) option above because the input filename implies that the filenames listed in it might be symbolic links. Without -L, stat doesn't follow a symlink; it prints the size of the symlink itself.
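The effect of -L can be checked on a throwaway symlink. This is a small sketch assuming GNU stat; the file names target.txt and link.txt are invented for the demonstration.

```bash
# Scratch directory with a 12-byte file and a symlink pointing at it.
tmpdir=$(mktemp -d)
printf 'hello world\n' > "$tmpdir/target.txt"
ln -s target.txt "$tmpdir/link.txt"

# Without -L: size of the symlink itself (the length of the stored path).
link_size=$(stat --printf '%s' "$tmpdir/link.txt")

# With -L: size of the file the symlink points to.
target_size=$(stat -L --printf '%s' "$tmpdir/link.txt")

echo "symlink: $link_size bytes, target: $target_size bytes"
rm -rf "$tmpdir"
```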


If you want the file size printed to the output file along with the filename, then the while loop would be more like the following:

while read -r line3 ; do
  fsize=$(stat -L --printf '%s' "$line3")

  if [ "$fsize" -gt "$minsize" ]; then
    fsize=$(echo "$fsize" | numfmt --to=iec)
    printf "%s\t%s\n" "$fsize" "$line3" >>  xyz/size_list.txt
  fi
done < xyz/symlinks_paths.txt
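The loops above compare sizes in bytes and use a human-readable form only for display. A quick sketch of the two numfmt conversions involved, assuming numfmt from GNU coreutils is installed; the variable names are only for illustration.

```bash
# 800 MiB expressed in bytes, computed two ways.
minsize_arith=$(( 800 * 1024 * 1024 ))
minsize_numfmt=$(echo 800M | numfmt --from=iec)

# Convert a byte count back into a human-readable size for reporting.
human=$(echo "$minsize_arith" | numfmt --to=iec)

echo "$minsize_arith $minsize_numfmt $human"
```

Both byte values should be identical, which is why either line can be used to set minsize.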
cas
  • i want size of file greater than 800MB. – Pompy Feb 20 '18 at 06:01
  • that's easy. see updated answer. – cas Feb 20 '18 at 06:01
  • no, i want in human readable format, so that it appears in M and G. and i want to filter size like that. thats why i used grep '[8-9][0-9][0-9][MG]' – Pompy Feb 20 '18 at 06:03
  • that's also why what you tried didn't work. You're not even outputting the file size to your size_list.txt so why do you even care how it is compared? If you really want to specify the size in human-readable format, you can use numfmt to convert to a size in bytes. – cas Feb 20 '18 at 06:04
  • you could use du -h ${line3} | cut -f1 to get the human-readable size easily, if there's some particular reason stat is undesirable. – quixotic Feb 20 '18 at 06:11
  • @quixotic what? braces don't protect against spaces in filenames. double-quoting does that. braces are used to clearly distinguish a variable name from other text that the shell would otherwise interpret as being part of the variable name. e.g. echo $foobar is not the same as echo ${foo}bar (the former prints out variable $foobar. the latter prints variable $foo followed by literal text bar). – cas Feb 20 '18 at 06:11
  • @quixotic BTW sh can't easily do comparisons with human-readable formats like that. The shell's comparison operators like -gt, -ge, -lt, -le etc work on numbers only, specifically integers. To do such comparisons, you have to convert the human-readable numbers to integers. – cas Feb 20 '18 at 06:20
  • @cas, word splitting and filename expansion aren't performed inside [[ .. ]], so the double-quotes wouldn't really do anything. Actually, I think [[ x -ne y ]] takes x and y as arithmetic expressions. – ilkkachu Feb 20 '18 at 06:27
  • @quixotic, no, they don't. – ilkkachu Feb 20 '18 at 06:32
  • @quixotic, all that the braces do in parameter expansion is allow you to stick word characters after the variable name: $foobar vs ${foo}bar. And of course open the door for the expansions that manipulate the value. – ilkkachu Feb 20 '18 at 06:36
  • @ilkkachu true, i keep forgetting that about [[. Probably because I only ever use [[ .. ]] when I want to do a regexp match without forking sed or grep. TBH I find the fact that quoting isn't required in them a little bit squick-worthy...it feels "dirty" to not quote my variables (and not in a good way). – cas Feb 20 '18 at 06:36
  • @ilkkachu btw, you're also right about "[[ x -ne y ]] takes x and y as arithmetic expressions" but "Don't parse ls" is better advice than "you're parsing ls wrong". – cas Feb 20 '18 at 06:41

This can be done with find (and xargs), but it won’t win any beauty contests.

Write a script called check_files:

#!/bin/sh
find "$@" -size +800M -print

Then run

xargs -d '\n' < xyz/symlinks_paths.txt ./check_files

Notes:

  • You can move the < xyz/symlinks_paths.txt redirection to the end of the command line, as in xargs -d '\n' ./check_files < xyz/symlinks_paths.txt, or to the beginning, or anywhere else.  Or you can replace it with -a xyz/symlinks_paths.txt.  Any of these means that xargs will read from xyz/symlinks_paths.txt.
  • You can replace ./check_files with an absolute pathname to your check_files script.

-d '\n' means use newline as the delimiter when reading xyz/symlinks_paths.txt.  You can probably leave this off if your filenames don’t contain whitespace (space(s) or tab(s)), quotes (remember that a single quote (') is the same character as an apostrophe) or backslashes, and you’re willing to wager a year’s salary that they never ever will.
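To illustrate the difference -d '\n' makes, here is a small sketch that counts how many arguments xargs hands to the command. It assumes GNU xargs (for -d); the list file and the names in it are invented.

```bash
# A list with two entries, one of which contains spaces.
tmpdir=$(mktemp -d)
printf '%s\n' 'plain.txt' 'name with spaces.txt' > "$tmpdir/list.txt"

# With -d '\n', each line becomes exactly one argument.
with_d=$(xargs -d '\n' sh -c 'echo "$#"' sh < "$tmpdir/list.txt")

# Without it, xargs also splits on the spaces inside the second name.
without_d=$(xargs sh -c 'echo "$#"' sh < "$tmpdir/list.txt")

echo "with -d: $with_d arguments, without -d: $without_d arguments"
rm -rf "$tmpdir"
```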

This reads each line of the file and makes it an argument to the check_files script, which passes them to find as starting-point arguments.  Many people know that you can run find with multiple starting-point arguments; e.g.,

find dir1 dir2 dir3  search-expression

It’s not so well known that those arguments don’t have to be directories; they can be files; e.g.,

find file1 file2 file3  search-expression

(or a mixture of directories and files).  find will simply apply the expression to each file named as a starting-point.
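A scaled-down sketch of this, using two scratch files and a 1000-byte threshold in place of 800M; the file names are made up, and head -c is assumed to be available.

```bash
# Two scratch files: one of 2048 bytes, one of 10 bytes.
tmpdir=$(mktemp -d)
head -c 2048 /dev/zero > "$tmpdir/big.bin"
head -c 10 /dev/zero > "$tmpdir/small.bin"

# Both files are given as starting points; the size test is applied to
# each one, and only the file larger than 1000 bytes is printed.
matches=$(find "$tmpdir/big.bin" "$tmpdir/small.bin" -size +1000c -print)

echo "$matches"
rm -rf "$tmpdir"
```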

So this checks each file whose name is listed in xyz/symlinks_paths.txt to see whether its size is more than 800M, and prints those that are.

If the filenames might refer to symbolic links (as the xyz/symlinks_paths.txt name suggests) and you want to look at the pointed-to files (which you surely do), change find to find -L.
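The difference can be seen on a throwaway symlink. This is a sketch with invented file names and a 1000-byte threshold standing in for 800M.

```bash
# A 2048-byte file and a symlink pointing at it.
tmpdir=$(mktemp -d)
head -c 2048 /dev/zero > "$tmpdir/target.bin"
ln -s target.bin "$tmpdir/link.bin"

# Default behaviour: the symlink itself is examined, and it is tiny.
without_L=$(find "$tmpdir/link.bin" -size +1000c -print)

# With -L: the pointed-to file is examined, so the link is reported.
with_L=$(find -L "$tmpdir/link.bin" -size +1000c -print)

echo "without -L: '$without_L'"
echo "with -L:    '$with_L'"
rm -rf "$tmpdir"
```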

You don’t need to have a separate check_files script; you can do

xargs -d '\n' < xyz/symlinks_paths.txt sh -c 'find "$@" -size +800M -print' sh

Again, change find to find -L if desired.

  • xargs was brain-damaged 40 years ago, when it treated its input like the shell command line (words are separated by spaces, unless quoted or escaped), and it’s not much better now. I couldn’t figure out how to get it to treat each line of input as a separate argument to the program to be run, with multiple arguments per invocation, with other arguments following. -I {} seems to force -L 1 (one line → one invocation), and -I {} -L 999 ignores the -I {} and just appends arguments to the command line.  What am I missing? – G-Man Says 'Reinstate Monica' Feb 20 '18 at 23:09
  • FYI you've written file -L twice when you obviously meant find -L. also, re: what are you missing with xargs? Nothing. That's how -I works. freebsd's xargs has a -J option that doesn't force -L 1 or -n 1. GNU's xargs hasn't copied it ("yet", I hope). – cas Feb 21 '18 at 13:38
  • And the only thing good about using xargs here is that at least it's not a shell while read loop :-) – cas Feb 21 '18 at 13:40
  • @cas: Thanks for catching the typo.  The first time, it was my cat walking over the keyboard, and the second time I copied&pasted the first one.  :-) – G-Man Says 'Reinstate Monica' Feb 21 '18 at 20:20