7

I have thousands of files named 1.txt, 2.txt, and so on. Some of those files are missing. What would be the easiest way to find out which ones are missing?

  • Are those files all in the same directory? And how far will the sequence go? Is the naming convention always the same: 1.txt for the first one, 105.txt for the 105th one? – chaos Oct 14 '15 at 17:39

4 Answers

6
ub=1000 # Replace this with the largest existing file's number.
seq "$ub" | while read -r i; do
    [[ -f "$i.txt" ]] || echo "$i.txt is missing"
done

You can find the proper value for ub by running ls | sort -n or similar and reading off the last entry. This relies on the file names matching the format that seq outputs, notably without leading zeroes.
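If you would rather not read the upper bound off by hand, the same loop can work it out from the file names. The following is only a minimal sketch under a few assumptions: find_missing is a hypothetical helper name (not part of the answer above), the files sit in a single directory, and the names carry no leading zeroes. Files whose base name is not a plain positive integer are ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the gaps in a 1.txt, 2.txt, ... sequence.
# Assumes names have no leading zeroes and all live in one directory
# (first argument, defaulting to the current directory).
find_missing() {
    local dir=${1:-.} ub=0 f n i

    # Find the largest number among names of the form <digits>.txt.
    for f in "$dir"/*.txt; do
        n=${f##*/}      # strip the directory part
        n=${n%.txt}     # strip the extension
        [[ $n =~ ^[1-9][0-9]*$ ]] || continue   # skip non-numeric names
        if (( n > ub )); then
            ub=$n
        fi
    done

    # Report every number from 1 up to that bound that has no file.
    for (( i = 1; i <= ub; i++ )); do
        [[ -f "$dir/$i.txt" ]] || echo "$i.txt is missing"
    done
}
```

Calling find_missing with no argument checks the current directory; find_missing /some/dir checks another one. Note that it can only detect gaps below the largest existing number, so files missing from the end of the sequence go unreported.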

Tom Hunt
  • 10,056
5
$ ls
1.txt  3.txt
$ seq 1 10 | xargs -I {} ls {}.txt >/dev/null
ls: cannot access 2.txt: No such file or directory
ls: cannot access 4.txt: No such file or directory
ls: cannot access 5.txt: No such file or directory
ls: cannot access 6.txt: No such file or directory
ls: cannot access 7.txt: No such file or directory
ls: cannot access 8.txt: No such file or directory
ls: cannot access 9.txt: No such file or directory
ls: cannot access 10.txt: No such file or directory
$
steve
  • 21,892
2

This is the function I will be using

missing () {
  #ub gets the largest sequential number
  ub=$(ls | sort -n | tail -n 1 | xargs basename -s .txt)
  seq "$ub" | while read -r i; do
    [[ -f "$i.txt" ]] || echo "$i.txt is missing"
  done
}
  • 1
    I think that the normal etiquette is that your accepting @TomHunt's answer already says that it is, to your mind, the best answer; and that you should not re-post that answer as your own (at the very least, not without attribution). – LSpice Oct 14 '15 at 22:47
0

Another (bash):

comm -23 <(printf '%d.txt\n' {1..1000} | sort) <(ls *.txt | sort)
iruvar
  • 16,725