
I have 2000+ files in a folder, but a few files are missing from it.

The file names look like this:

GLDAS_NOAH025SUBP_3H.A2003001.0000.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.0600.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.1200.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003001.1800.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003002.0000.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003002.0600.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003002.1200.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003002.1800.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003003.0000.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003003.0600.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003003.1200.001.2015210044609.pss.grb
GLDAS_NOAH025SUBP_3H.A2003003.1800.001.2015210044609.pss.grb

In A2003001.0000, the 001 after the year is the day of the year and 0000 is the hour.

How can I find out which files are missing from the folder? I found a few answers on Google but could not figure out how to implement them.

Maria
  • You mean, for every day from 001 to xyz there must be 4 files: 0000, 0600, 1200 and 1800? And if not, that name should be printed? Do I understand this correctly? – chaos Jul 29 '15 at 12:54
  • In *nix, we call them 'directories', not folders. Folders is a Windows term. – Rob Jul 29 '15 at 13:01
  • @chaos it is 001 to 365 & yes you got my point. – Maria Jul 29 '15 at 13:33
  • @Rob: meh. Both terms are pretty well understood. Actually, Apple probably came up with the folder metaphor around the time of the first release of MacOS before MS Windows even existed. And since as we all know MacOS is UNIX, that makes folder a UNIX term :-) – Celada Jul 29 '15 at 13:39
  • Folder and directory are very different metaphors, though. And directories in Unix behave very much like directories in real-life and unlike folders in real-life, whereas folders in Windows behave much more like real-life folders than real-life directories, so it makes sense to use the term which more closely resembles the corresponding real-life concept. I have personally seen data loss caused by a user thinking directories behaved like folders because his teachers kept calling them folders instead of directories. – Jörg W Mittag Jul 29 '15 at 16:10
  • @Celada While we all know what she meant, the terminology is incorrect. Open any man page and you will never find anything calling a directory anything but that. Get the terms right or you're just creating confusion for those who don't know the difference. – Rob Jul 29 '15 at 17:51
  • You may want to consider leap years. – MSalters Jul 30 '15 at 08:34
  • @Celada MacOS probably wasn't UNIX when it came up with the idea. – Abdullah Ibn Fulan Oct 13 '21 at 10:11

4 Answers


With zsh or bash 4, you can use brace expansion for that:

ls -d GLDAS_NOAH025SUBP_3H.A2003{001..006}.{0000,0600,1200,1800}.001.2015210044609.pss.grb >/dev/null

Notice the braces:

  • {001..006} expands to 001, 002, ... 006.
  • {0000,0600,1200,1800} appends each of 0000, 0600, 1200 and 1800 to every one of the above.
  • >/dev/null discards the standard output of ls, because we only want standard error.
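
To see what these braces produce before pointing ls at real files, the expansion can be printed on its own. This is a minimal sketch, assuming bash 4 or later (older bash versions do not zero-pad ranges); the shortened A2003 prefix and the variable names are only for illustration:

```shell
#!/bin/bash
# Expand 3 days x 4 hours without touching the file system.
# Zero-padded ranges such as {001..003} need bash 4+ (or zsh).
names=$(printf '%s\n' A2003{001..003}.{0000,0600,1200,1800})
printf '%s\n' "$names"

# Count the generated names; $(( )) strips any padding from wc's output.
count=$(( $(printf '%s\n' "$names" | wc -l) ))
echo "generated $count names"
```

Each day is paired with every hour, so a full-year range {001..365} combined with the four hours expands to 1460 names.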

Now if a file is not present, ls will show an error for it:

ls: cannot access GLDAS_NOAH025SUBP_3H.A2003004.0000.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003004.0600.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003004.1200.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003004.1800.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003005.0000.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003005.0600.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003005.1200.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003005.1800.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003006.0000.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003006.0600.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003006.1200.001.2015210044609.pss.grb: No such file or directory
ls: cannot access GLDAS_NOAH025SUBP_3H.A2003006.1800.001.2015210044609.pss.grb: No such file or directory

With ksh93, replace {001..006} with {1..6%.3d}.

chaos

A variation on @chaos's solution (bash 4.0 or above, or zsh 4.3.11 or above):

for a in GL.....2003{001..365}.{00..18..6}00.001.2015210044609.pss.grb 
do  
  [[ -f $a ]] || echo "$a"
done

or

for a in {001..365}.{00..18..6}
do
  [[ -f "GL.....2003${a}00.001.2015210044609.pss.grb" ]] || echo "$a"
done

to print only the missing day+hour

JJoao

While chaos's answer works well in an interactive shell, this one can be used as a POSIX script, for example if you need to run the check periodically or on other computers.

#!/bin/sh
i=0
while test "$((i+=1))" -lt 366 ; do
    for j in 00 06 12 18 ; do
        file="GLDAS_NOAH025SUBP_3H.A2003$(printf '%03d' "$i").${j}00.001.2015210044609.pss.grb"
        test -e "$file" || echo "$file"
    done
done

(Neither seq nor brace expansion is specified by POSIX.)
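
Following the leap-year remark in the comments, the same POSIX loop can take the year as a parameter and use 365 or 366 days accordingly. This is only a sketch under assumptions: the function names days_in_year and list_missing are made up here, and it assumes every file carries the 2015210044609 suffix shown in the question.

```shell
#!/bin/sh
# Print 366 for Gregorian leap years, 365 otherwise.
days_in_year() {
    if [ $(( $1 % 4 )) -eq 0 ] && { [ $(( $1 % 100 )) -ne 0 ] || [ $(( $1 % 400 )) -eq 0 ]; }; then
        echo 366
    else
        echo 365
    fi
}

# list_missing YEAR DIR: print each expected file name that is absent from DIR.
list_missing() {
    year=$1
    dir=$2
    last=$(days_in_year "$year")
    day=1
    while [ "$day" -le "$last" ]; do
        for hour in 0000 0600 1200 1800; do
            # Day of year is zero-padded to three digits, as in the question.
            file=$(printf 'GLDAS_NOAH025SUBP_3H.A%d%03d.%s.001.2015210044609.pss.grb' "$year" "$day" "$hour")
            [ -e "$dir/$file" ] || printf '%s\n' "$file"
        done
        day=$((day + 1))
    done
}
```

For example, list_missing 2003 . checks the current directory against the 1460 names expected for 2003.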

MichalH

Build the file names in a loop and then test for non-existence of a file:

for day in `seq -f "%03g" 1 30`
  do
  for hour in 0000 0600 1200 1800
    do
    filename="GLDAS_NOAH025SUBP_3H.A2003${day}.${hour}.001.2015210044609.pss.grb"
    if [[ ! -e $filename ]]
    then
      echo "File missing: $filename"
    fi
  done
done

Note: I do not guarantee this example to be error-free. It is an example, not a known-working script.

Portability: needs ksh, bash or zsh and a system with the GNU seq command available.
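
A different technique from the loop above, shown only as a sketch: generate the expected names as a list and let comm report the names that appear in that list but not in the directory listing. The helper names expected_names and missing_names are hypothetical, mktemp is assumed to be available (it is common but not specified by POSIX), and file names are assumed to contain no newlines.

```shell
#!/bin/sh
# expected_names YEAR DAYS: print every expected file name, one per line.
expected_names() {
    awk -v year="$1" -v days="$2" 'BEGIN {
        for (d = 1; d <= days; d++)
            for (h = 0; h < 24; h += 6)
                printf "GLDAS_NOAH025SUBP_3H.A%d%03d.%02d00.001.2015210044609.pss.grb\n", year, d, h
    }'
}

# missing_names YEAR DAYS DIR: print names that are expected but absent from DIR.
missing_names() {
    tmp=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order.
    ls "$3" | LC_ALL=C sort > "$tmp"
    # -23 keeps only the lines unique to the first input (the expected names).
    expected_names "$1" "$2" | LC_ALL=C sort | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}
```

For example, missing_names 2003 365 . lists what is missing from the current directory. Swapping -23 for -13 would instead show files in the directory that are not in the expected list.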

John
  • The concept is the one I converged on as well, but please note that very few Julian months have 30 days. – WAF Jul 29 '15 at 13:20