For example, we have the content
001
002
004
008
010
in a text file named file. How can I extract the missing numbers 3 5 6 7 9?
My approach is to bound the range of numbers. Initialize two variables, a starting limit and an ending limit, and append the current number to the file name. Then loop indefinitely: exit once the starting number is greater than the ending number; otherwise check whether the file exists, print the number if it does not, and increment the starting number.
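#!/bin/bash
# Usage: script START END PREFIX
# PREFIX is an assumption: the original snippet never sets FileName.
StartNumber=$1
EndNumber=$2
FileName=$3
while true; do
    # Stop once the counter passes the upper limit.
    [ "${StartNumber}" -gt "${EndNumber}" ] && exit 0
    # Report the number when no file named PREFIX_NUMBER exists.
    if [ ! -f "${FileName}_${StartNumber}" ]; then
        echo "${StartNumber}"
    fi
    ((StartNumber+=1))
done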
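The question keeps the numbers as lines inside one file rather than as separate file names. Under that reading, the same bounded loop could be sketched roughly as follows. The function name list_missing and the three-digit zero padding are assumptions taken from the sample data, not part of the original answer, and the limits are expected without leading zeros.

```shell
# list_missing FILE START END  (hypothetical helper)
# Prints every integer in START..END that has no matching line in FILE.
# The body runs in a subshell ( ... ) so its variables stay private.
list_missing() (
    n=$2
    while [ "$n" -le "$3" ]; do
        # Zero-pad to three digits so 4 is compared against the line 004.
        padded=$(printf '%03d' "$n")
        # -x matches whole lines only, -F treats the pattern as a fixed string.
        grep -qxF "$padded" "$1" || echo "$n"
        n=$((n + 1))
    done
)
```

Called as list_missing file 1 10 on the sample file, this should print the missing numbers one per line. It rescans the file for each candidate, which is fine for small ranges but slow for large ones.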
A couple of suggestions based on your comments:
Use find . -type f and loop through the results.
Use echo ${filename} | tr -dc 0-9 to get the numbers only.
$FileName: it varies. The file name contains a timestamp from when the file was saved, so it could be different.
– wsdzbm Mar 12 '16 at 16:13
An awk way:
$ awk 'NR != $1 { for (i = prev + 1; i < $1; i++) {print i} } { prev = $1 }' file
3
5
6
7
9
More clearly:
awk 'NR != $1 {
    for (i = prev + 1; i < $1; i++) {
        print i
    }
}
{
    prev = $1
}' file
For each line, I check whether the line number matches the number on that line. If it does not, I print every number between the previous number (prev) and the current number (exclusive at both ends, hence i = prev + 1).
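The previous-value idea also works without the line-number test, since the loop prints nothing when there is no gap. A minimal sketch, assuming the file is sorted in ascending order with one number per line; the name find_gaps and the optional first argument are hypothetical additions:

```shell
# find_gaps FILE [FIRST]  (hypothetical helper)
# Prints the integers missing from FILE, starting the check at FIRST (default 1).
find_gaps() {
    awk -v first="${2:-1}" '
        BEGIN { prev = first - 1 }      # pretend the value before FIRST was seen
        {
            cur = $1 + 0                # force a numeric value: "008" becomes 8
            for (i = prev + 1; i < cur; i++)
                print i                 # everything strictly between prev and cur
            prev = cur
        }' "$1"
}
```

Starting prev at first - 1 means numbers missing before the first line are reported too, which the NR comparison alone does not guarantee.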
Assuming your example file is used, the following command
join -a 1 -o 1.1 2.1 -e missed <(seq -f '%03g' $(tail -1 <(sort file))) file | grep missed
will produce this output
003 missed
005 missed
006 missed
007 missed
009 missed
If that's what you need, I can provide some explanations.
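One way to read the join pipeline is step by step. The sketch below is an equivalent form that feeds the sequence to join on standard input instead of using process substitution. The function name missing_via_join is hypothetical, and the input file is assumed to be sorted and zero-padded to three digits as in the example.

```shell
# missing_via_join FILE  (hypothetical helper)
missing_via_join() (
    # 1. The largest value is the last line of the sorted file.
    max=$(sort "$1" | tail -n 1)
    # 2. seq -f '%03g' prints the full range 001..max, padded like the file.
    # 3. join -a 1 also keeps lines of the full range that have no partner
    #    in FILE; -o 1.1 2.1 prints the key from each input, and -e missed
    #    fills the empty second column for those unpaired lines.
    # 4. grep keeps only the rows that were filled in.
    seq -f '%03g' "$max" |
        join -a 1 -o 1.1 2.1 -e missed - "$1" |
        grep missed
)
```

Because the upper bound is taken from the file itself, numbers beyond the last line are never reported as missing.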
comm -23 <(printf '%03d\n' {1..10}) file
– iruvar Mar 12 '16 at 18:34
2 rather than 002, 12 rather than 012
– wsdzbm Mar 14 '16 at 12:47
comm -23 <(printf '%03d\n' {1..10}) file | awk '{print +$0}'
– iruvar Mar 14 '16 at 13:07