Compare files and select bigger one

Question

There are two directories with many files. The file names always match between the two directories, but the file sizes do not always match. For example:

/dir1
|-file1 (1 MB)
|-file2 (2 MB)
|-file3 (3 MB)

/dir2
|-file1 (1 KB)
|-file2 (2 MB)
|-file3 (10 MB)

As you can see, the file names match, but the sizes match only for file2. How can I compare the files in these two directories and select only the files that are bigger? In the example, the output must be "/dir2/file3".

If a file in dir1 is bigger than the file with the same name in dir2, then do nothing. I am interested only in files in dir2 that are bigger than the ones in dir1.

I wrote a script, but it works only if a single bigger file is found in dir2.

#!/bin/bash
diff -q $1 $2 | awk '{ print $2,$4 }' > tempfile.txt
A=`cat tempfile.txt | cut -d ' ' -f 1`
B=`ls -s $A | cut -d ' ' -f 1`
C=`cat tempfile.txt | cut -d ' ' -f 2`
D=`ls -s $C | cut -d ' ' -f 1`
if [ "$D" -gt "$B" ]; then
 echo $C
fi

"...but filesize matches only in file2" I'm not sure I understand this. What do you want to do if there is a file in dir1 that is bigger than the file with the same name in dir2? — Niko Gambt, Jan 23 '19 at 03:46
Hello, Niko, thanks for the reply. "If there is a file in dir1 that is bigger than the file with the same name in dir2": then do nothing. I am interested only in files in dir2 that are bigger than the ones in dir1 — Dmitrii Medvedev, Jan 23 '19 at 06:40
Are the files with the same size identical? If yes, you might be able to use diff -q dir1/ dir2/ — Panki, Jan 23 '19 at 07:41
Panki, yes, the names are the same. But diff -q shows all files that differ, while I need to find only the differing files that are bigger (NOT the differing files that are smaller) — Dmitrii Medvedev, Jan 23 '19 at 07:48

Kusalananda · Answer 1 · 2019-01-23T13:58:29.000

#!/usr/bin/env zsh

zmodload -F zsh/stat b:zstat

for file2 in dir2/*(.); do
    file1="dir1/${file2##*/}"

    if [ -f "$file1" ] &&
       [ "$( zstat +size "$file2" )" -gt "$( zstat +size "$file1" )" ]
    then
        printf '%s is bigger than %s\n' "$file2" "$file1"
    fi
done

This is a zsh shell script that uses the built-in zstat command to get the file sizes, so it does not depend on the platform's external stat utility.

The script loops over all regular files with non-hidden names in the dir2 directory. For each file in dir2, it constructs the corresponding pathname for a file in dir1. If the file in dir1 exists and is a regular file (or a symbolic link to a regular file), the sizes of the two files are compared. If the file in dir2 is strictly bigger, a short message is printed.

The pattern dir2/*(.) matches only non-hidden names of regular files in the dir2 directory. The (.) is a zsh-specific glob qualifier that restricts the * pattern to regular files.
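
For readers without zsh, the filtering done by the dir2/*(.) pattern can be approximated in bash with explicit tests. This is only a minimal sketch: the function name list_regular is hypothetical, and it assumes the default bash globbing options (so hidden names are not matched by *).

```shell
#!/bin/bash
# list_regular DIR: print the non-hidden regular files in DIR, one per line.
# Hypothetical helper; it approximates the zsh pattern DIR/*(.)
list_regular() {
    local f
    for f in "$1"/*; do
        # [ -f ] follows symbolic links, so links are excluded explicitly
        # with -L to keep only plain regular files.
        if [ -f "$f" ] && [ ! -L "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling list_regular dir2 would then give the list of pathnames to loop over. If the directory is empty, the unexpanded pattern fails the -f test and nothing is printed.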

The expression "dir1/${file2##*/}" will expand to a pathname starting with dir1/ and then containing the value of $file2 with everything before and including the last / removed. This could be changed to "dir1/$( basename "$file2" )".
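
The two ways of extracting the file name can be compared side by side. A small sketch that also runs in bash, using the example pathname dir2/file3 from the question; the variable names name and name_via_basename are illustrative only.

```shell
#!/bin/bash
file2="dir2/file3"                          # example value from the question

name="${file2##*/}"                         # remove the longest prefix matching */
name_via_basename="$(basename "$file2")"    # same result, via an external command

file1="dir1/$name"
printf '%s\n' "$file1"                      # prints dir1/file3
```

The parameter expansion is done by the shell itself, while basename starts a separate process for every file, which may matter in directories with many files.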

Niko Gambt · Answer 2 · 2019-01-23T13:36:51.777

#!/bin/bash

get_attr() {
    # pass '%f' to $2 to get file name(s) or '%s' to get file size(s)
    find "$1" -maxdepth 1 -type f -printf "$2\n"
}

while read -r file
do
    (( $(get_attr "dir2/$file" '%s') > $(get_attr "dir1/$file" '%s') )) \
        && realpath -e "dir2/$file"
done < <(get_attr dir2 '%f')

This assumes all files in dir2 have the same names as files in dir1, as you describe above.

realpath prints the absolute path of the file.

This script also compares hidden files (files that begin with a .).
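
If a file in dir2 has no counterpart in dir1, the arithmetic comparison above is left with an empty operand. One possible guarded variant is sketched below. It is not the script from this answer: the function name bigger_in_second is hypothetical, it prints the path as given rather than an absolute path, and it reads sizes with POSIX wc -c instead of find -printf.

```shell
#!/bin/bash
# bigger_in_second DIR1 DIR2: print each regular file in DIR2 (hidden ones
# included) that is strictly bigger than the same-named file in DIR1.
# Hypothetical helper name; sizes are byte counts from `wc -c`.
bigger_in_second() {
    local dir1=$1 dir2=$2 file1 file2 size1 size2
    (
        # Subshell keeps the shopt changes from leaking to the caller.
        shopt -s dotglob nullglob   # match dotfiles; empty dir expands to nothing
        for file2 in "$dir2"/*; do
            [ -f "$file2" ] || continue
            file1="$dir1/${file2##*/}"
            [ -f "$file1" ] || continue     # no counterpart in DIR1: skip
            # $(( ... )) strips any padding some wc implementations print
            size1=$(( $(wc -c < "$file1") ))
            size2=$(( $(wc -c < "$file2") ))
            if [ "$size2" -gt "$size1" ]; then
                printf '%s\n' "$file2"
            fi
        done
    )
}
```

With the layout from the question, bigger_in_second /dir1 /dir2 would report only /dir2/file3. Files that exist in only one of the two directories are silently skipped.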
