1

How to find missing directories between two disk structures?

  • Let's say Disk A has the directories A/, B/, C/, D/ (newer copy).
  • Let's say Disk B has the directories A/, B/, D/, E/ (older copy).

I want to find which directories are missing from the newer copy compared to the older copy.

Results: "Dir E".

How can I do this? I don't want any report on files.

This can create only the missing directories from a specific target:

find -type d -exec mkdir -p "/mnt/pics/Albums/{}" \; 

UPDATE: The article suggested for my question was about file contents, not folders. It also did not present a solution as clear as the "diff -rq path1 path2" answer.

5 Answers

11

diff can compare directory contents. Use -r to traverse subdirectories recursively and -q to report only which files differ or are missing, without printing the differences themselves:

diff -rq /path/to/dir1 /path/to/dir2

E.g.:

$ mkdir A B
$ touch A/1 A/2 A/3 B/1 B/3 B/4
$ diff -rq A B
Only in A: 2
Only in B: 4

Note that this also compares the contents of files that exist in both directories, so files that differ are reported as well!
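Since the question asks for directories only, the "Only in" lines can be filtered down to entries that are directories. This is a minimal sketch, not a definitive solution: the function name only_dirs is hypothetical, and it assumes diff prints its usual "Only in DIR: NAME" message (hence LC_ALL=C) and that no path contains a newline or the sequence ": ".

```shell
# only_dirs (hypothetical helper): print directories that exist in only one of the two trees
only_dirs() {
    # diff exits with status 1 when it finds differences; that is expected here
    { LC_ALL=C diff -rq "$1" "$2" || true; } | while IFS= read -r line; do
        case $line in
            "Only in "*": "*)
                rest=${line#Only in }
                parent=${rest%%: *}   # text before the first ": "
                name=${rest#*: }      # text after the first ": "
                # keep the entry only if it is a directory
                if [ -d "$parent/$name" ]; then
                    printf '%s\n' "$parent/$name"
                fi
                ;;
        esac
    done
}
```

Called as only_dirs /path/to/dir1 /path/to/dir2, it prints one path per line and skips both the "Only in" lines for plain files and the "Files ... differ" lines.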

FelixJN
4

A rather comfortable graphical tool is meld. It is available in the repositories of many Linux distributions.

You can simply provide the directory paths you want to compare, as in

meld dir1/ dir2/

It will show a split tree view indicating which files/directories are present in only one of the paths and, for files present in both paths, which of the two has the more recent timestamp.

AdminBee
4

With bash, you can use:

diff <(cd dir1 && find . -type d | sort) <(cd dir2 && find . -type d | sort)

This will recursively list the directories in dir1 and dir2, sort each list (find itself does not guarantee any particular order), and compare the two lists with diff.
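In a shell without process substitution, the same comparison can be sketched with temporary files. The names list_dirs and diff_dirs are hypothetical, and the sketch assumes mktemp is available and that directory names contain no newlines.

```shell
# list_dirs (hypothetical): print every directory below "$1", relative to it, in byte order
list_dirs() {
    ( cd "$1" && find . -type d | LC_ALL=C sort )
}

# diff_dirs (hypothetical): "<" lines exist only under $1, ">" lines only under $2
diff_dirs() {
    list_a=$(mktemp) && list_b=$(mktemp) || return 2
    list_dirs "$1" > "$list_a"
    list_dirs "$2" > "$list_b"
    # diff exits with status 1 when the lists differ; that is expected here
    { diff "$list_a" "$list_b" || true; } | sed -n '/^[<>]/p'
    rm -f "$list_a" "$list_b"
}
```

For the disks in the question, diff_dirs Disk_A Disk_B would print "< ./C" and "> ./E"; the sed filter drops diff's line-number markers such as "4d3".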

1

One tool you can use is comm, which is intended to "compare two sorted files line by line" (man comm). The two files can be the lists of directories.

Example:

# Set up the scenario
mkdir -p Disk_A/{A,B,C,D} Disk_B/{A,B,D,E}

Compare the directory lists (new, old):

LC_ALL=C comm -13 <( cd Disk_A && find -type d | LC_ALL=C sort ) <( cd Disk_B && find -type d | LC_ALL=C sort )

Result

./E

By varying combinations of comm's flags -1, -2, -3 that omit output columns, you can also list directories in A but not B (use -23) or directories that are common (use -12).
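To see the three flag combinations side by side, here is a small sketch that labels each directory by where it exists. The function name dir_report and its labels are hypothetical; it assumes mktemp is available and that names contain no newlines.

```shell
# dir_report (hypothetical): label every directory as only in $1, only in $2, or in both
dir_report() {
    first=$(mktemp) && second=$(mktemp) || return 2
    ( cd "$1" && find . -type d | LC_ALL=C sort ) > "$first"
    ( cd "$2" && find . -type d | LC_ALL=C sort ) > "$second"
    # -23 keeps column 1 (lines unique to the first list)
    LC_ALL=C comm -23 "$first" "$second" | sed 's/^/only in first: /'
    # -13 keeps column 2 (lines unique to the second list)
    LC_ALL=C comm -13 "$first" "$second" | sed 's/^/only in second: /'
    # -12 keeps column 3 (lines common to both lists)
    LC_ALL=C comm -12 "$first" "$second" | sed 's/^/in both: /'
    rm -f "$first" "$second"
}
```

With the scenario above, dir_report Disk_A Disk_B would report ./C as only in the first tree, ./E as only in the second, and the remaining directories (including ".") as common.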

Much like the other answers, this will fail for pathological directory names such as those containing newlines. Use this extended variant if such names are a possibility:

LC_ALL=C comm -z13 <( cd Disk_A && find -type d -print0 | LC_ALL=C sort -z ) <( cd Disk_B && find -type d -print0 | LC_ALL=C sort -z ) | tr '\0' '\n'

If you can guarantee that no name contains byte sequences representing invalid Unicode characters, you can omit all three instances of LC_ALL=C; each one temporarily sets the locale to C (bytes) for the command it precedes.

Chris Davies
0

The following shell script takes two arguments: the old directory and the new. It lists only directories which exist in the first argument and not in the second. It is, however, a little more verbose than the other solutions.

#!/bin/bash
source_dir=$1
target_dir=$2

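# Walk the directories directly under source_dir; NUL-delimited so names with spaces survive
find "$source_dir" -mindepth 1 -maxdepth 1 -type d -print0 |
while IFS= read -r -d '' folder_plus_path; do
    # Strip the leading "source_dir/" to get the path relative to source_dir
    folder=${folder_plus_path#"$source_dir"/}
    # Report the directory if it has no counterpart under target_dir
    if [[ ! -d $target_dir/$folder ]]; then
        echo "$folder"
    fi
done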

As it stands, the script lists only missing directories immediately contained in the first argument. With Disk B as the first argument and Disk A as the second, it would highlight that A/E does not exist, but it would not check whether B/A/Subdir also exists as A/A/Subdir. To check these subdirectories as well, remove -maxdepth 1 from the script.

eff