I have a large "myfiles" directory full of miscellaneous documents and do not want to modify its structure.

I therefore created several other directories, one for each class of documents. For example, I have an "images" directory that contains a symlink to each .jpg or .cr2 file in the "myfiles" directory, plus a descriptive file for each symlink (with the same base name) holding a description and other metadata. A symlink in the "images" directory might have a different name from the file it links to.

I am trying to find the simplest way to make sure that every image file in the "myfiles" directory has a symlink in the "images" directory.

Here is an example of the folder structure:

/myfiles/a.doc
/myfiles/b.jpg
/myfiles/c.cr2
/myfiles/d.mov

should result in:

/images/b_800x600.jpg
/images/b_800x600.desc
/images/c_3820x5640.cr2
/images/c_3820x5640.desc
Paulo Tomé
  • Not a full answer, but you could save the output of find /myfiles -type f to a file, then use find /images -type l -exec readlink {} \; | egrep myfiles to list the files that are already symlinked from /images. Iterate over that list, using sed to delete each path from the first file; whatever is left is the set of files that don't have a symlink. – Bratchley Mar 27 '15 at 14:54
  • Is it possible for you to use hardlinks instead, so you can use the hardlink counter to see if you have links? With ls -l, the hardlink count is in the second column. – Lambert Mar 27 '15 at 16:05
  • Do what @Lambert says. rm those symlink dirs and use, for example, pax -rwl -s "\|.*regex|modifes_filename|" /path/to/myfiles/*.jpg /path/to/jpg_dir to get hardlinks with programmatically altered filenames for only those files that match your jpgs. You can get a lot more than that out of it, like batching based on change times, etc. – mikeserv Mar 28 '15 at 01:17

2 Answers

If I understood the question correctly, you need to find the files in myfiles that do not have symlinks in images:

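#!/bin/bash

# Split the find output on newlines only, so names with spaces survive.
OIFS="$IFS"
IFS=$'\n'

# The parentheses make -type f apply to both name patterns.
files="$(find myfiles/ -type f \( -name '*.jpg' -or -name '*.cr2' \))"
for f in $files; do
    # With -L, -samefile also matches symlinks whose target is "$f";
    # -xtype l keeps only the entries that are themselves symlinks.
    list="$(find -L images/ -xtype l -samefile "$f")"
    if [[ "$list" == "" ]]; then
        echo "$f does not have symlink."
    fi
done

IFS="$OIFS"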

There is a caveat with this approach: if you have a file a.jpg in the directory myfiles/1 and a symlink to that file in the directory images/3, or directly in images/, the file will not be reported as missing a symlink.
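
A variant that copes with spaces and other unusual characters in file names can be sketched as follows. This is only a sketch: report_unlinked is a hypothetical helper name, the two directories are passed as arguments, and it assumes GNU find, since -xtype, -samefile and -quit are not POSIX.

```shell
# report_unlinked SRC DST  (hypothetical helper, not part of the script above)
# Prints a line for every .jpg/.cr2 file under SRC that no symlink under DST
# resolves to. Assumes GNU find.
report_unlinked() {
    # find hands the file names straight to the inner shell as arguments,
    # so no word splitting or globbing is applied to them.
    DST=$2 find "$1" -type f \( -name '*.jpg' -o -name '*.cr2' \) -exec sh -c '
        for f do
            # -quit stops at the first matching symlink; one is enough.
            if [ -z "$(find -L "$DST" -xtype l -samefile "$f" -print -quit)" ]; then
                printf "%s does not have a symlink.\n" "$f"
            fi
        done' sh {} +
}
```

Calling report_unlinked myfiles images prints one line for each image file that no symlink under images resolves to.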

taliezin

I assume that the files under myfiles are not symbolic links, and that none of the file names contain newlines. (My approach can still work if these assumptions are violated but it gets more complicated.) I also assume that you have the common readlink utility and that it supports -f to canonicalize paths, which is the case on Linux (both GNU and BusyBox), but not on e.g. OSX.

Build a list of files, and sort it for good measure:

find /myfiles -type f -print | sort >all.list

Build a list of symbolic link targets, with absolute paths.

find /images -type l -exec readlink -f {} \; | sort >linked.list

List the files that are not linked:

comm -23 all.list linked.list

If you use a shell that supports process substitution, you can put it all in one command:

comm -23 <(find /myfiles -type f -print | sort) \
         <(find /images -type l -exec readlink -f {} \; | sort)

If the links under /images are absolute, you can use readlink without the -f option; plain readlink is available under *BSD and OSX.
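
Since the question is only about images, the comm approach can be narrowed to the .jpg and .cr2 files and wrapped in a function. The following is a minimal sketch under the same assumptions as above (no newlines in file names, a readlink that supports -f). list_unlinked is a hypothetical name, and temporary files created with mktemp stand in for the process substitution, so the shell does not need to support it.

```shell
# list_unlinked SRC DST  (hypothetical helper, not part of the commands above)
# Prints the .jpg/.cr2 files under SRC that no symlink under DST resolves to.
# The ( ) body runs in a subshell, so the variables stay local to the call.
list_unlinked() (
    # Canonicalize SRC so its paths compare equal to the readlink -f output.
    src=$(readlink -f "$1") || exit 1
    all=$(mktemp) && linked=$(mktemp) || exit 1
    find "$src" -type f \( -name '*.jpg' -o -name '*.cr2' \) -print | sort >"$all"
    find "$2" -type l -exec readlink -f {} \; | sort >"$linked"
    # -23 suppresses lines unique to the second list and lines common to both,
    # leaving only the files that have no link.
    comm -23 "$all" "$linked"
    rm -f "$all" "$linked"
)
```

For example, list_unlinked /myfiles /images prints the canonical path of each image that still needs a symlink, one per line.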

slm
  • I was about to post something like your comm command, but I won't now that I see you already did. However, the redirects from sort make no sense; with them, the process substitutions will not output anything. Also, you could add more directories next to /images if you want to find files in /myfiles that are not symlinked from any of a number of directories besides /images. – tripleee Mar 28 '15 at 18:26
  • See http://stackoverflow.com/questions/7665/how-to-resolve-symbolic-links-in-a-shell-script for alternatives to readlink -f – tripleee Mar 28 '15 at 18:27
  • @tripleee I forgot to take out the redirection when I built the process substitution version, thanks. Yes, you could easily add other directories or make other variations. – Gilles 'SO- stop being evil' Mar 28 '15 at 18:35