224

I have two directories images and images2 with this structure in Linux:

/images/ad  
/images/fe  
/images/foo  

... and 4000 other folders

and the other is like:

/images2/ad  
/images2/fe  
/images2/foo

... and 4000 other folders

Each of these folders contains images, and the directory names under images and images2 are exactly the same, but their contents differ. I want to know how I can copy-merge the images of /images2/ad into /images/ad, the images of /images2/foo into /images/foo, and so on for all 4000 folders.

Nidal
  • 8,956
ssierral
  • 2,343

10 Answers

347

This is a job for rsync. There's no benefit to doing this manually with a shell loop unless you want to move the files rather than copy them.

rsync -a /path/to/source/ /path/to/destination

In your case:

rsync -a /images2/ /images/

(Note trailing slash on images2, otherwise it would copy to /images/images2.)

If images with the same name exist in both directories, the command above will overwrite /images/SOMEPATH/SOMEFILE with /images2/SOMEPATH/SOMEFILE. If you want to replace only older files, add the option -u. If you want to always keep the version in /images, add the option --ignore-existing.
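To see the three collision policies side by side, here is a minimal sketch that builds a throwaway tree under a temporary directory and merges it three ways. The file names and contents are made up for the demonstration, and it assumes rsync is installed.

```shell
#!/bin/sh
# Throwaway demo: every path below is hypothetical and lives in a temp dir.
work=$(mktemp -d)

make_trees() {
    rm -rf "$work/images" "$work/images2"
    mkdir -p "$work/images/ad" "$work/images2/ad"
    echo "from images"  > "$work/images/ad/a.jpg"    # exists in both trees
    echo "from images2" > "$work/images2/ad/a.jpg"
    echo "only in 2"    > "$work/images2/ad/b.jpg"   # exists only in images2
    # Make the images2 copy of a.jpg older than the one in images.
    touch -t 200101010000 "$work/images2/ad/a.jpg"
}

# 1. Plain -a: the images2 version overwrites the one in images.
make_trees
rsync -a "$work/images2/" "$work/images/"
plain=$(cat "$work/images/ad/a.jpg")

# 2. -u: the file in images is newer, so it is left alone.
make_trees
rsync -au "$work/images2/" "$work/images/"
update=$(cat "$work/images/ad/a.jpg")

# 3. --ignore-existing: anything already present in images is kept.
make_trees
rsync -a --ignore-existing "$work/images2/" "$work/images/"
keep=$(cat "$work/images/ad/a.jpg")

echo "plain: $plain / -u: $update / --ignore-existing: $keep"
```

On this tree, only the plain run should end up with the images2 version of a.jpg; b.jpg is copied in all three cases because it does not exist in images.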

If you want to move the files from /images2, with rsync, you can pass the option --remove-source-files. Then rsync copies all the files in turn, and removes each file when it's done. This is a lot slower than moving if the source and destination directories are on the same filesystem.
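One detail worth knowing: --remove-source-files deletes the transferred files but leaves the now-empty source directories behind. A minimal sketch on a throwaway tree, assuming rsync is installed and a find that supports -delete (GNU and BSD find both do); the file names are made up:

```shell
#!/bin/sh
# Throwaway demo tree; all paths are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/images/ad" "$work/images2/ad" "$work/images2/fe"
echo one   > "$work/images/ad/1.jpg"
echo two   > "$work/images2/ad/2.jpg"
echo three > "$work/images2/fe/3.jpg"

# Copy each file into images, deleting it from images2 once it is transferred.
rsync -a --remove-source-files "$work/images2/" "$work/images/"

# rsync removed the files but not the directories; prune the empty ones.
# -delete implies a depth-first walk, so images2 itself goes last.
find "$work/images2" -type d -empty -delete

ls -R "$work/images"
```

If some files were skipped (for example because --ignore-existing was used), they stay in images2 and their directories are not empty, so the find command leaves them in place.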

  • 40
    ..add -P if you'd like to see progress.. – Meetai.com Jun 16 '15 at 06:00
  • 1
    I would add that there's no benefit to using a tricky shell loop here even if you do want to move them instead of copying them—in that case just use rsync, then rm -r /images. – Wildcard Feb 07 '16 at 08:35
  • 4
    @Wildcard, well, that's not quite the same as moving. As Gilles points out, it's a lot slower than moving if they're on the same fs; and moreover it requires a lot more temporary space. – LarsH Jun 29 '16 at 02:11
  • 10
    I'd also like to point out that it's important to include the trailing slashes for each directory. For example, if you simply ran rsync -a images images2, it will just copy images2 into images instead of merging them. – Kyle Challis Oct 18 '17 at 01:28
  • FWIW, the order of the directories in the command here is backward from the user's original question. As written, this answer will copy the contents of /images/ into /images2/, but the user's question had it going the other way around. – s3cur3 Jul 24 '19 at 03:44
  • 1
    @s3cur3 Oh. From 2 to (implicit) 1. That's unintuitive. Thanks, I fixed my answer. – Gilles 'SO- stop being evil' Jul 24 '19 at 06:55
  • 2
    Is there a way to make rsync ASK you if you want to overwrite or not? – Max Coplan Sep 28 '19 at 03:31
  • 3
    @MaxCoplan No. Rsync isn't an interactive tool. – Gilles 'SO- stop being evil' Sep 28 '19 at 06:00
  • Running rsync -a /path/to/source/ /path/to/destination helped me fix a problem where a mv job was interrupted. Thank you! – Peter Bergman May 25 '23 at 19:51
80

The best choice, as already posted, is of course rsync. Nevertheless, unison would also be a great piece of software for this job, though it typically requires a package install. Both are available on several operating systems.

Rsync

rsync synchronizes in one direction from source to destination. Therefore the following statement

rsync -avh --progress Source Destination

syncs everything from Source to Destination. The merged folder resides in Destination.

-a means "archive" and copies everything recursively from source to destination preserving nearly everything.

-v gives more output ("verbose").

-h for human readable.

--progress to show how much work is done.

If you only want to update the destination folder with newer files from the source folder:

rsync -avhu --progress source destination

Unison

unison synchronizes in both directions. Therefore the following statement

unison Source Destination

syncs both directories in both directions and finally source equals destination. It's like doing rsync twice from source to dest and vice versa.

For more advanced usages look at the man pages or the following websites:

  1. https://www.cis.upenn.edu/~bcpierce/unison/
  2. https://rsync.samba.org/
rogerdpack
  • 1,715
debiarch
  • 801
  • 3
    I want to mention that the correct path to the folder should be with the trailing slash at the end rsync -avh --progress source/ destination/ , otherwise source folder will be created in destination folder, at least in my case that was like this. – electroid Oct 23 '16 at 07:12
  • This works great for me (with the trailing slash in folders). Thank you! – Leopoldo Sanczyk Nov 02 '16 at 22:30
10

If the directories are on the same file system, the --link option to cp offers a faster and much more space-efficient way of merging them. It is described in several answers to the related question linked below. (The title of that question doesn't exactly match the user's question, and the answers address the title topic, merging, more than they address the user's actual question.)

Merging folders with mv?

The --link option to cp means no file data is copied. An example of this, where everything in /images2 replaces any older items in /images is:

cp --force --archive --update --link /images2/. /images

After the merge into /images, you can then rm -rf /images2

This solution will fail if, anywhere in the file tree, the merge tries to place a directory onto an existing file or symlink with the same name. For example, it won't merge a directory named /images2/x onto an existing file or symlink named /images/x. If you get such an error, you can manually delete the file or symlink and re-run the command.

The nice thing about --link is that no data is moved to merge the directories.
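A quick way to convince yourself that no data is copied is to check that the merged file and the original are the same inode. The sketch below assumes GNU cp (for the long options) and a shell whose test builtin supports -ef (bash, dash and ksh do). The paths are hypothetical and live under one temporary directory so that both trees are on the same file system.

```shell
#!/bin/sh
# Throwaway demo tree; all paths are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/images/ad" "$work/images2/ad"
echo one > "$work/images/ad/1.jpg"
echo two > "$work/images2/ad/2.jpg"

# Same command as above, pointed at the demo tree.
cp --force --archive --update --link "$work/images2/." "$work/images"

# -ef is true when both names refer to the same file (same device and inode).
linked=no
if [ "$work/images/ad/2.jpg" -ef "$work/images2/ad/2.jpg" ]; then
    linked=yes
fi
echo "hard-linked: $linked"

# Removing the source tree only drops one of the two names for each file;
# the data stays reachable through the name under images.
rm -rf "$work/images2"
cat "$work/images/ad/2.jpg"
```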

  • 1
    In particular it doesn't require lots of temporary space... I just needed to merge two large folders (several TBs each) and that's a much better way! – wazoox Dec 23 '22 at 11:58
8

The answer to this question is dead simple. I have used it many times.

All you need to do is ...

cp -rf source_folder parent_of_dest_folder

Using the specific file structure in the original question, the command would be...

cp -rf /images/* /images2/

The result will be a merged version of source and the destination. This will also process subdirectories recursively.

As pointed out by @ingyhere, the above commands will overwrite existing files in the destination directory. If you wish to keep existing files, then you must add the -n parameter to the cp command. Read more about this in the cp man page at https://man7.org/linux/man-pages/man1/cp.1.html#:~:text=links%20in%20SOURCE-,%2Dn%2C%20%2D%2Dno%2Dclobber,-do%20not%20overwrite

cp -rfn /images/* /images2/
asiby
  • 219
  • Good question. However, in a typical situation, if you are trying to merge a source directory into a destination, it means that the source files are more relevant in the event that some of them exist at the destination.

    That said, if you want to keep existing files, then use the -n option. Read more about it at https://man7.org/linux/man-pages/man1/cp.1.html#:~:text=links%20in%20SOURCE-,%2Dn%2C%20%2D%2Dno%2Dclobber,-do%20not%20overwrite. I will update the answer to reflect that.

    – asiby Aug 25 '22 at 15:37
  • According to the man page you shared, the -f option is ignored when -n is present – Kolay.Ne Mar 26 '24 at 11:06
7
for dir in images2/*; do mv "$dir"/* "images/$(basename "$dir")"; done

Loop over all the contents of images2 using an expanded glob (to avoid the problems with parsing ls) then mv the contents of those items to the matching entry in images. Uses basename to strip the leading images2 from the globbed path.
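A slightly more defensive variant of the same loop, as a sketch: it skips source directories that are empty (where the unexpanded glob would otherwise be handed to mv), creates the target directory if it is missing, and uses mv -n so that a file already present in the target is not overwritten. The function name merge_subdirs and the demo paths are hypothetical. It assumes an mv that supports -n (GNU and BSD both do) and, like the original, it ignores dotfiles.

```shell
#!/bin/sh
# merge_subdirs SRC DST: move the contents of every SRC/<name>/ into DST/<name>/.
# The body runs in a subshell so its variables do not leak into the caller.
merge_subdirs() (
    src=$1 dst=$2
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue                    # glob matched nothing
        name=$(basename "$dir")
        mkdir -p "$dst/$name"
        for f in "$dir"*; do
            [ -e "$f" ] || [ -L "$f" ] || continue   # empty directory
            mv -n -- "$f" "$dst/$name/"              # -n: never overwrite
        done
    done
)

# Demo on a throwaway tree.
work=$(mktemp -d)
mkdir -p "$work/images/ad" "$work/images2/ad" "$work/images2/empty"
echo keep  > "$work/images/ad/a.jpg"
echo clash > "$work/images2/ad/a.jpg"     # same name as a file in images/ad
echo new   > "$work/images2/ad/b c.jpg"   # name with a space
merge_subdirs "$work/images2" "$work/images"
ls -R "$work"
```

Files that were skipped because of a name clash stay behind in the source tree, so whatever is left there afterwards is the list of conflicts to resolve by hand. Calling mv once per file is slower than passing the whole glob, but it keeps the script free of bash-only features.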

  • I know this is really old, but is there a convenient way to modify this to only copy the files that contain a certain string in their name? something like if filename like '*2160*' then mv – Chris Sandvik Jan 15 '20 at 05:38
6

This effectively merges using the cp command without overwriting:

cp -prnv /images/* /images2/

Options:

-p -- preserve ownership, perms and detailed attributes
-r -- recursive
-n -- no overwrite
-v -- verbose (display operation detail for each file)

To see what actually was copied, run it this way:

cp -prnv /images/* /images2/ | grep '\->'  # shows copies

To see the opposite, that is, what didn't copy, use grep -v '\->'.

If you want to do something more complex, like replacing files with differences, use rsync or roll your own by testing the md5sum on both sides then copying based on the diff exit code (not recommended for large files or big directories -- use rsync).
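As a sketch of the do-it-yourself route, the function below lists files that exist in both trees but differ in content. It uses cmp rather than md5sum, since cmp can stop at the first differing byte, and it only reports; it copies nothing. The function name list_conflicts and the demo paths are hypothetical.

```shell
#!/bin/sh
# list_conflicts SRC DST: print paths (relative to SRC) of regular files that
# exist in both trees with different contents.
list_conflicts() (
    dst=$(cd "$2" && pwd) || exit 1     # make DST absolute before changing dir
    cd "$1" || exit 1
    find . -type f -exec sh -c '
        dst=$1; shift
        for f do
            [ -f "$dst/$f" ] && ! cmp -s "$f" "$dst/$f" && printf "%s\n" "${f#./}"
        done
    ' sh "$dst" {} +
)

# Demo on a throwaway tree.
work=$(mktemp -d)
mkdir -p "$work/images/ad" "$work/images2/ad"
echo same > "$work/images/ad/a.jpg"; echo same > "$work/images2/ad/a.jpg"
echo old  > "$work/images/ad/b.jpg"; echo new  > "$work/images2/ad/b.jpg"
echo only > "$work/images2/ad/c.jpg"

conflicts=$(list_conflicts "$work/images" "$work/images2")
echo "$conflicts"
```

In the demo, a.jpg is identical on both sides and c.jpg exists on one side only, so b.jpg is the only file that should be reported.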

ingyhere
  • 161
1

@inulinux12 , you can use the following one line for loop from command line:

$ for dir in images2/*; do mv "$dir"/* "${dir/2/}"; done

This will move all of the files from images2 to images in their respective directories. Note: this assumes no files have the same name.

For example:

Before execution:

$ ls -R images*
images:
ad  adfoo  fe
images/ad:
jpg.1  jpg.2
images/adfoo:
jpg.7
images/fe:
jpg.5
images2:
ad  adfoo  fe
images2/ad:
jpg.3
images2/adfoo:
jpg.6
images2/fe:
jpg.4

After execution:

$ ls -R images*
images:
ad  adfoo  fe
images/ad:
jpg.1  jpg.2  jpg.3
images/adfoo:
jpg.6  jpg.7
images/fe:
jpg.4  jpg.5
Rui F Ribeiro
  • 56,709
  • 26
  • 150
  • 232
Simply_Me
  • 1,752
  • 4
    Don't parse the output of ls. http://mywiki.wooledge.org/ParsingLs – Etan Reisner Aug 12 '14 at 23:48
  • @EtanReisner Thank you for the suggestion; seems pretty narrow scenarios though given the information given in the question. – Simply_Me Aug 13 '14 at 00:01
  • 2
    @Simply_Me While true that it will usually be fine, you really don't want it to blow up when you hit a case you weren't counting on. It can cause really bad problems. Not to mention that it is quite often (as in this case) almost trivially replaceable with a simple glob. See my answer as an example of that. – Etan Reisner Aug 13 '14 at 00:04
  • @Simply_Me The scenarios include file names with spaces, which is pretty common for images. – Gilles 'SO- stop being evil' Aug 13 '14 at 00:35
  • @Gilles and @Etan Reisner , thank you for the input, I appreciate it! Updated and tested my answer to use string substitution, and it came to be faster than using basename in this particular situation (no need to call basename for this). Thanks again for constructive comments. – Simply_Me Aug 13 '14 at 01:30
  • This command line won't move images in hidden directories or image file names starting with a dot (period) such as images/.dir or images/ad/.foo.jpg, nor will it handle images in sub-directories such as images/ad/dir/ – Ian D. Allen Dec 29 '20 at 11:23
1

merge_dirs.sh:

#!/bin/bash
# filename: merge_dirs.sh

SOURCE=/images2
TARGET=/images

# stop here if SOURCE is missing, so find never runs in the wrong directory
cd "$SOURCE" || exit 1

# duplicate the directory structure into TARGET, no files
find . -type d -exec mkdir -vp "$TARGET/{}" ';'

# move files from SOURCE to TARGET, skip if existing
find . -type f -exec mv -vn '{}' "$TARGET/{}" ';'

cd "$OLDPWD"

# optional: remove all empty directories from SOURCE
find "$SOURCE" -type d -empty -exec rmdir -vp '{}' '+'

merge_dirs.sh will give you a merged TARGET. Anything left in SOURCE was a duplicate filename. Whether to delete those is a different problem, but you can always make a script like cmp_rm.sh.

cmp_rm.sh:

#!/bin/bash
# WARNING: THIS SCRIPT DELETES THINGS
# filename: cmp_rm.sh
cmp "$1" "$2" && rm -iv "$1"  # -i: prompt before deletion

and run:

find "$SOURCE" -type f -exec ./cmp_rm.sh '{}' "$TARGET/{}" ';'
Pedro
  • 49
1

kdiff3 has an awesome, if slightly frustrating, interactive merge tool that can merge up to three directories into a fourth output directory.

-2

Another simple solution is to make an archive of images/:

~$ (cd images && zip -1 -r ../images.zip .)

then unpack images.zip into images2/

~$ unzip images.zip -d images2/
  • Can you explain why zip is useful here? – Spooler Jul 14 '23 at 20:16
  • @Spooler upon decompressing, zip overlays the paths: "path1/sub1/file1.ext" is unpacked into "outpath/path1/sub1/file1.ext". Essentially the two zip commands are a shortcut for the bash code "for f in $(find images/ -type f); do mv $f images2/${f/images//}; done" – TheEagle Jul 14 '23 at 22:06