One problem with simply performing a full copy of files is the possibility of ending up with inconsistent data. Here's an example: if a collection of files, file00001-fileNNNNN, depend on each other, then an inconsistency is introduced when one of the files changes mid-copy:
- copying file00001
- copying file00002
- copying file00003
- file00002 changes
- copying file00004
- etc...
In the above example, since file00002 changes while the rest are still being copied, the dataset as a whole is no longer consistent. This spells disaster for things like MySQL databases, where tables must stay consistent with their indexes, which are stored as separate files.
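To make that concrete, here is a minimal sketch of the race using a naive copy loop (the /data and /backup paths are hypothetical): nothing prevents a writer from modifying an already-copied file while the rest of the set is still in flight.

    # Naive file-by-file copy: files are copied one at a time, so a writer
    # can change file00002 after it has been copied but before
    # file00003-fileNNNNN finish, leaving the destination set inconsistent.
    for f in /data/file*; do
        cp "$f" /backup/
    done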
Usually what you want is to use rsync to perform a full sync or two of the filesystem (minus things you don't want, such as /dev, /proc, /sys, and /tmp). Then temporarily take the system offline (to the end-users, that is) and do one final rsync pass. Since you've already made a very recent sync, this pass is much, much faster, and since the system is offline and therefore accepting no writes, there's no chance of inconsistent data.
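A rough sketch of that workflow, assuming a hypothetical backup target of backuphost:/backups/rootfs/ and that the exclusions above cover everything you want to skip:

    # First pass (or two): the system stays online while the bulk of the
    # data is copied.
    rsync -a --delete \
          --exclude=/dev --exclude=/proc --exclude=/sys --exclude=/tmp \
          / backuphost:/backups/rootfs/

    # Take the system offline to end-users (stop the application, database,
    # etc.), then run one final pass. Only files changed since the previous
    # pass are transferred, so it completes quickly, and with no writers
    # active the copied data is consistent.
    rsync -a --delete \
          --exclude=/dev --exclude=/proc --exclude=/sys --exclude=/tmp \
          / backuphost:/backups/rootfs/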