In light of the answers to two previous questions, it seems that under RHEL/CentOS 7 mv even on the same filesystem is actually doing a cp then rm.

In previous editions of CentOS/RHEL, a mv on the same filesystem (even from a deep directory to a new deep directory) was very fast even on large files (say collections of installation media or large videos).

However, on my personal CentOS server, when watching what mv is actually doing when moving large files, it's taking as long as a cp followed by rm.

This makes me wonder why the behavior has apparently changed from just being a wrapper around rename() (as per the POSIX standard).

Is this correct? And, if so, why did the mv utility change behavior in CentOS 7?

warren
  • Looking at the questions you linked, I see no indication that mv would do cp + rm on any version when using the same filesystem. Nobody seems to claim so. Can you clarify what kind of test case you have for that? – eis Aug 23 '16 at 18:28
  • Have you checked the output of mount? I suspect there are two different filesystems involved. – Wildcard Aug 23 '16 at 19:54
  • @Wildcard it may have to do with going from one md device to another in the LVM, which would make sense if I am crossing a physical device under the hood. – warren Aug 23 '16 at 19:56

1 Answer


The CentOS 7.2 mv command will try to use the rename(2) system call.

For example, if I run strace mv X Y, I see this in the output:

rename("X", "Y")                        = 0

So we can see that mv successfully called rename.
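
A quick way to convince yourself that a successful rename() moves no data is to compare inode numbers before and after. This is only a minimal sketch: the helper name inode_survives_rename and the file names X and Y are made up for illustration, and it assumes an ordinary local filesystem where both names live in the same directory.

```python
import os
import tempfile


def inode_survives_rename(directory):
    """Create a file, rename it within `directory`, and report whether
    the inode number is unchanged (i.e. only the directory entry moved)."""
    src = os.path.join(directory, "X")
    dst = os.path.join(directory, "Y")
    with open(src, "w") as handle:
        handle.write("payload")
    before = os.stat(src).st_ino
    os.rename(src, dst)  # thin wrapper around the rename() system call
    after = os.stat(dst).st_ino
    return before == after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(inode_survives_rename(scratch))
```

If the inode is the same afterwards, the file contents were never touched, which is why the same-filesystem case is fast regardless of file size.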

If I instead try to move this directory to another disk:

rename("X", "/home/sweh/X")             = -1 EXDEV (Invalid cross-device link)

We can see that mv tried the rename() call and it failed with EXDEV. At this point it starts to do the recursive work itself:

rmdir("/home/sweh/X")                   = -1 ENOENT (No such file or directory)
mkdir("/home/sweh/X", 0700)             = 0
lstat("/home/sweh/X", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
openat(AT_FDCWD, "X", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, /* 2 entries */, 32768)     = 48

Here we can see it has made the target directory and then started to read the source directory X to do the slow copy/remove.

So we can conclude that mv will try to use the fast rename() call and only fall back to the slow version if this fails.
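
The same try-rename-then-fall-back pattern can be sketched in a few lines of Python. This is not the coreutils implementation, just an illustration of the logic seen in the strace output; the function name move and its return strings are invented for the example, and it ignores details such as preserving ownership or handling a destination that already exists.

```python
import errno
import os
import shutil


def move(src, dst):
    """Try the fast rename first; fall back to copy + remove only when
    the kernel reports that src and dst are on different filesystems."""
    try:
        os.rename(src, dst)
        return "rename"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # some other failure: do not silently start copying

    # Slow path: rename() failed with EXDEV, so copy the data across
    # and then delete the original.
    if os.path.isdir(src) and not os.path.islink(src):
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        shutil.copy2(src, dst, follow_symlinks=False)
        os.unlink(src)
    return "copy+remove"
```

Python's standard library already does something along these lines in shutil.move, so in real scripts that is usually the better choice; the sketch is only here to make the two code paths visible.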

  • clarification on initial question - I see the "cp then rm" on large files and directory trees. But just single files are [apparently] renamed – warren Aug 23 '16 at 14:54
  • I just created a tree 100 directories deep and mv X Y still just called rename(). I also created 100 subdirs of X and it still just called rename(). Are you using an odd filesystem (NTFS? SSHFS?) that may not support rename()? – Stephen Harris Aug 23 '16 at 14:58
  • Nope. ext4. And like I said in my clarification, individual files seem to rename whereas groups seem to not – warren Aug 23 '16 at 15:00
  • I also created 100 files and did mv * ../Z and saw it call rename() 100 times; once for each filename that matches the command line ('cos shell globbing expands it). – Stephen Harris Aug 23 '16 at 15:01
  • If you do mv <1millionfiles> anotherdir then it will call rename() 1 million times. Is that what you're seeing? – Stephen Harris Aug 23 '16 at 15:02
  • Maybe it has to do with depth of the tree being moved? I can replicate it on "deep" moves (say 5+ deep to 5+ deep with nothing in common but /). Shallower moves seem to be doing rename though. – warren Aug 23 '16 at 15:35
  • I did it with a 100 directory deep tree and it was a single rename() call. – Stephen Harris Aug 23 '16 at 15:37
  • Very odd. Thanks for diving in on this to see if you can replicate it :) – warren Aug 23 '16 at 17:10
  • @warren "nothing in common but /" - could it be going across non-obvious different filesystems? – Izkata Aug 23 '16 at 18:37
  • @Izkata perhaps :) – warren Aug 23 '16 at 19:03
  • @warren: use df on the src and dst, to see if they're different mount points. Note that rename(2) doesn't work even across bind-mounts, though, so even if the device column is the same for both, it can't just rename if the "mounted on" column differs. Oh nvm, just saw your comment on the question. Of course it can't just rename between two separate ext4 filesystems on different MD devices. That would mean you'd have one filesystem referring to data that's part of the other filesystem, so they'd have to know about each other. But that's the opposite of separate filesystems. – Peter Cordes Aug 23 '16 at 21:21
  • @PeterCordes - as I've gone deeper, it's not consistent behavior, which is what leads me to think it's crossing physical devices sometimes, and that's just what I happened to catch when I posted the question. – warren Aug 23 '16 at 21:26
  • @warren: What do you mean "not consistent"? Is your VFS setup so complex that it's not obvious when you're moving between filesystems (rename always fails, mv has to copy) vs. within a single filesystem (rename always works)? Maybe it copied so fast that you thought it actually just renamed, sometimes? If the data was already hot in disk cache, it will be very fast because it's just a memory->memory copy. mv has to wait for data to be read, but can exit before it's synced to disk. (Or if the files aren't too big, it can be very fast to read them in). – Peter Cordes Aug 23 '16 at 21:28
  • @PeterCordes what I've been testing against have been collections of large files (several 3GB ISOs). What it appears to be doing is sometimes the rename works, and sometimes it doesn't. The cases where it doesn't seem to be when it's crossing a physical device under the logical volume (that sits atop a RAID). IOW, it actually is trying the rename, but when it is crossing a physical boundary, it cannot (which makes sense). – warren Aug 23 '16 at 21:41
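
For anyone wanting to script the df check suggested in the comments, the device ID reported by stat() gives a rough equivalent. This is a hedged sketch: the function name probably_same_filesystem is made up, and it compares the two parent directories rather than the files themselves. As noted above, a matching device ID is necessary but not sufficient, because rename() also fails across bind mounts of the same filesystem.

```python
import os


def probably_same_filesystem(src_dir, dst_dir):
    """Return True when both directories report the same device ID.

    Differing IDs mean rename() between them will fail with EXDEV and
    mv has to copy. Equal IDs only make a fast rename likely, since
    bind mounts of one filesystem share an ID but still cannot be
    renamed across.
    """
    return os.stat(src_dir).st_dev == os.stat(dst_dir).st_dev


if __name__ == "__main__":
    # Example: compare the current directory with the root directory.
    print(probably_same_filesystem(os.getcwd(), "/"))
```

Running this on the source and destination directories before a large move shows up front whether mv is going to take the slow path.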