
What are the consequences for an ext4 filesystem when I terminate a copying cp command by typing Ctrl+C while it is running?

Does the filesystem get corrupted? Is the partition's space occupied by the incompletely copied file still usable after deleting it?

And, most importantly, is terminating a cp process a safe thing to do?

Seninha
  • 1,035
  • 1
    Keep in mind that while the answers are correct for ext4, filesystems without journaling may not be as safe. – ave Sep 02 '18 at 01:32
  • 4
    @Ave Journaling has nothing to do with this. The syscalls are atomic regardless of what filesystem you use. Journaling is useful in situations where power may be abruptly lost. – forest Sep 02 '18 at 07:14

3 Answers

25

This is safe to do, but naturally you may not have finished the copy.

When the cp command is run, it makes syscalls that instruct the kernel to make copies of the file. A syscall, or system call, is a function that an application can use to request a service from the kernel, such as reading or writing data to the disk. The userspace process simply waits for the syscall to finish. If you were to trace the calls from cp ~/hello.txt /mnt, it would look like:

open("/home/user/hello.txt", O_RDONLY)           = 3
open("/mnt/hello.txt", O_CREAT|O_WRONLY, 0644)   = 4
read(3, "Hello, world!\n", 131072)               = 14
write(4, "Hello, world!\n", 14)                  = 14
close(3)                                         = 0
close(4)                                         = 0
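That loop can be sketched in a few lines of Python, whose os module maps almost one-to-one onto these syscalls (a hypothetical minimal cp, not the coreutils implementation; copy_file and the buffer size are just illustrative):

```python
import os

def copy_file(src, dst, bufsize=131072):
    """Minimal cp sketch: the same open/read/write/close syscall
    sequence as in the trace above (os.* maps directly to syscalls)."""
    fd_in = os.open(src, os.O_RDONLY)                       # open(src, O_RDONLY)
    fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT, 0o644)  # open(dst, O_CREAT|O_WRONLY, 0644)
    try:
        while True:
            chunk = os.read(fd_in, bufsize)  # read(3, ..., 131072)
            if not chunk:                    # read() returning 0 bytes means EOF
                break
            os.write(fd_out, chunk)          # write(4, ..., len(chunk))
    finally:
        os.close(fd_in)
        os.close(fd_out)
```

Killing the process between any two of these calls leaves a shorter destination file, but never a half-executed syscall.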

This repeats for each file that is to be copied. No corruption will occur because of the way these syscalls work. When syscalls like these are entered, the fatal signal will only take effect after the syscall has finished, not while it is running (in fact, signals are only delivered during a kernelspace-to-userspace context switch). Note that some syscalls, like read(), can be interrupted early by a signal.

Because of this, forcibly killing the process will only cause it to terminate after the currently running syscall has returned. This means that the kernel, where the filesystem driver lives, is free to finish the operations it needs to complete to leave the filesystem in a sane state. Any I/O of this kind will never be terminated in the middle of an operation, so there is no risk of filesystem corruption.

forest
  • 2,655
  • I tried strace cp and it seems to write in chunks of 131072 bytes. Maybe if I looked through cp's source I could see where this value comes from. – qwr Sep 01 '18 at 21:02
  • 2
    @qwr That's most likely part of the glibc library, not cp itself. It has various file access functions that internally use that as a value. – forest Sep 01 '18 at 21:03
  • 2
    Great answer! I'd never realized that there's a delay in terminating a cp after SIGKILLing it, even while dealing with large files... maybe the duration of those uninterruptible atomic operations of a process is too short. Does the same explanation work for killing dd and other disk-reading/writing processes? – Seninha Sep 01 '18 at 21:35
  • 1
    @Seninha The operations are pretty brief because the accesses are cached, so you can copy a lot more data per second than your drive can actually handle, if done in bursts. If the file is really big and on a slow medium, then the cache can fill up and killing the process can take some time. As for killing dd, that depends on what bs you set for it. If it's only 512 (the default), then it should terminate quickly. If it's larger, then it may take a bit longer. – forest Sep 01 '18 at 21:37
  • 4
    @qwr 128kb chunks are hardwired default in coreutils when reading from blockdevices, this is done in effort to minimize syscalls. Analysis is given in the coreutils source: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ioblksize.h – Fiisch Sep 02 '18 at 11:52
  • Any I/O of this kind will never be terminated in the middle of operation, making them atomic operations. I'm probably being overly pedantic, but "atomic" seems like too strong a word to use here as it implies the operation will be done in its entirety, which isn't always the case. "Safely interruptible" might be more descriptive without the completeness implication. POSIX write(), for example, can return a partial result on any device (being interrupted by a signal is one reason...), and is by specification truly atomic only for PIPE_BUF bytes or less being written to an actual pipe. – Andrew Henle Sep 04 '18 at 16:19
  • 1
    @AndrewHenle Perhaps I should have said that it's the filesystem metadata which is atomic. You are correct that a write may be partial. – forest Sep 04 '18 at 19:25
  • @forest - What do you mean by "the fatal signal will only take effect after the syscall has finished"? Is it CTRL+C? – Motivated Jan 24 '19 at 15:50
  • @forest - What do you mean by "it's the filesystem metadata which is atomic"? – Motivated Jan 24 '19 at 15:52
  • @forest - Are there any commands that can corrupt the filesystem if is interrupted? – Motivated Jan 24 '19 at 16:18
  • @Motivated Not without bugs in the filesystem. – forest Jan 25 '19 at 02:08
  • @forest - Thanks forest. Would you mind clarifying the other questions? Happy to move them to chat if needed. – Motivated Jan 25 '19 at 05:51
  • @Motivated Sending Ctrl+C will cause your terminal driver to raise a SIGINT. – forest Jun 08 '21 at 01:35
  • why on earth isn't cp sendfile-optimized? sendfile has been present on Linux since kernel 2.2, released in 1999... – hanshenrik Aug 20 '22 at 18:45
  • 1
    @hanshenrik The busybox implementation is. I think the coreutils version doesn't use sendfile() because it usually takes advantage of ext4 holes instead, which is incompatible with that. – forest Aug 20 '22 at 20:39
  • @forest ohh it needs to copy to userland anyway to scan for holes, gotcha. fwiw holes isn't an ext4 thing, it's supported by at least XFS, Btrfs, tmpfs, gfs2, bcachefs, and as you mentioned, ext4. (source - as for the bcachefs source, i asked on the oftc #bcache channel, they confirmed bcachefs supports holes) – hanshenrik Aug 21 '22 at 11:04
23

Since cp is a userspace command, this does not affect filesystem integrity.

You of course need to be prepared that at least one file will not have been copied completely if you kill a running cp program.
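A quick way to convince yourself is to kill a copy mid-flight and inspect what is left. The sketch below is my addition, not from the answer; the helper name, file names, and size are arbitrary, and depending on timing cp may even finish before the signal lands:

```python
import os, signal, subprocess, tempfile

def interrupted_copy(size=50 * 1024 * 1024):
    """Kill cp shortly after starting it. The destination ends up absent,
    partial, or complete -- but the filesystem itself is fine and the
    leftover file can be removed like any other."""
    d = tempfile.mkdtemp()
    src, dst = os.path.join(d, "src"), os.path.join(d, "dst")
    with open(src, "wb") as f:
        f.write(os.urandom(size))
    proc = subprocess.Popen(["cp", src, dst])
    proc.send_signal(signal.SIGKILL)   # interrupt the copy as early as possible
    proc.wait()                        # cp is gone once this returns
    copied = os.path.getsize(dst) if os.path.exists(dst) else 0
    assert copied <= os.path.getsize(src)  # never larger than the source
    if os.path.exists(dst):
        os.remove(dst)                 # the partial file deletes normally
    return copied
```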

schily
  • 19,173
  • 14
    Why the downvote? Just because it’s schily? – Stephen Kitt Sep 01 '18 at 13:54
  • 6
    There definitely seems to be at least one person that downvotes all my answers. Do you know of a way to find out who did the downvote? – schily Sep 01 '18 at 14:02
  • 2
    Not even moderators can find out who made specific votes - that is understandably restricted to SO employees. You can use the "contact us" link to ask them to investigate. – Philip Kendall Sep 01 '18 at 14:06
  • 1
    It would be pretty sad if a userspace program were able to compromise filesystem integrity. Note: Of course, there can be, there have been, and there will be bugs in filesystem implementations. Note #2: Also, of course, userspace programs running with elevated privileges (e.g. CAP_SYS_RAWIO in Linux or the equivalent in other OSs) that give them direct access to the underlying device of the filesystem (e.g. sudo dd if=/dev/urandom of=/dev/sda1) may wreak all sorts of havoc. – Jörg W Mittag Sep 01 '18 at 18:18
  • 3
    And if a filesystem was buggy enough to get corrupted after an interrupted cp, it would probably get corrupted from a finished cp too... – ilkkachu Sep 01 '18 at 19:59
  • @JörgWMittag It is sad, yet it still happens. There are definitely combinations of heavy write activity that can cause corruption on filesystems that are supposed to be entirely atomic in the event of power loss. I recall there being quite a few unfixed ones in ext4. – forest Sep 02 '18 at 06:31
  • @forest Can you briefly explain the atomicity of file copying for common Linux OS's? In the context of sudden power loss during file copy or file move. Curious, cheers. – SaltySub2 Sep 02 '18 at 07:47
  • 1
    @SaltySub2 It depends on the filesystem. For ext4, it intends to be atomic by committing transactions to the journal. For btrfs, it's copy-on-write which is naturally atomic for many operations. This atomicity is only for the filesystem state itself. You can easily get a partially-copied file if you terminate a copy or shut down the system mid-write, but you won't get a corrupted filesystem (in theory). – forest Sep 02 '18 at 07:49
@JörgWMittag: If you directly write to the backing storage of a filesystem, this is not filesystem usage, and this of course can cause any kind of damage if you are a privileged user. Filesystem usage is when you access the data only through the official filesystem interface (open/read/write/...). Here it obviously depends on there being no bugs in the FS. Killing a cp program, however, is everyday usage and should not cause any harm. BTW: It typically takes 10+ years to make a new filesystem implementation bug-free. UFS took 15 years, ZFS took 10 years. – schily Sep 02 '18 at 10:39
@forest: ext4 is basically a UFS clone. UFS limits the max file name length to 256 bytes to permit really atomic directory write operations and orders the other write operations in a way that makes them appear to be atomic. Since UFS has been using logging (for nearly 25 years now), I did not see any corruption even when switching off the computer while writes were in progress. ext3 did frequently cause problems with power outages. If you like, do the following test: run gtar to unpack the tarball of a Linux kernel on ext4, and pull the power plug after the gtar command has finished. – schily Sep 02 '18 at 10:54
  • @schily I don't think ext4 is related to ufs in any way. First there was the minix filesystem which was replaced with the very similar ext. That quickly started showing its shortcomings, and ext2 was created. Someone decided to stuff a journal into ext2 and it became ext3. Finally, it was rewritten from scratch with journalling in mind and became ext4. It has no relation to ufs. – forest Sep 02 '18 at 20:54
  • 1
    Check the data structures and the fact that there are cylinder groups. – schily Sep 02 '18 at 21:31
  • The metadata structure of the original ext was inspired by that of UFS, but I doubt that it's a UFS clone anymore than it's a JFS clone for using transactional journaling or an XFS clone for using delayed allocation. – forest Sep 04 '18 at 00:23
  • @JörgWMittag Traditionally, there used to be the clri program to deliberately compromise the file system from user space. – FUZxxl Jun 12 '20 at 12:19
-1

forest's answer, albeit pretty (and in many cases correct), isn't what you'll see on modern systems⁰. They're right – under no circumstances would this corrupt your file system. But under no practical circumstances would you get half a copy these days!

Assume I do the following, just to generate a large file yesfile and copy it to a file copy (it doesn't have to be on the same file system), while logging all the system calls made by cp:

cd /tmp
yes | head -n$((10**7)) > yesfile
strace -o strace.output cp yesfile copy

I get a different picture: the userland process cp does not actually read the content of the file and write it to another file; that would be bad, performance-wise: it would require at least two context switches per buffer! The userland program calls read, switches to the kernel, gets the data, calls write, switches again; rinse and repeat if the file is larger than a single read buffer. Now this exact repeating model, reading only a buffer of limited size at a time, is what could lead to half-copied files on interruption.

Instead, it uses the copy_file_range system call (see trace below¹); man copy_file_range tells us:

The copy_file_range() system call performs an in-kernel copy between two file descriptors without the additional cost of transferring data from the kernel to user space and then back into the kernel. It copies up to len bytes of data from the source file descriptor fd_in to the target file descriptor fd_out, overwriting any data that exists within the requested range of the target file.

So, there is an atomic copy-this-file system call, which is usually used, and interrupting cp cannot interrupt the copying.
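Python ≥ 3.8 exposes the same syscall as os.copy_file_range(), so a minimal in-kernel copy loop might look like this (a sketch assuming Linux ≥ 4.5; kernel_copy is my name, not a standard API, and a single call may copy fewer bytes than requested, hence the loop):

```python
import os

def kernel_copy(src, dst):
    """Copy src to dst via copy_file_range(2): the data moves inside the
    kernel and never passes through a userspace buffer."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        remaining = os.fstat(fin.fileno()).st_size
        while remaining > 0:
            # With no explicit offsets, the files' own offsets are used.
            n = os.copy_file_range(fin.fileno(), fout.fileno(), remaining)
            if n == 0:  # unexpected EOF on the source
                break
            remaining -= n
```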


Things get even better if your source and target files are on the same file system, and that file system is Btrfs, CIFS, NFS 4.2, OCFS2, overlayfs, or XFS (at the time of writing, Linux has the reflink feature only for these): if

ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3)

succeeds, the system doesn't need to copy the file contents at all – instead, just the list of blocks belonging to the source file is copied to the target file; each block has a reference counter that gets increased, so the moment any process writes to either of these files, the file system transparently does a copy-on-write on that. So, these things are even more atomic!
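The same reflink ioctl from the trace can be issued from Python via fcntl. This is a sketch of my own: FICLONE's numeric value is the Linux definition _IOW(0x94, 9, int), and try_reflink is a hypothetical helper, not a real API; on a non-reflink filesystem it fails with exactly the EOPNOTSUPP seen in the trace below:

```python
import errno, fcntl

FICLONE = 0x40049409  # Linux _IOW(0x94, 9, int): the FICLONE / BTRFS_IOC_CLONE ioctl

def try_reflink(src_path, dst_path):
    """Attempt to clone src into dst like `cp --reflink=always` does.
    Returns True on success, False if the filesystem can't reflink."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Share src's block list with dst; blocks are copied-on-write later.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError as e:
            if e.errno in (errno.EOPNOTSUPP, errno.EXDEV, errno.EINVAL):
                return False  # a real cp would fall back to copying here
            raise
```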


⁰ At least, if my GNU coreutils 8.32 with Fedora 34's backported copy_file_range patches / Linux 5.13.5 counts as modern.
¹ relevant strace output

newfstatat(AT_FDCWD, "yesfile", {st_mode=S_IFREG|0644, st_size=20000000, ...}, 0) = 0
newfstatat(AT_FDCWD, "copy", 0x7fff982d5e70, 0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "yesfile", O_RDONLY)   = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=20000000, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "copy", O_WRONLY|O_CREAT|O_EXCL, 0644) = 4
newfstatat(4, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EOPNOTSUPP (Operation not supported)
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0be58ca000
uname({sysname="Linux", nodename="workhorse", ...}) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 20000000
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 0
close(4)                                = 0
close(3)                                = 0
munmap(0x7f0be58ca000, 139264)          = 0
  • 2
    Note also that copy_file_range doesn’t guarantee that the copy happens in a single syscall invocation, so it is still possible to get an incomplete copy. – Stephen Kitt Nov 01 '21 at 19:11
  • @StephenKitt I've got coreutils 8.32 and really can't make it copy using read/write! – Marcus Müller Nov 01 '21 at 19:18
  • Try --reflink=never to simulate the behaviour seen by us poor unenlightened users ;-). – Stephen Kitt Nov 01 '21 at 19:19
  • @StephenKitt I'll do; in the meantime I can get the read/write copying on debian oldoldstable (coreutils 8.26) – Marcus Müller Nov 01 '21 at 19:21
  • @StephenKitt That's it! 8.32, cp --reflink=never doesn't try to do copy_file_range (of course, it also doesn't try to do the actual reflink ioctl, which is the only thing I actually asked for). – Marcus Müller Nov 01 '21 at 19:23
  • I'm a bit surprised that --reflink influences copy_file_range usage – these are two separate things! – Marcus Müller Nov 01 '21 at 19:24
  • Yes, it is surprising; there’s a mention somewhere that --reflink can end up using copy_file_range as an optimisation. (And I imagine that’s how coreutils pre-9.0 can end up using it...) – Stephen Kitt Nov 01 '21 at 19:26
  • @StephenKitt yep that note is in linux/fs/read_write.c; but: notice how the "default" cp tries to use the reflinking ioctl before even trying to copy_file_range! – Marcus Müller Nov 01 '21 at 19:28
  • Yup, I don’t quite understand how 8.32 can end up calling copy_file_range explicitly... The 8.32 source code only has a reference to that function in gnulib and it isn’t used in cp! – Stephen Kitt Nov 01 '21 at 19:32
  • @StephenKitt maybe it's a Fedora 34 patch – confusing indeed. – Marcus Müller Nov 01 '21 at 19:37
  • 1
@StephenKitt indeed, 9666248b728f3d28dcd8c58d39f03fda154feaa8 on https://src.fedoraproject.org/rpms/coreutils backports upstream patches concerning copy_file_range – Marcus Müller Nov 01 '21 at 19:42
  • 1
    While this is interesting, presenting this as “on modern systems” is plain wrong as of 2021. For example, I just checked a system running on the latest Ubuntu long-time support release (20.04 — kernel 5.4.0, GNU coreutils 8.30), with default settings (so using ext4), for a copy within the same file system. cp issues read and write calls. Coreutils 8.32 from Ubuntu 21.04 on the same kernel also uses read/write. copy_file_range exists on most modern Linux systems (not *BSD and other unices), but userland doesn't use it much yet. – Gilles 'SO- stop being evil' Nov 01 '21 at 21:58
  • @Gilles yes, it will only become pervasive (on Linux distributions) once coreutils 9.0 does. – Stephen Kitt Nov 01 '21 at 22:05
  • @Gilles'SO-stopbeingevil' that's why I explicitly added a footnote to explain that. – Marcus Müller Nov 02 '21 at 10:27