10

split breaks a file into pieces which, in total, consume the same amount of storage as the original (doubling the consumed disk space).

ln can create a symbolic link (symlink) to another (target) file without duplicating it, and thus does not consume double the space of the target file.

Due to a lack of storage space, can a file be split by reference/symbolically (i.e., virtually splitting the file), with pieces that point to specific offsets in the big file?

For example, given a file which is 2 MB, break it into 2 pieces, where each piece references 1 MB of the big file (the same concept as a symlink), such that each piece:

  • does not overlap other pieces (pieces will not reference the same data in the big file)
  • does not consume the same storage size as the big file portion it references
piece_1.file -> 2mb.file 1st MB
piece_2.file -> 2mb.file 2nd MB

and the storage size of each piece is much less than 1MB

Mr.
  • Is your intention really to exclude a single bit? bit addressing isn't possible (in general), and I think you mean "1 MB from the 0. byte" and "1 MB from after the 1. MB", right? – Marcus Müller Mar 14 '23 at 07:06
  • @MarcusMüller correct. was just giving an example. depends if the reference is inclusive or not. in general, i just want to split without duplicating the target size and without any aliasing (referencing the same location, not to read the data twice) – Mr. Mar 14 '23 at 07:08
  • I don't understand what you're saying in that comment. Could you please edit your question and explain what you want to achieve in general, instead of here in special? I'm writing an answer, and I think your general question might fundamentally change what I should be writing. (and remove the +1bit? that is really a very hefty complication of your situation and essentially would need a completely different approach which involves mounting some rather complicated and slow file system layer that you would have to code yourself.) – Marcus Müller Mar 14 '23 at 07:21
  • @MarcusMüller edited. i hope it is clearer now :) – Mr. Mar 14 '23 at 07:31
  • Hard links are just distinct names for a single file, with the file itself being "whole". For them to work, the inode (which is basically the structure describing the file) needs to keep a reference count, so the system knows when the last name is gone and can delete the file. Regular filesystems don't have that on the block level. – ilkkachu Mar 14 '23 at 08:43
  • Though technically, if you're mad enough, you could mess with the filesystem internals so that you have multiple files that point at the same blocks. It'd likely break down horribly if one of the files was deleted or truncated. Some advanced filesystems have such features, though, e.g. Btrfs has reflinks, which are basically files that share data blocks, in a controlled manner. The docs say they could in principle be used for partial ranges too. https://btrfs.readthedocs.io/en/latest/Reflink.html Fixing something up with FUSE would also be possible (non-answer, since not really a solution.) – ilkkachu Mar 14 '23 at 08:45
  • @ilkkachu well, it works pretty well at least on XFS (and seeing that offers the same map_file_range methodology in the VFS, it should work just the same with btrfs) when you copy_file_range. Added a demo to the end of my post to show that! – Marcus Müller Mar 14 '23 at 09:07
  • @MarcusMüller, right, I meant I didn't have anything concrete, not that it wouldn't be possible to work a useful solution from that. :D – ilkkachu Mar 14 '23 at 09:35

3 Answers

15

due to the lack of storage space, can a file be split by reference/symbolicly (i.e. virtually splitting the file) that points to specific offsets in the big file?

Not directly, no. Files don't work that way in the POSIX way of thinking; they are independent, atomic units of data.

Two options:

Loopback devices

This is a runtime solution, meaning that it's not an on-disk solution, but needs to be set up manually. That might be an advantage or a disadvantage!

You can set up a loopback device quite easily; if you're using a freedesktop system message bus-compatible session manager (i.e., you're logged in to your machine graphically and are running gnome, xfce4, kde,…), udisks is your friend:

blksize=$((2**20))
udisksctl loop-setup -s $blksize -f /your/large/file
udisksctl loop-setup -s $blksize -o $blksize -f /your/large/file
  • The first command gives you a /dev/loop device which starts at byte 0 and goes on for 2²⁰ bytes (i.e., one megabyte).
  • The second command gives you another /dev/loop device which starts at byte 2²⁰ and goes on for 2²⁰ bytes (i.e., one megabyte). (Note that we start counting at 0, so this begins exactly after the first chunk.)

You can then use these two loopback devices, e.g. /dev/loop0 and /dev/loop1. They literally describe a "view" into your file. You change something in these block devices, you change it in your large file.
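The byte ranges those two loop devices expose can be illustrated without root privileges by extracting the same ranges with dd. Note this is only a demonstration of the offset arithmetic: dd *copies* the data, whereas the loop devices merely present views into the same bytes:

```shell
# make a 2 MiB test file
dd if=/dev/urandom of=big.file bs=1M count=2 status=none

# first MiB: offset 0, length 2^20 bytes
dd if=big.file of=first.part bs=1M count=1 status=none
# second MiB: skip one 1 MiB block, then read one
dd if=big.file of=second.part bs=1M count=1 skip=1 status=none

# the two ranges tile the file exactly: no overlap, no gap
cat first.part second.part > rejoined.file
cmp big.file rejoined.file && echo "ranges cover the file exactly"
```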

If you're not logging in graphically, the exact same can be achieved, but you need root privileges:

blksize=$((2**20))
sudo losetup --sizelimit $blksize -f /your/large/file
sudo losetup --sizelimit $blksize -o $blksize -f /your/large/file

Reflinking the storage blocks

This is an on-disk solution. It will also make a "logical" copy, i.e., if you change something in your small files, it will not be reflected in the large file.

You must use a file system that supports reflink (to the best of my knowledge, these are XFS, Btrfs and some network file systems). (File system blocks through which a split goes would need to be duplicated, but for most filesystems, we're talking less than 4 kB here.)

In that case, your file system can copy files without the copy using any space of its own, as long as the copy or original aren't changed (and even if they are, only the affected parts are duplicated).
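coreutils cp can request such a space-free copy directly. `--reflink=always` fails on file systems without reflink support, so this sketch uses `--reflink=auto`, which silently degrades to a plain copy elsewhere (meaning it runs anywhere, but only actually saves space on XFS/Btrfs):

```shell
printf 'some file contents\n' > original.file
# on XFS (mkfs'd with reflink=1) or Btrfs this shares data blocks and
# uses no extra data space; elsewhere it falls back to a normal copy
cp --reflink=auto original.file clone.file
cmp original.file clone.file && echo "clone is byte-identical"
```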

So, on such a file system, we have two options:

  1. make the split into the first and the second half "normally", and then ask a utility (duperemove) to compare the three files and deduplicate.
  2. make the copy of the two halves of your file in a manner that hints the file system to directly avoid using twice the space.
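For reference, option 1 would look roughly like this sketch (the duperemove call is hedged: it must be installed, and it only actually deduplicates on a reflink-capable file system):

```shell
# option 1: plain split (temporarily uses twice the space)...
dd if=/dev/urandom of=largefile bs=1M count=2 status=none
split -b 1M largefile piece_   # creates piece_aa and piece_ab
# ...then deduplicate: -d actually performs the dedupe, -r recurses
command -v duperemove >/dev/null \
  && duperemove -dr . \
  || echo "duperemove not installed; pieces remain ordinary copies"
```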

Since the first option temporarily needs twice the space, let's do the second right away. I wrote a small program to do that split for you (full source code) (Attention: Found a bug, which I'm not going to fix. This is just an example). The relevant excerpt is this:

// this is file mysplit.c , a C99 program
// SPDX-License-Identifier: Linux-man-pages-copyleft
// (heavily based on the copy_file_range(2) man page example)
// compile with `cc -o mysplit mysplit.c`
// […]
int main(int argc, char* argv[])
{
[…]
        ret = copy_file_range(fd_in /*input file descriptor*/,
                              &in_offset /*address of input offset*/,
                              fd_out /*output file descriptor*/,
                              NULL /*address of output offset*/,
                              len /*amount of bytes to copy*/,
                              0 /*flags (reserved, must be ==0)*/);
[…]
}

Let's try that out. We'll have to:

  • get and compile my program
  • make an XFS file system first, as that's one of the file systems that supports reflinks
  • mount it
  • put a large file filled with random data inside
  • check the free space on that file system
  • add splits of the file (using my program)
  • check the free space again.

The script below does just that.

# replace "apt" with "dnf" if you're on some kind of redhat/fedora
# replace "apt install" with "pacman" followed by a randomly guessed amount of options involving the letters {y, s, S, u} if you're on arch or manjaro
apt install curl gcc xfsprogs

# Download the C program from above

curl https://gist.githubusercontent.com/marcusmueller/d1e0235f9a484cb44626e35460a5c0ac/raw/6295f9f6371b916b87a5d0a5a6edad65f9ea8627/mysplit.c > mysplit.c

# Compile

cc -o mysplit mysplit.c

# Demo (doesn't need root privileges if you have udisks):
# - make a file system in a file system image,
# - mount that image,
# - create a 300 MB file in there,
# - split it,
# - show there's still nearly the same amount of free space.

# Make file system in a file system image

fallocate -l 1G filesystemimage # 1 GB in size

# this is a bit confusing, but to make the root of the file system
# world-writable, we do:
echo "thislinejustforbackwardscompatibility/samefornextline 1337 42 d--777 ${UID} ${GID}" > /tmp/protofile
mkfs.xfs -p /tmp/protofile -L testfs filesystemimage

# Mount that image

loopdev=$(LANG=C udisksctl loop-setup -f filesystemimage | sed 's/.* as \(.*\).$/\1/')

# on most systems, that new device is automatically mounted.
sleep 3
udisksctl mount -b "${loopdev}"

# create a 300 MB file in there

target="/run/media/$(whoami)/testfs"
rndfile="${target}/largefile"
dd if=/dev/urandom "of=${rndfile}" bs=1M count=300

# Check free space

echo "free space with large file"
df -h "${target}"

# split it:
# (copy the first 100 MB)

./mysplit "${rndfile}" "${target}/split_1" 0 "$((2**20 * 100))"

# (copy the next 120 MB, just showing off that splits don't need to be of uniform size)

./mysplit "${rndfile}" "${target}/split_2" "$((2**20 * 100))" "$((2**20 * 120))"

# show there's still nearly the same amount of free space

echo "free space with large file + splits"
df -h "${target}"

  • An addition to the reflink approach is that ZFS and BTRFS allow block-level deduplication. While a reflink file on ext4 is copied once you change it, BTRFS will only copy the changed blocks. To deduplicate existing blocks one can just jdupes -B. – allo Mar 15 '23 at 13:55
  • @allo didn't even know ext4 had file-level reflinking at all! – Marcus Müller Mar 15 '23 at 13:56
  • I think it did not have it in its first revisions, but now it has had it long enough that one can assume that most systems support it. – allo Mar 15 '23 at 14:04
  • apparently, there is a limitation on the amount of loopback devices that can be created :( – Mr. Mar 17 '23 at 05:08
  • @Mr. I don't think that's true, see https://unix.stackexchange.com/questions/554438/what-is-maximum-loop-devices-for-linux-kernel – Marcus Müller Mar 17 '23 at 05:23
7

On Linux, it's the kind of thing you can do with loop devices.

For instance:

losetup --find --show             --sizelimit=2M file
losetup --find --show --offset=2M --sizelimit=2M file
losetup --find --show --offset=4M --sizelimit=2M file

Would output the paths of 3 loop devices that reference three 2 MiB sections of the file.
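Spelled out, here is the offset arithmetic behind those three sections (a sketch of the byte ranges the `--offset`/`--sizelimit` pairs above map, so you can see they tile the file without gap or overlap):

```shell
secsize=$((2*1024*1024))     # --sizelimit=2M
for i in 0 1 2; do
  off=$((i * secsize))       # --offset: 0, 2M, 4M
  echo "section $i: bytes $off to $((off + secsize - 1))"
done
```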

-2

If I understood you right, this comes close to your question, doesn't it?

supu@devuan ~/TEST ❖ cat bigfile.txt                                                               8:01
DIES IST VIEL TEXT
DIES IST VIEL TEXT

supu@devuan ~/TEST ❖ split -n2 bigfile.txt

supu@devuan ~/TEST ❖ ls -l                                                                          8:00
insgesamt 12.288
-rw-r--r-- 1 supu supu 38 2023-03-14 07:59 bigfile.txt
lrwxrwxrwx 1 supu supu  3 2023-03-14 08:00 piece1 -> xaa
lrwxrwxrwx 1 supu supu  3 2023-03-14 08:00 piece2 -> xab
-rw-r--r-- 1 supu supu 19 2023-03-14 08:00 xaa
-rw-r--r-- 1 supu supu 19 2023-03-14 08:00 xab

supu@devuan ~/TEST ❖ cat piece?                                                                     8:00
DIES IST VIEL TEXT
DIES IST VIEL TEXT

supu@devuan ~/TEST ❖ cat xa?                                                                        8:01
DIES IST VIEL TEXT
DIES IST VIEL TEXT

supu@devuan ~/TEST ❖

superurmel
  • i would like to avoid the duplication of the big file data. seems like the total size of xaa and xab (combined) is the same as bigfile.txt – Mr. Mar 14 '23 at 07:06