
I'm piping data from a socket to a file. If I need to save all the data, this is easy enough:

$ nc -l 5001 > file.bin

If I want to save only the first n bytes, that's easy enough as well:

$ nc -l 5001 | pv -Ss 2M > file.bin

Is there a way I can pipe the data to a file with a fixed maximum size, such that when new data arrives it displaces the old data in the file like a queue? Perhaps using a fixed-size named pipe if such a thing exists.

Ideally this would run continuously and not require saving the whole file and then chopping it. In other words, I don't want to simply run tail after the whole file is saved.

bcattle
  • 131
    You could pipe to tail and redirect the output just like you're piping to pv here. – muru Oct 28 '22 at 04:15
  • So you want to build a real FIFO, right? What is the size of item you want to push through the FIFO? I.e., is it really OK to throw away individual "oldest bytes"? Won't that break the usefulness of that data? – Marcus Müller Oct 28 '22 at 04:32
  • Imagine a streaming case like video or something -- I want to inspect the stream at a particular point in time, and only see what was the most recent. – bcattle Oct 28 '22 at 04:53
  • tail -f wouldn't work because -f appends all new data to the output – bcattle Oct 28 '22 at 13:24

2 Answers


For n = 2M, use this:

nc -l 5001 | tail -c 2M > file.bin

Or, if your tail doesn't accept size suffixes (the 2M form works with GNU tail), spell out the byte count:

nc -l 5001 | tail -c 2097152 > file.bin
James Risner
  • 1,282

So, do I get it right that you want to be able to view the last N bytes at any time, without having to store significantly more than that?

If you want that in actual real-time, as in byte-by-byte, you'll probably need some dedicated software for it. Often, one would store the latest data in a ring buffer, but you probably couldn't find software that could read data arranged like that from a file. Another option would be to just write all the data to a file the normal way, and periodically tell the OS to discard the earlier data that's no longer needed. (I think I saw a comment suggesting that earlier, but maybe it was deleted.)

You could probably do that with a Perl script that calls fallocate() with FALLOC_FL_PUNCH_HOLE to do the discarding.
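As a rough sketch of the same idea without Perl, the util-linux fallocate(1) utility can punch the hole from a shell loop. This assumes GNU stat and a filesystem that supports hole punching (e.g. ext4 or XFS); the port, interval, and 2 MiB limit are just example values:

nc -l 5001 > file.bin &

keep=$((2 * 1024 * 1024))    # example: keep roughly the last 2 MiB
while sleep 10; do
    size=$(stat -c %s file.bin)
    if [ "$size" -gt "$keep" ]; then
        # Deallocate everything before the last 2 MiB. The file's
        # apparent size keeps growing, but the punched range takes
        # no disk space and reads back as zeros.
        fallocate --punch-hole --offset 0 --length "$((size - keep))" file.bin
    fi
done

The live data then sits at the growing tail of a sparse file, so a reader would look at it with something like tail -c "$keep" file.bin rather than reading from the start.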

If you don't need the very latest data, a quick and dirty shell solution would be to store fixed-size chunks in a file, renaming it to a second name after each chunk. The latest complete chunk would then always be available in the second file.

E.g.

i=0
while true; do echo $i; sleep .1; i=$((i+1)); done |
    while true; do
        head -n 10 > out1
        mv out1 out2
    done

The first loop produces some test output, while the second reads chunks of 10 lines so that the last complete chunk is in out2 and the last partial chunk in out1. Change the argument to head as needed.

Note that with a small chunk like that, output buffering in head will make out1 always appear empty. Use stdbuf -o0 head ... instead if that's an issue.
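For instance, a sketch of the reading loop with head's output buffering disabled (stdbuf is part of GNU coreutils):

while true; do
    stdbuf -o0 head -n 10 > out1
    mv out1 out2
done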

ilkkachu
  • 138,973