Read past end of file to recover data

Question

A very old .swp file reverted a file I was editing, so it is now significantly shorter. I haven't done anything in that directory since, so the bytes immediately following the end of the file should still have my data. What function can I use to read N bytes from a given memory address? dd and read stop at file boundaries, unless I missed an option somewhere.

The current file size is 3.2 KB. I don't remember exactly how big the file was before it was truncated, but probably not more than 10 KB. How can I read 10 KB from the beginning of the file, ignoring file boundaries? It is fine if the data is not perfectly preserved, as long as I don't have to start from scratch.

score 18 · Accepted Answer · edited Apr 13 '17 at 12:36

Usually when editors save a file, they delete it or truncate it to 0 bytes, which frees the allocated space, and then write the new contents, which allocates new space. The filesystem is then likely to put the data in a completely different physical location. So your idea might not work.

You can get the physical location of a file using filefrag or hdparm --fibmap, and then use dd to read that physical location directly. I've described this process in a different context here: https://unix.stackexchange.com/a/85880/30851
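As a rough sketch of the dd step (not taken from the linked answer): once filefrag or hdparm has given you a physical start position, you can set dd's bs to the unit that position is expressed in and pass the position as skip. The helper name read_blocks and its argument order are made up for illustration. Check which unit your tool reports (filefrag -v prints filesystem blocks and states the block size in its header) before trusting the numbers.

```shell
#!/bin/sh
# Hypothetical helper: dump NBLOCKS blocks of BLOCK_SIZE bytes each from SRC,
# starting at block number START_BLOCK (counted from the start of SRC).
# SRC would be the partition device; a plain file works for a dry run.
read_blocks() {
    src=$1
    block_size=$2
    start_block=$3
    nblocks=$4
    # With bs set to the block size, skip= and count= are in whole blocks.
    dd if="$src" bs="$block_size" skip="$start_block" count="$nblocks" 2>/dev/null
}
```

For example, `read_blocks /dev/partition 4096 123456 3 > recovered.bin` would copy 12 KB starting at block 123456; both numbers here are placeholders, not values from your disk.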

In your case it's more likely you need the general approach for finding textual data... something like:

strings -n 12 -t d /dev/partition | grep -F 'text snippet'

strings looks for runs of consecutive printable ASCII characters and, with -t d, also prints the decimal offset at which each run was found. It supports some other encodings as well (I'm not sure about UTF-8), but you won't need them if the file is code or English text.

text snippet should be an exact, unique piece of text, all on a single line, that you remember being in the part of the file you're looking for. (If you don't remember it exactly, you could grep with a regular expression instead.)

-n 12 is the minimum length of the strings that strings will report. 12 should be the length of your text snippet. This parameter is optional; if provided, it might help strings | grep go a little faster.
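The pieces above could be wrapped up roughly as follows. This is only a sketch: find_offsets is a hypothetical name, and it assumes a strings that accepts -n and -t d (GNU binutils does). It sets -n to the snippet's length automatically and prints only the offsets.

```shell
#!/bin/sh
# Hypothetical helper: print the decimal offset of every printable run in SRC
# that contains SNIPPET. The offset is where the run starts, which may be
# somewhat before the snippet itself.
find_offsets() {
    src=$1
    snippet=$2
    # ${#snippet} is the snippet's length; shorter runs cannot contain it.
    strings -n "${#snippet}" -t d "$src" | grep -F -- "$snippet" | awk '{ print $1 }'
}
```

Usage would look like `find_offsets /dev/partition 'text snippet'`. Trying it on a small scratch file first is a cheap way to confirm the output format before scanning a whole partition.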

It will take a long time to read the entire partition but if successful, you'll have an offset you can feed to dd to grab the general area and then remove stuff that does not belong.
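A minimal sketch of that last step, assuming you already have a byte offset from strings: read a window that starts a little before the offset and ends a little after it. The name grab_area and the before/after parameters are illustrative only. bs=1 keeps the arithmetic in bytes, which is slow for large windows but fine for a few tens of KB.

```shell
#!/bin/sh
# Hypothetical helper: dump the bytes of SRC from (OFFSET - BEFORE) up to
# (OFFSET + AFTER), clamping the start at 0.
grab_area() {
    src=$1
    offset=$2
    before=$3
    after=$4
    start=$((offset - before))
    if [ "$start" -lt 0 ]; then
        start=0
    fi
    # With bs=1, skip= and count= are plain byte counts.
    dd if="$src" bs=1 skip="$start" count=$((offset + after - start)) 2>/dev/null
}
```

For instance, `grab_area /dev/partition 123456789 4096 16384 > area.txt` (the offset is a placeholder) saves about 20 KB around the hit, which you can then open in an editor and trim.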

"I haven't done anything in that directory since"

Unless your directory happens to be a mountpoint, that doesn't help much: most filesystems don't reserve space "per directory", so any write anywhere in the filesystem might overwrite the bit you're looking for. In a data recovery situation, you usually switch the entire filesystem to read-only mode.

Note that each file is stored in many blocks and they are usually not stored consecutively. So strings will only locate some parts of the file unless you're extremely lucky. — Gilles 'SO- stop being evil', Oct 31 '16 at 22:10
Quite the opposite, you'd have to be extremely unlucky to find a fragmented 10KB file. If you only find a part, it's more likely the other part was overwritten in this case. But unless you have a lot of write activity in that filesystem, or it's an SSD with instant discard, if you saved that file several times while editing, you might find many copies of that file. — frostschutz, Oct 31 '16 at 22:20
I'd recommend strings -n16 or some reasonable minimum length, to make it go faster. — Peter Cordes, Nov 01 '16 at 02:03
Thanks a bunch. There was only garbage just past the end of the file, but with strings I was able to find the entire file elsewhere in the partition. That's almost two months of work I don't have to do over, and an excellent reminder to always use version control for anything important. — Matthew Bedford, Nov 01 '16 at 14:16
