Keep a history of all the modifications to a text file

Question

I have a plain text file (not containing source code). I often modify it (adding lines, editing existing lines, or any other possible modification). For any modification, I would like to automatically record:

what has been modified (the diff information);
the date and time of the modification.

(Ideally, I would also like to be able to obtain the version of my file at a specific time, but this is a plus, not essential).

This is surely possible with Git, but it's too powerful and complex. I don't want to deal with add, commit messages, push, etc. each time. I would simply like to edit the file with vi (or equivalent), save it, and automatically record the modification as above (its diff and its time).

Is there a tool to accomplish this in Linux?

Update: Thanks for all the suggestions and the several solutions that have been introduced. I have nothing against git, but I explicitly wished to avoid it (for several reasons, not least the fact that I don't know it well enough). The tool closest to the above requirements (no git, no commit messages, little or no overhead) is RCS. It is file-based and it is exactly what I was looking for. It even avoids the need for a script, provides the previous versions of the file and requires no customization of vi.

The requirements of the question were precise; many opinions have been given, but the question itself is not that opinion-based. Obviously, the same goal can be achieved through a tool or through a script, but this applies in many other cases as well.

In the past I would have said copyfs, and possibly closed the question as a duplicate of https://unix.stackexchange.com/questions/996/do-we-have-an-undo-in-linux/1004#1004 . But copyfs is unmaintained and no longer present in modern distributions. So the past answers mentioning copyfs need to be updated, and this question is a good opportunity to look for a replacement. — Gilles 'SO- stop being evil', Sep 09 '20 at 09:28
Note that if you only ever use Vim or Emacs, they can be configured to make a backup on each save. — Gilles 'SO- stop being evil', Sep 09 '20 at 09:31
If you are trying to learn git, then this may be useful https://cseducators.stackexchange.com/q/2897/204 — ctrl-alt-delor, Sep 09 '20 at 09:40
@Gilles'SO-stopbeingevil' Thanks for your observations. Vim or Emacs can keep a backup of each modification, or does each time the new backup replace the previous one, so that only the last backup is available? — BowPark, Sep 09 '20 at 09:44
@ctrl-alt-delor The link can always be useful. What I suffer from Git is (AFAIK) that I must make several steps (add, commit, push) and I can't automate them. For a plain text file, I would like to avoid this. — BowPark, Sep 09 '20 at 09:45
Git is actually quite simple, especially for local-only use with a simple history, which is what you've described. — larsks, Sep 09 '20 at 22:23
@BowPark Emacs (and probably Vim, I haven’t checked) can be configured to keep multiple backups of modified files. See the Emacs wiki for an example. — Stephen Kitt, Sep 10 '20 at 14:18
@BowPark Any given file must only be added once, and push is only necessary if you want to keep the file history at another place as well. The use case you described only needs a commit after each edit/save. — glglgl, Sep 10 '20 at 14:18
Git is very easy to automate: https://github.com/ralfholly/git-autocommit There also git-annex as a possible all-in-one solution, which encompass multiple utilities and external tools. @BowPark — Nordine Lotfi, Sep 10 '20 at 19:05

holzkohlengrill · Answer 1 · 2020-09-11T13:56:47.513

15

Give `git` a chance

I don't see why it is an issue to use a powerful tool. Just write a simple bash script that runs git periodically (via cron or systemd timers); auto-generate commit messages etc.

As others highlighted in the comments it is - of course - possible to create a local repository (see here and there for more details).

If you prefer to host your own remote repo, you'll need to set up a "Bare Repository." Both git init and git clone accept a --bare argument.
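
A minimal sketch of the "simple bash script" idea above, assuming bash and git are installed. The function name autocommit and the commit-message format are placeholders of mine, not part of any existing tool. It creates a purely local repository on first use and commits only when the file has actually changed:

```shell
#!/bin/bash
# autocommit FILE: hypothetical helper that commits FILE with a timestamped message.
# The body runs in a subshell, so the caller's working directory is untouched.
autocommit() (
    file=$1
    cd -- "$(dirname -- "$file")" || exit 1
    name=$(basename -- "$file")
    # Create a local repository on first use; no remote is involved.
    [ -d .git ] || git init -q || exit 1
    git add -- "$name" || exit 1
    # Nothing staged means nothing changed since the last commit.
    if git diff --cached --quiet; then
        exit 0
    fi
    git commit -q -m "auto-save $(date '+%Y-%m-%d %H:%M:%S')"
)
```

Saved as, say, autocommit.sh, it could be sourced from a cron job or systemd timer that calls autocommit on the file every few minutes. Afterwards, git log -p -- notes.txt shows every recorded diff together with its date.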

Borg backup

I can also recommend borg backup. It offers you:

Timestamps
borg diff (compare between snapshots)
Pruning (get rid of older snapshots - say you want a snapshot for the current month every day but otherwise only one per month)
Encryption (optional)
Compression (optional)
and much more...

The cool thing is that it is very flexible: it is easy to set up but gives you a lot of options if you want them.

I once wrote a quick-start guide which might be of help.

edited Sep 11 '20 at 13:56

answered Sep 09 '20 at 18:30

holzkohlengrill

512

Thanks for your suggestions. I would really like to avoid git, also because I do not have a remote repository. Borg is a very interesting tool. I tried it, and checked out your guide. It seems perfectly suited for the backup of a directory with several (or thousands) of files, but not in my case. Each backup creates a different snapshot, their names must be defined each time; also, AFAIU there's no immediate way to cat a file inside a snapshot (the snapshot must be first mounted or extracted). Correct me if I'm wrong. It's an excellent tool, but I think it does not fit to this case. – BowPark Sep 10 '20 at 10:49
9

@BowPark note that git does not require a remote repository at all. git init leaves you with a fully functional, completely local repository. – Quentin Sep 10 '20 at 12:26
3

Note that there is no requirement for a remote repository to exist when using git. – Eric Sep 10 '20 at 12:29
@Quentin @Eric I had googled several times for a only-local git repository, but without finding clear information. Thanks for this. My knowledge of git is very little, but this could be the occasion to make some practice. In the meanwhile (and for this specific case), I find the RCS solution more straightforward. – BowPark Sep 11 '20 at 10:29
@BowPark, you are right you can only compare between snapshots. For one file a local git repository is probably best. In my automation script I enter the time and date of the snapshot. On the same page you will also find a guide for creating a local git repository. ;-) – holzkohlengrill Sep 11 '20 at 13:38
Yes git is local, but can synchronise to/from (push/pull) a remote. – ctrl-alt-delor Sep 12 '20 at 18:08
The problem with git is not commit; that is easy. The problem starts when you want to do something useful. It is just too hard to use, too inconsistent, and too easy to shoot yourself in the foot with. – ctrl-alt-delor Sep 12 '20 at 18:09

Nordine Lotfi · Answer 2 · 2021-03-25T16:44:45.570

There are a couple of ways that you could do this:

vim
emacs
git
inotify-tools
git-annex (all-in-one solution)

Which are all detailed here:

Using Vim:

I'd recommend Vim's undo history which, despite its name, covers not only undoing actions within an editing session but also the changes you have saved. More here.

Adding the following to your .vimrc:

let vimDir = expand('$HOME/.vim')
let &runtimepath .= ',' . vimDir
" Keep undo history across sessions by storing it in a file
if has('persistent_undo')
    let myUndoDir = vimDir . '/undodir'
    " Create the directory (and any missing parents) on first use
    if !isdirectory(myUndoDir)
        call mkdir(myUndoDir, 'p')
    endif
    let &undodir = myUndoDir
    set undofile
endif

This keeps every change (and undo) permanently under the undodir directory inside vimDir, which by default is either .vim in your home directory or one of the other directories listed in the output of :version (or vim --version on the command line).

For even more control over your undo history, I'd recommend also using Undotree to complement the experience.

Using Emacs:

There is a similarly named package, Undotree, which does similar things. More information on undo history here.

Using Git:

I'd recommend git-autocommit, a small bash script with git as its only dependency, which watches the current git directory (where you launch it) for new or modified files and commits them.

Given the nature of Git, it keeps every change to the file. While this wouldn't suit a production or otherwise serious project, it is a useful solution if you don't mind missing or generic commit messages (which you can always edit or add later on).

Launch it after navigating to the desired git directory (first created with git init in that directory; more info in the official manual) like so:

screen -dmS namehere git-autocommit -i 1 -V

if you're using screen, for tmux:

tmux new -d -s namehere git-autocommit -i 1 -V

otherwise:

git-autocommit -i 1 -V

will suffice if you prefer to not background it.

Using inotify-tools:

I'd recommend inotify-tools, and more specifically inotifywait, which watches a file or directory for changes so that you can then act on them (save a copy somewhere else, etc.).

Here are the flags to use with inotifywait:

inotifywait -r -m -q -e close_write --format %w%f yourdirectorypathhere

and here an example Bash script using the above:

#!/bin/bash
# Run "process" on every file that is closed after being written to
inotifywait -r -m -q -e close_write --format %w%f directorytowatch | while IFS= read -r file; do
    process "$file"
done

Where process can be anything you'd want, like tar if you want to make backup on file modification, or with rclone if you want to upload it somewhere...
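
As one possible "process" for the use case in the question, here is a rough sketch that appends a timestamped diff to a history file on every save. The names record_change and watch_file, and the .last / .history suffixes, are made up for this example. It watches the containing directory rather than the file itself, on the assumption that some editors replace the file on save instead of rewriting it in place:

```shell
#!/bin/bash
# record_change FILE: hypothetical stand-in for "process".
# Appends a timestamped unified diff to FILE.history and keeps
# FILE.last as the snapshot to compare against next time.
record_change() {
    local file=$1
    local last="$file.last" log="$file.history"
    [ -e "$last" ] || : > "$last"      # first run: compare against an empty file
    if ! diff -u -- "$last" "$file" > "$log.tmp"; then
        {
            printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
            cat -- "$log.tmp"
        } >> "$log"
        cp -p -- "$file" "$last"
    fi
    rm -f -- "$log.tmp"
}

# watch_file FILE: call record_change every time FILE is saved.
watch_file() {
    local file=$1
    inotifywait -m -q -e close_write --format '%f' "$(dirname -- "$file")" |
    while IFS= read -r changed; do
        # Ignore other files in the directory, including .last and .history.
        [ "$changed" = "$(basename -- "$file")" ] && record_change "$file"
    done
}
```

Running watch_file ~/notes.txt in a screen or tmux session would then leave a growing ~/notes.txt.history containing each diff and when it was recorded.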

Using git-annex:

I'd recommend git-annex, which encompasses not only Git but also many other external tools, like inotify-tools, bash, tar, rclone, borg etc.

More info here.

If you feel like reading the wiki/forum later on, you can also git clone it locally, for offline reading:

git clone git://git-annex.branchable.com

for the website, forum (it's all in markdown, so it's very fast to download...), and codebase (it's in Haskell!) etc

A massive thank you also for this answer, which provides so much solutions, not only for me, but for anyone else who is interested. — BowPark, Sep 11 '20 at 10:10

jrw32982 · Accepted Answer · 2020-09-11T20:01:41.650

You could try the venerable RCS (package "rcs") as @steeldriver mentioned, a non-modern version control system that works on a per-file basis with virtually no overhead or complication. There are multiple ways to use it, but one possibility:

Create an RCS subdirectory, where the version history will be stored.
Edit your file
Check in your changes: ci -l -m -t- myfile
Repeat

If you store this text in your file:

$RCSfile$
$Revision$
$Date$

then RCS will populate those strings with information about your revision and its datestamp, once you check it in (technically, when you check it out).

The file stored in RCS/ will be called myfile,v and will contain the diffs between each revision. Of course there's more to learn about RCS. You can look at the manpages for ci, co, rcs, rcsdiff and others.

Here's some more information:

If you skip creating the RCS/ directory, then the archive will appear in the same directory as your file.
You "check in" a file with ci to record a version of it in the archive (the *,v file in the RCS/ directory). Check-in has the weird side effect of removing your file, leaving your data only present in the *,v archive. To avoid this side effect, use -l or -u with the ci command.
You "check out" a file with co to reconstitute it from the archive.
You "lock" a file to make it writable and prevent others from writing to it, which would create a "merge" situation. In your case, with only one user modifying the file, "locked" means writable and "unlocked" means read-only. If you modify an "unlocked" file (by forcing a write to it), ci will complain when you try to check the changes in (so, avoid doing that).
Since you're the only one editing your file, you have a choice of scenarios: you can keep your file read-only (unlocked) or writable (locked). I use unlocked mode for files that I don't expect to change often, as that prevents me from accidentally modifying them, because they're read-only, even for me. I use locked mode for files that I'm actively modifying, but when I want to keep a revision history of the contents.
Using -l with ci or co will lock it, leaving it writable. Without -l it will be read-only with co or it will be removed altogether with ci. Use ci -u to leave the file in read-only mode after checking its contents into the archive.
Using -m. will prevent ci from asking for a revision message.
Using -t- will prevent ci from asking for an initial message (when the archive file is first created).
Using -M with ci or co will keep the timestamp of a file in sync with the timestamp of the file at the time of check-in.
co -r1.2 -p -q myfile will print revision 1.2 of myfile to stdout. Without the -p option, and assuming that myfile is "unlocked" (read-only), then co -r1.2 myfile will overwrite myfile with a read-only copy of revision 1.2 of myfile. -q disables the informational messages.
You can create "branches", with revisions like 1.3.1.1. I don't recommend this as it gets confusing fast. I prefer to keep with a linear flow of revisions.

So, if you prefer to keep your file always writable, you could use ci -l -M -m -t- myfile. You can use rcsdiff myfile to see the differences between the current contents of myfile and the most recent checked-in version. You can use rcsdiff -r1.2 -r1.4 myfile to see the differences between revisions 1.2 and 1.4 of myfile.

The archive file is just a text file, whose format is documented in man rcsfile. However, don't attempt to edit the archive file directly. IMO, the text-based archive file, the absolute minimal extra baggage (only a single archive file), and keyword substitution are RCS's biggest strengths and what makes it a great tool for local-only, single-user, single-file-at-a-time versioning. If I were redesigning RCS, I would remove the complications beyond this scenario (e.g. multi-user, branching), which I think are better handled by more modern distributed version control systems.

As with any command, there are some quirks; you should play around with test files until you understand the workflow you want for yourself. Then, for best results, embed your favorite options into a script so you don't have to remember the likes of -t-, for example.
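
For instance, the options could be wrapped up roughly like this. It is only a sketch: the function names snap and show_at and the fixed "autosave" log message are my own placeholders, and it assumes the rcs package is installed.

```shell
#!/bin/bash
# snap FILE: check in the current contents and keep FILE writable.
snap() {
    # -l   keep the lock, so the working file stays in place and writable
    # -M   keep the working file's timestamp in sync with the check-in
    # -m   log message (a fixed placeholder here, so ci never prompts)
    # -t-  empty description when the archive is first created
    # -q   suppress informational messages
    ci -q -l -M -m"autosave" -t- "$1"
}

# show_at FILE DATE: print the latest revision checked in on or before DATE.
show_at() {
    # -zLT makes co interpret the date in local time rather than UTC
    co -q -p -zLT -d"$2" "$1"
}
```

After each save, snap myfile records a new revision; rlog myfile lists the revisions with their dates, and show_at myfile "2020-09-10 12:00" prints the file as it was at that time.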

Thanks for your answer. I tried to read man ci for your options, but still can't understand them very well. Could you (even briefly) explain them? And: given a specific revision of the file, how can I simply print the whole file at that point (not to modify it, but only to show it)? Something like rcs print -r 1.2. — BowPark, Sep 10 '20 at 11:16
Doesn't this answer break the requirement that the changes be recorded automatically? — David Z, Sep 10 '20 at 22:56
One can use what i mentioned in my post, in conjunction with jrw32982's answer for recording automatically changes, namely, inotifywatch(from inotify-tools) @DavidZ As you can check for close_write event on file, and then use ci -l -m -t- myfile. — Nordine Lotfi, Sep 10 '20 at 23:08
@NordineLotfi Yes, exactly, and if this answer had described how to use inotifywatch to make this process automatic, I wouldn't be commenting on it. But it doesn't. — David Z, Sep 10 '20 at 23:11
disregard the comment i deleted, thought you meant this for my post :D @DavidZ — Nordine Lotfi, Sep 10 '20 at 23:18
Use -r1.2, not -r 1.2. Sorry for the quirk! :-) So, co -p -q -r1.2 myfile to print to stdout. — jrw32982, Sep 11 '20 at 02:07
Well, this brought back bad memories of just what a beast RCS was. Git seems much simpler to use. — Michael Hampton, Sep 11 '20 at 04:29
A massive thank you @jrw32982supportsMonica for the update to your answer. I struggled to understand the locked/unlocked concept, which is now more clear. — BowPark, Sep 11 '20 at 10:04
@NordineLotfi Your suggestion is very useful. Yes, with inotifywatch it can be easily automatized (or with any other one-liner; it's easy to meet this requirement even if the present answer doesn't explicitly mention it). — BowPark, Sep 11 '20 at 10:09
@MichaelHampton IMO, git's complexity is way off the charts compared to RCS. The ideas underlying git are simple enough, but the awful, inconsistent UI ruins that simplicity, turning it into an swamp of special cases. Additionally, git is about as opaque an interface as you can get; it's very hard to know what's really going on when you run each command, and all changes to your state are binary. RCS is trying to solve a much simpler problem and, although it has its flaws and quirks, is much simpler and much more transparent. I understand exactly where BowPark is coming from. — jrw32982, Sep 11 '20 at 20:10

score 7 · Answer 4 · answered Sep 10 '20 at 14:25

Another approach, which will cover all files in a given file system, is to use a log-structured file system such as NILFS.

Such file systems append changes, they don’t replace data. This is effectively equivalent to continuous snapshotting (or rather, checkpointing), and allows you to revisit the file system at various points in the past. Older checkpoints are liable to be garbage-collected once they reach a certain age, but checkpoints can be turned into permanent snapshots, and it is possible to automate that, for example to keep one snapshot per hour for the last month, then one per day for six months, then one per week etc.

NILFS is well-supported on Linux and is quite effective when used for /home.

+1 because this is very interesting, too. Thank you. I think it's too much in this case, but can be used in several other scenarios. — BowPark, Sep 10 '20 at 16:17

score 4 · Answer 5 · answered Sep 09 '20 at 22:32

Gitfs gives the best of both worlds, at least if it works for you. It provides a view of a git repository where you can edit files and every version is committed automatically.

mkdir mnt
gitfs https://example.com/repo.git $PWD/mnt -o repo_path=$PWD/working_copy

After this, you can edit files in mnt/current, and every version of the files will be automatically committed to git and will also be accessible through mnt/history/*/*.

Note that the first argument must be a remote repository. Gitfs doesn't seem to work with a local repository: if it's bare, it instructs git to access the origin remote which doesn't exist, and if it's non-bare, Gitfs tries to push to it, which fails, and Gitfs won't tell you except via a debug message so you lose all your changes.

A word of caution: gitfs seems rather buggy and poorly maintained. Silent failures are a common problem (pass -o log=-,debug=true,foreground=true,… to attempt to diagnose). You need to enable user_allow_other in /etc/fuse.conf, because the argument manipulation is buggy. It's the right concept, but I can't recommend it unless somebody takes up maintainership and fixes it up (I'm not volunteering).

score 3 · Answer 6 · answered Sep 09 '20 at 11:28

I used a simple Bash script, possibly called vers. Synopsis:

Run my usual editor on the files passed as args. After all editing:
For each file, look for the highest name like file.V[0-9][0-9].
If the checksum differs, cp -p the file to the next higher version, and chmod 400 it.
If no previous version, make file.V01
Declutter any excess to a _VERSIONS subdirectory.

Helpful for establishing what you changed, and on what days.
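
The original script is not available, so the following is only a hypothetical reconstruction of the core step (the function name save_version is mine). It compares with cmp instead of a checksum, skips the editor launch and the _VERSIONS decluttering, and stops working past V99:

```shell
#!/bin/bash
# save_version FILE: copy FILE to the next FILE.Vnn if it differs from the latest one.
save_version() {
    local file=$1 latest="" next n v
    # Find the highest existing FILE.Vnn (the glob sorts lexically, so the last match wins).
    for v in "$file".V[0-9][0-9]; do
        [ -e "$v" ] && latest=$v
    done
    if [ -z "$latest" ]; then
        next="$file.V01"
    else
        # Unchanged since the last saved version: nothing to do.
        cmp -s -- "$file" "$latest" && return 0
        n=${latest##*.V}
        # 10# forces base 10, so that 08 and 09 are not read as octal.
        next=$(printf '%s.V%02d' "$file" $((10#$n + 1)))
    fi
    # cp -p preserves the modification time; chmod 400 makes the copy read-only.
    cp -p -- "$file" "$next" && chmod 400 "$next"
}
```

Calling save_version notes.txt after each editing session leaves notes.txt.V01, notes.txt.V02, ... whose timestamps show when each change was made; diff notes.txt.V01 notes.txt.V02 shows what changed.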

answered Sep 09 '20 at 11:28

Paul_Pedant

8,679

recommend posting the bash script in question, in a code block to help even more :D. – Nordine Lotfi Sep 11 '20 at 21:37
1

@NordineLotfi The script got left on a client site long ago, and would probably have been careless on filenames with spaces etc. I still use the manual sequence: ls -ltr to check for next version, diff to check for unwanted debug, cp -p to record time of last update, chmod 400 to prevent execution. I started a better version, but requirements creep has got me: multi-file, cksum for changes, too many options, etc. – Paul_Pedant Sep 18 '20 at 16:30

score 2 · Answer 7 · answered Sep 15 '20 at 01:29

You might want to take a peek at src (it stands for Simple Revision Control), a lightweight system to manage independent files (not whole projects). It is available on Debian (small wonder), or easy to install yourself.

answered Sep 15 '20 at 01:29

vonbrand

18,253

Olivier Dulac · Answer 8 · 2020-09-10T09:24:43.990

You may want to create a function that does the tedious part for you?

For example (note: I do not know git yet, so treat the git commands as placeholders that you may want to change to make it work)

myvim () {
    vim "$@"   # pass all arguments through to vim (this allows you to add options, too)
    # Auto-commit once the editor exits. Assumes you are in the directory that
    # holds (or should hold) the repository; git init is harmless if one already exists.
    # Note: "$@" may also contain vim options, which git add would reject; filter them out if needed.
    git init -q && git add -- "$@" && git commit -q -m "auto-save $(date '+%F %T')"
}

You may want to replace the ";" between the commands with "&&" to ensure the next step is only done if the previous one returned 0

ctrl-alt-delor · Answer 9 · 2020-09-12T18:25:04.943

0

You are correct that revision control is the way to go. Yes git is way too complex. There are other revision control tools. The modern easy to use ones are Subversion (svn) and Mercurial (hg).

subversion is non-distributed and non-locking by default (though it supports locking), and has a global revision number.
mercurial is distributed, non-locking (is like git, but easier to use).

Both are easy to use.

What is hard about `git`

Many people have commented that git is no harder to use than other tools with respect to init, add, commit, and push. This is nearly true (except that for git to do a commit you need to do add and then commit, which is 200% of the essential effort).

The problem comes when you start doing what a revision control system is designed for. Up to this point you have gained no benefit. You are only investing in revisions, in the hope that they will be useful, in the same way that making a backup is of no benefit in itself (restoring a backup is).

The difficulty comes with analysing history and time travelling. There are many gotchas here, and there are no good free graphical tools. svn and hg are much better behaved, have a good easy to use command line and GUI.

edited Sep 12 '20 at 18:25

answered Sep 09 '20 at 09:36

ctrl-alt-delor

27,993

I don't know subversion, neither mercurial, so I can give them a look. The problem, as for Git, may be the fact that it's not possible to automate the required steps (choose a commit message each time, commit, etc.). – BowPark Sep 09 '20 at 09:47
Git is too hard, see the comment. Mercurial works in a similar way, but is easy to use. – ctrl-alt-delor Sep 09 '20 at 09:58
4

For local-only versioning, there's even the venerable RCS ... – steeldriver Sep 09 '20 at 12:51
6

@ctrl-alt-delor: how is git init, git add and git commit hard to use? – Arkadiusz Drabczyk Sep 09 '20 at 19:45
2

I don't see a solution here to avoiding the "add", "commit", or "push" steps, which is what the question asks for. Regardless of whether or not you personally feel something is "easy to use", does it avoid those steps? If so, could you please [edit] your answer to explain? – Dan Getz Sep 09 '20 at 20:23
7

"Yes git is way to complex. There are other revision control tools. The modern easy to use ones are. Subversion (svn) and mercurial (hg)." - The basic steps that the OP needs are virtually identical in subversion, mercurial, and git. Your "git is too hard claim" seems without merit, for this situation. – marcelm Sep 09 '20 at 20:37
@BowPark Have your bash-script-that-wraps-the-edit-with-version-control accept a message stub or just put in a (redundant) username or timestamp. "auto-saved by script" or some such. – mpez0 Sep 10 '20 at 00:21
1

fossil is another version control system, similar to git, svn, mercurial – jrw32982 Sep 10 '20 at 02:54
@mpez0 Sorry, I can't understand your question. Can you please rephrase it? – BowPark Sep 10 '20 at 07:29
@BowPark Your problem with git is the requirement for a check-in comment. Have the script (that does check-out, check-in, start editor, etc.) provide one. – mpez0 Sep 10 '20 at 13:51
@BowPark git can be automated: use bash scripts. – ctrl-alt-delor Sep 12 '20 at 18:12
1

@ArkadiuszDrabczyk init, add and commit are not hard. add followed by commit is one too many steps (100% more steps than needed). The real problem is when you try to do the more interesting stuff. I do wonder if most people ever get to the interesting stuff. It is just too much brain tax with git. – ctrl-alt-delor Sep 12 '20 at 18:15

score 0 · Answer 10 · answered Mar 31 '23 at 16:22

? notepad++ according to this thread, which seems semi reliable https://answers.microsoft.com/en-us/windows/forum/all/notepad-edit-history/29de4e36-fd5b-48f2-8aea-b92992f9f8ba

answered Mar 31 '23 at 16:22

Ezra Abrams

1

2

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review – Peregrino69 Mar 31 '23 at 16:43
