
I have a script that does a complicated shutdown and restore procedure. It has some long running commands executed in it. Some take about ten minutes, some are faster.

Is there some piece of software that will allow me to put all of these commands in a file and execute them with checkpointing? For example, it would let me execute them one by one and remember which one was executed last, or just run them all and, if one of them breaks, retry from that point onwards.

Petko M

1 Answer


After executing each long-running command, you need to remember that it's been executed. The way to remember that something has happened is to store the information in a file.

If the effect of one of these commands is to create a file, then you can test for this file's presence. There are a couple of things to beware of:

  • If the command creates an output file and populates it in several stages, and it fails in the middle, then you'll end up with a partial file. The remedy is to make the command write to a temporary file, and rename the temporary file to the final name after the command finishes successfully.
  • If the output file already existed before the command ran, you need a way to tell whether the output file is up-to-date. This is typically resolved by comparing the modification times of the input files and the output file: if the output file is newer than every input file, it is considered up-to-date and the command can be skipped.
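
As a rough illustration of both points in plain shell, here is a minimal sketch. The helper names needs_rebuild and generate are hypothetical, not part of any existing tool, and the -nt test is an assumption about your shell: bash, dash and ksh support it, but older POSIX specifications do not require it.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# needs_rebuild INPUT OUTPUT
# Succeeds if OUTPUT is missing or INPUT has a newer modification time.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# generate INPUT OUTPUT COMMAND [ARGS...]
# Runs COMMAND with INPUT on stdin, writing to OUTPUT.tmp, and renames
# OUTPUT.tmp to OUTPUT only if COMMAND succeeded. A failed run therefore
# never leaves a partial OUTPUT behind.
generate() {
    input=$1
    output=$2
    shift 2
    if needs_rebuild "$input" "$output"; then
        if "$@" <"$input" >"$output.tmp"; then
            mv -- "$output.tmp" "$output"
        else
            rm -f -- "$output.tmp"
            return 1
        fi
    fi
}
```

For example, generate A B mycommand would run mycommand only when B is missing or older than A.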

All of this is the basic job of build automation tools. You can use the traditional tool, make. It's built around the concept of “if file B is built from file A, and A is newer than B or B doesn't exist, then run the command to generate B from A”. The syntax for that is

B: A
        command-to-generate-B-from-A

where the second line starts with a tab character, not with spaces.

If the command doesn't create its output file atomically, then make it write to a temporary file:

B: A
        mycommand <A >B.tmp
        mv B.tmp B

Note that if mycommand fails, make won't execute the second command, so B is not created.

To create B, run the command make B. If B already exists and is newer than A, make won't run mycommand again.

Make chains rules automatically. If you have a rule to build B from A and a rule to build C from B, and only A exists, then make C automatically builds B then C.

If there's no file that shows that the command has run, you can create an empty one just to remember.

command1.timestamp:
        command1
        touch command1.timestamp
command2.timestamp: command1.timestamp
        command2
        touch command2.timestamp

Running make command2.timestamp does nothing if both timestamp files exist and command2.timestamp is not older than command1.timestamp. Otherwise it first runs command1 unless command1.timestamp already exists, then runs command2.
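
If you would rather keep everything in a single shell script, the same timestamp-file idea can be sketched without make. This is only a sketch under stated assumptions: run_step is a hypothetical helper name, and, unlike make, it only checks whether the timestamp file exists and does not compare modification times.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# run_step NAME COMMAND [ARGS...]
# Skips the step if NAME.timestamp exists. Otherwise runs COMMAND and
# creates NAME.timestamp only if COMMAND succeeded.
run_step() {
    name=$1
    shift
    stamp="$name.timestamp"
    if [ -e "$stamp" ]; then
        echo "skipping $name (already done)"
        return 0
    fi
    "$@" && touch -- "$stamp"
}

# Example usage (command1 and command2 are the placeholders used above):
#   run_step command1 command1 || exit 1
#   run_step command2 command2 || exit 1
```

Rerunning the script after a failure skips the steps that already completed and resumes at the one that failed. To start over from scratch, delete the .timestamp files.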