
I want to run a Postgres dump from a remote server, which will take a couple of hours. I am using a bash script on the target to do so, and I start it with at.

Now, in order to prevent things from breaking I use nohup, but I am not sure whether this is even needed (given that the script is started with at), and whether it would be better to use it in the pg_dump command directly (see below) or rather to start the script with nohup and skip it in the command itself.

The command in the script is currently:

nohup nice pg_dump -h ${SOURCE} -d ${DB_NAME} -U ${DB_USER} -v -Fd -j 2 \
-f ${DUMPDIR}/${DUMPFILE}_"${NOW}" >  ${DUMPDIR}/${DUMPFILE}_"${NOW}".out
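
For reference, the script is queued roughly like this (the script path is illustrative, not the real one):

# at detaches the job from any controlling terminal, so there is no
# interactive session whose hang-up nohup would need to guard against.
echo "/usr/local/bin/pg_dump_backup.sh" | at now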
vrms
  • why would you want to use nohup with at? There's no interactive terminal associated with things run from at anyway. So, which problem are we actually solving here? – Marcus Müller Jun 08 '23 at 10:08
  • nice isn't likely to be of much use, either, as pg_dump will be mostly I/O bound. The real concern is all those unquoted variables in the script (a quoted rewrite follows these comments). See $VAR vs ${VAR}, to quote or not to quote, and Why does my shell script choke on whitespace or other special characters? Also Security implications of forgetting to quote a variable in bash/POSIX shells – cas Jun 08 '23 at 10:39
  • BTW, this could be sped up by a) writing the .out file to a different disk (not just a different partition, but a different drive, perhaps even to another system via ssh) and b) piping the output of pg_dump through gzip or xz or some other compressor before redirecting to the output file (trading I/O for CPU). Another way to speed up pg backups is to replicate the database(s) to a secondary server and run your backups on the secondary - the primary server will have an ongoing small hit to I/O performance but won't be affected at all during the backup. – cas Jun 08 '23 at 10:44
  • b) I'd advise a bit against piping through xz, which isn't really known for its speed, so it might introduce a new bottleneck. gzip is faster, but still single-threaded (use Adler's pigz instead whenever possible), and its compression/speed ratio is still pretty bad by modern compressors' standards. zstd -3 typically outperforms gzip/zlib in every respect: faster than gzip -1 (and any higher gzip compression setting), and better at compressing than gzip -9 (and any faster gzip setting); see the sketch after these comments. If you need even faster compression than zstd, lz4 is also an option for some requirements. – Marcus Müller Jun 08 '23 at 11:11
  • I think the replication approach is the most correct one, isn't it? Replicate, freeze, dump. But replicating usually needs an initial backup of the main databases, so you're back to needing to dump that? – Marcus Müller Jun 08 '23 at 11:12
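
As a sketch of the quoting fix suggested in the comments (same variables as in the question; the 2>&1 is an added assumption so that errors land in the log as well):

nohup nice pg_dump -h "${SOURCE}" -d "${DB_NAME}" -U "${DB_USER}" -v -Fd -j 2 \
    -f "${DUMPDIR}/${DUMPFILE}_${NOW}" > "${DUMPDIR}/${DUMPFILE}_${NOW}.out" 2>&1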
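And a sketch of the compression idea: the directory format (-Fd) writes multiple files and cannot be piped, so this assumes switching to a plain-format dump (-Fp); zstd's -T0 uses all available cores:

# Trade CPU for I/O: stream the dump through zstd before it hits disk.
pg_dump -h "${SOURCE}" -d "${DB_NAME}" -U "${DB_USER}" -Fp \
    | zstd -3 -T0 > "${DUMPDIR}/${DUMPFILE}_${NOW}.sql.zst"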

1 Answer


While nohup is useful for allowing a job to continue beyond your session in interactive use, that does not apply here. The only reasons the process is likely to receive a HUP are:

  1. Something has gone wrong and the backup being generated will be of no use

  2. The host is shutting down or going into maintenance mode

In either case you really do want the program to stop gracefully, so I would suggest that including nohup here is worse than redundant: it actually poses a risk to the integrity and availability of your backups.
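
If anything, the useful addition runs the other way: have the script clean up a partial dump when it is interrupted. A minimal sketch, assuming the same variables as in the question (the signal list is illustrative):

#!/bin/bash
# On hang-up, interrupt or termination, remove the partial dump
# directory so a half-written backup is never mistaken for a good one.
trap 'rm -rf "${DUMPDIR}/${DUMPFILE}_${NOW}"; exit 1' HUP INT TERM

nice pg_dump -h "${SOURCE}" -d "${DB_NAME}" -U "${DB_USER}" -v -Fd -j 2 \
    -f "${DUMPDIR}/${DUMPFILE}_${NOW}" > "${DUMPDIR}/${DUMPFILE}_${NOW}.out" 2>&1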

symcbean