We have two locations:

  • Linux Machine 1: Directory1 (including all subdirectories and files)
  • Linux SFTP Machine 2: Directory2 (including all subdirectories and files)

I am looking for a solution or tool that can keep both directories in ABSOLUTE sync using SFTP only. If any file or folder is added, modified, or deleted on one side, the change should be replicated on the other side.

Solutions I cannot use: rsync, because it requires an SSH connection (not possible because of security concerns).

Solutions I tried: lftp works smoothly for one-way sync when only one machine is writing or modifying files and folders, but it behaves erratically when both Machine 1 and Machine 2 are writing, modifying, and deleting. I also tried Apache NiFi, but that is overkill; I am looking for a simpler solution, if possible.

Your help is highly appreciated!

  • What are you using to keep them in sync currently? Show us the code. // Don't sftp and rsync both use port 22 ssh access to move the bits? – J_H Aug 14 '23 at 23:41
  • With the restrictions as described any solution is going to be "expensive" in resources. For example you'll need a busy loop running on the local system to list the files on the remote – Chris Davies Aug 15 '23 at 07:15
  • It is possible to restrict certificate-based inbound ssh connections to a single command (unison, for example). Lock that in a chroot and you could be good to go. Would that satisfy your security concerns? – Chris Davies Aug 15 '23 at 07:19
  • I've certainly used servers configured to allow only rsync when accessed using my key (plain key, not CA-signed), and that's what I would recommend. – Toby Speight Aug 17 '23 at 12:02

1 Answer

On machine1, use a command such as find . -type f | sort > files-src.txt to produce a list of the relevant directory1 filenames. Put file lengths and SHA-224 content hashes into that file or a related one. Consider using find's convenient -ls switch, which reports each file's size alongside its name.

Notice that "files-src.txt" will be transferred to directory2 on machine2.

Over on machine2 produce a local "files-dst.txt" file, based only on local file reads.
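A manifest like the ones described above could be built with a short script run on each machine. This is only a sketch in Python rather than the find pipeline mentioned earlier; the function names build_manifest and write_manifest and the tab-separated line layout are assumptions made for illustration, not part of the original answer. It records regular files only, skips symlinks, and assumes the output file is written outside the directory being scanned.

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to (size, sha224 hex)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Mirror `find -type f`: skip symlinks and special files.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digest = hashlib.sha224()
            with open(full, "rb") as fh:
                # Hash in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            rel = os.path.relpath(full, root)
            manifest[rel] = (os.path.getsize(full), digest.hexdigest())
    return manifest


def write_manifest(manifest, out_path):
    """Write one 'size<TAB>hash<TAB>path' line per file, sorted by path."""
    with open(out_path, "w", encoding="utf-8") as out:
        for rel in sorted(manifest):
            size, sha = manifest[rel]
            out.write(f"{size}\t{sha}\t{rel}\n")
```

Running build_manifest on directory1 and writing the result to files-src.txt, then doing the same on machine2 for files-dst.txt, gives two sorted listings that can be compared line by line.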

Now you're in a position to compute differences between files-{src,dst}.txt, similarly to what rsync does in an online context. Deleting dst files which don't appear in src is a piece of cake. Next, ignore dst files whose (length, hash) match the src files. (A hash can only match if the lengths match.) Finally, schedule transfers of all src files which either

  • don't have a corresponding dst file, or
  • have a dst file with a hash mismatch
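The comparison step can be sketched as follows, assuming each manifest has already been loaded into a dictionary that maps a relative path to a (length, hash) pair. The function name plan_sync and the sample data are hypothetical. Note that this covers the one-way src to dst direction described above; it does not by itself resolve edits made on both sides.

```python
def plan_sync(src, dst):
    """Return (to_delete, to_transfer) for a one-way src -> dst sync.

    src and dst map relative path -> (length, hash).
    """
    # dst files that no longer exist in src should be removed.
    to_delete = sorted(path for path in dst if path not in src)
    # src files that are missing from dst, or whose (length, hash)
    # differs, need to be transferred; matching files are ignored.
    to_transfer = sorted(
        path for path, sig in src.items() if dst.get(path) != sig
    )
    return to_delete, to_transfer


# Made-up manifests with shortened placeholder hashes.
src = {"a.txt": (5, "aa11"), "b.txt": (7, "bb22"), "c.txt": (3, "cc33")}
dst = {"a.txt": (5, "aa11"), "b.txt": (7, "ffff"), "old.txt": (1, "dd44")}

to_delete, to_transfer = plan_sync(src, dst)
print(to_delete)    # ['old.txt']
print(to_transfer)  # ['b.txt', 'c.txt']
```

The two resulting lists can then drive whatever SFTP client is available, for example as a batch of rm and put commands.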
J_H