
I am currently working on a large project (PhD thesis) using git for version control. My source files are LaTeX, which is plain text and therefore naturally suited to version control.

The images in the project, however, are binary files (PDF/PS/PNG/TIFF) that do not lend themselves to line-based diffing, and they lead to an ever-increasing repository size.

For these types of data, my university provides a cloud service (Box) that is simply mounted as an ordinary folder on my computer.

The question is the following:

Can a shell script be written that selectively copies the latest version of all files of a specific type (e.g. PDF), while retaining the directory structure of my original project? The script must also work bidirectionally, i.e. automatically pull the files from the cloud folder when cloning the repo onto a new computer or when pulling the latest changes, placing the latest version of each file at the correct relative path. Version conflicts must be handled automatically with a simple strategy: the file with the latest timestamp wins. Perhaps a crontab process could run it?

Does anyone know of a way to manage such binary files across multiple machines alongside a VCS? Git-annex does not work for me, and I do want the (limited) version control provided by services like Box/Dropbox.

Maybe the answer is as simple as hard-linking? But I do not want to manually hard-link each file as it gets generated, renamed, or deleted; I would like the process to be seamless.

  • I think that the first step would be to separate the concept of the source files (latex) and the built artifacts (e.g. pdf). Then there are a few questions to further clarify the best path to accomplishing your objectives:
    1. Will you have access to tools for building on every machine?
    2. Do builds take so long that efficiency demands not rebuilding on each machine?
    3. Is it ever necessary that you have pre-built binaries which differ from what would be built from the latest source?
    – Nathan S. Apr 04 '18 at 19:38
  • Have you considered whether the following answer gets you closer to an acceptable solution? https://unix.stackexchange.com/questions/83593/copy-specific-file-type-keeping-the-folder-structure – Nathan S. Apr 04 '18 at 19:43
  • Do the images/figures in your document change so often that you're concerned about the version control repository size? I'd just include them in the repo and be done with it. – Andy Dalton Apr 04 '18 at 20:02
  • @AndyDalton No, keeping them in the same repo will not work. – Dr Krishnakumar Gopalakrishnan Apr 04 '18 at 20:22
  • @NathanS. That is a manual solution; I need something more automatic. – Dr Krishnakumar Gopalakrishnan Apr 04 '18 at 20:22
  • @Krishna When you get the operation right, you can make a separate consideration of how to trigger execution based on watching events or as on time-intervals. – Nathan S. Apr 05 '18 at 14:43
  • Your own question mentions crontab. That's what will make it automatic. Find the manual solution, put it in crontab, and you have an automatic solution. The only situation where you can't do that is if your manual solution needs different parameters every time, which isn't your case from what I saw in your post. On another note, looks like the best tool for what you need is rsync, look into that. – msb Jul 08 '19 at 17:09
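Following the suggestion in the comments above, the manual command, once it works, can be scheduled with cron. A hypothetical crontab entry (the script path, interval, and log file are all assumptions) might look like:

```shell
# Hypothetical crontab entry: run the sync script every 15 minutes
# and append its output to a log file for troubleshooting.
# Install with `crontab -e`; assumes the script lives at
# ~/bin/sync-thesis-binaries.sh and is executable.
*/15 * * * * "$HOME/bin/sync-thesis-binaries.sh" >> "$HOME/.sync-thesis.log" 2>&1
```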

0 Answers