
I have computer 1 logging voltage data to a file volts.json every second.

My second computer connects via ssh and grabs that file every 5 minutes. Splunk indexes that file for a dashboard.

Is scp efficient for this? If so, fine. Next: how do I manage the file and keep it from growing past, say, 2 MB? Is there a command to roll off the earlier log entries and keep only the newest?

The JSON looks like this right now:

{
  "measuredatetime": "2022-06-27T18:00:10.915668",
  "voltage": 207.5,
  "current_A": 0.0,
  "power_W": 0.0,
  "energy_Wh": 2,
  "frequency_Hz": 60.0,
  "power_factor": 0.0,
  "alarm": 0
}
{
  "measuredatetime": "2022-06-27T18:00:11.991936",
  "voltage": 207.5,
  "current_A": 0.0,
  "power_W": 0.0,
  "energy_Wh": 2,
  "frequency_Hz": 59.9,
  "power_factor": 0.0,
  "alarm": 0
}
  • Yes, scp is ok, you might want to add -C if you're not doing it already, since that kind of data will be compressed a lot. The other questions depend on the program doing the logging and the program doing the rendering. You could also mount via sshfs and follow the file directly, for example. – Eduardo Trápani Jun 28 '22 at 00:49
  • How is the data logged? Is the program that logs the data keeping the file open, or does it re-open the file for writing every time? Is it seeking to the end and then writing, or is it appending? How crucial is it that you get every data point, or is it ok if one or two entries are dropped at the time of rotating the log? – Kusalananda Jun 28 '22 at 07:13
  • You can use rsync, as stated here; rsync supports lots of compression options – k.Cyborg Jun 28 '22 at 11:57
  • @EduardoTrápani Thanks, I didn't know about sshfs; that is actually cool.

    @Kusalananda It's a Python script writing to the file in append mode.

    @k.Cyborg Thanks, I thought about rsync but wasn't sure.

    – Tom Jun 28 '22 at 12:21

1 Answer

  • To keep directories synchronized over ssh, the typical tool is rsync.
  • To rotate log files and cap disk usage, logrotate is purpose-built.
  • To secure an unattended simple task over ssh, a forced command in .ssh/authorized_keys is an excellent practice.

Example:

  • set up an /etc/logrotate.d/volts file (modeled on the classic syslog settings)

  • create a task-dedicated key pair with ssh-keygen; in this particular case, you do not want a passphrase, since security is ensured by the authorized_keys restrictions

  • in .ssh/authorized_keys, set:

    command="rsync --server --sender -logDtpre.iLsf . /path/to/volts/" ssh-rsa AAAAB3NzaC1yc2E[...pubkey...] blabla
    
  • on the other side, in crontab, set

    rsync -e "ssh -i /path/to/privatekey" -a otherhost:/path/to/volts/ /path/to/volts
    
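For the first bullet, a minimal /etc/logrotate.d/volts could look like the following. The path, retention count, and size threshold are illustrative assumptions, not values from the question; copytruncate is used because the Python writer keeps the file open in append mode, at the cost of possibly losing a line or two written during the copy. Note that size-based or hourly rotation only fires as often as logrotate itself is run from cron.

```
/path/to/volts/volts.json {
    hourly
    size 2M
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
```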

On computer 1, you could also replace the log file with a named pipe and write a daemon script that consumes the stream and writes safely to a file (e.g. using a lock to manage concurrent I/O), giving you good control over data integrity.
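Since the writer is already a Python script in append mode, a lighter variant of the same idea is to guard each append with an advisory lock, so that a consumer or rotator taking the same lock never sees a half-written line. A minimal sketch, assuming Linux/POSIX flock semantics; the path and record fields are illustrative:

```python
import fcntl
import json


def append_record(path, record):
    """Append one JSON object per line under an exclusive advisory lock.

    A consumer that takes the same flock() on the file before renaming
    or truncating it will never observe a partially written line.
    """
    line = json.dumps(record) + "\n"
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks while a consumer holds the lock
        f.write(line)
        f.flush()                        # push the line to the OS before unlocking
        fcntl.flock(f, fcntl.LOCK_UN)
```

Advisory locks only help if both the writer and the consumer use them; they do not stop an uncooperative process from reading mid-write.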

  • Thanks for the details. I was thinking about logrotate but had never used it and didn't know all the options; coming back this morning, your suggestion makes sense. I also thought about rsync but didn't want to complicate the syncing; it looks elegant. I already have ssh working with keys, so scp works well. I am logging every 10 s but want to do 1 s. Over the past 12 hours I have a 1.4 MB uncompressed file syncing. But since Splunk indexes every 5 minutes (for now), I don't need the older data, so I can probably set logrotate to hourly and deal with KBs. – Tom Jun 28 '22 at 12:29
  • @Zippy if disk space is critical on computer 1, you could make an ssh servlet (using an authorized_keys forced command) that reads and resets the logfile, provided that concurrent access between writer and consumer is managed. That way computer 1 needs no logrotate and disk usage stays as small as possible. It's a simple FIFO design, depending on what you know about the writer process. – Thibault LE PAUL Jun 28 '22 at 13:38
  • If the writer process uses shell >> redirection or similar, the consumer may just rename-read-unlink the logfile. – Thibault LE PAUL Jun 28 '22 at 13:56
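The rename-read-unlink idea above can be sketched as a small shell function. This is a hypothetical consumer, with made-up paths, and it is only safe when the writer re-opens the file for every append (e.g. shell >> per write), so the writer's next append simply recreates a fresh log after the rename:

```shell
rotate_and_dump() {
    log=$1
    # 1. rename: the writer's next append recreates a fresh $log
    mv "$log" "$log.rot" 2>/dev/null || return 0
    # 2. read: hand the rotated slice to the consumer (here, just print it)
    cat "$log.rot"
    # 3. unlink: reclaim the space
    rm -f "$log.rot"
}
```

A writer that keeps the file open in append mode (like the Python logger above) would keep writing to the renamed inode, so for that case prefer the locking or copytruncate approaches instead.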