
I have a directory on a CIFS share with 10000+ files that contains a number of server log files in the format CCLLLTTTFFFFNNN YYYY-MM-DD at a minimum, where:

  • Server name consisting of:
    • CC = ISO country code
    • LLL = Location (IATA code of closest city)
    • TTT = WIN or UNX (which is why it's a CIFS share)
    • FFFF = Function of the server
    • NNN = number
  • a space
  • a date
  • sometimes some more text containing words with spaces
  • an extension (always)

Someone who's no longer working for the company set this up, all servers globally dump their daily logs there, and it takes forever to load the list of files! Everyone who needs logs grumbles about it, but no one ever does anything, so I started doing something about it just for me.

The idea:

Instead of one long list of files, why not have a list of directories (at least 2 orders of magnitude shorter) named after the servers, and cron a script daily that moves each file into its server's directory? ¹
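The idea above could be sketched as a single bash function to run daily from cron. The "server name = everything before the first space in the filename" rule is taken from the format description; the share path in the example invocation is an assumption:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: create one directory per server name and move each
# "SERVER date [more text].ext" log file into it.
sort_logs() {
  local share="$1"
  local f base server
  for f in "$share"/*' '*; do          # every entry with a space in its name
    [ -f "$f" ] || continue            # skip directories and non-matches
    base="${f##*/}"                    # strip the leading path
    server="${base%% *}"               # server name = text before first space
    mkdir -p -- "$share/$server" &&    # -p: no error if it already exists
      mv -- "$f" "$share/$server/"
  done
}

# Example invocation (the path is an assumption):
# sort_logs /mnt/logshare
```

Because `mkdir -p` is a no-op on an existing directory, the same function handles both the initial cleanup and the daily incremental runs, and new servers get their directory created automatically the first time they drop a log.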

What do I have?

  • bash
  • Access to gcc
  • Write access to the CIFS share (obviously)
  • Manjaro, an Arch derivative
  • OpenOffice

What have I done so far?

  • ls /mnt/logshare/*UNXSAP* > ~/Documents/logs/logshare.txt
  • Import logshare.txt into OpenOffice Calc
    • create a directory with the server name
    • Generate a ton of mv commands using Calc and formulas
  • copy-paste that into a shell-script
  • execute shell-script
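The Calc step could be replaced with a short pipeline that prints the same mkdir/mv commands straight from the file list, to be reviewed and then piped to sh from inside the share. A hedged sketch (`gen_mv_cmds` is a made-up name, and `-printf` assumes GNU find, which Manjaro has):

```shell
# Print, for each "SERVER rest-of-name" file directly inside "$1", the same
# commands Calc was used to generate. Review the output, then run it from
# within the share directory (the filenames are printed without a path).
gen_mv_cmds() {
  find "$1" -maxdepth 1 -type f -name '* *' -printf '%f\n' |
    awk '{ printf "mkdir -p -- \"%s\" && mv -- \"%s\" \"%s/\"\n", $1, $0, $1 }'
}

# Example (paths are assumptions):
# cd /mnt/logshare && gen_mv_cmds . | sh
```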

But:

  • I've become a victim of my own success
  • The security and application group have seen my directories crop up and want me to not be such an egoist and do that for everyone.
  • No real devs, no real scripters available.
  • I've been thinking about this for a week and wouldn't even know where to start. awk? find? Start writing C-code again? (Haven't done that in 20+ years. Unfortunately, I've become what I always dreaded: a suit...) ;-(
  • Whenever a new server gets added, a directory should be created automatically
  • script should be run daily
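For the "run daily" part, a crontab entry (via crontab -e) along these lines would do; the script path, log path, and 02:15 schedule are made-up examples:

```shell
# m  h  dom mon dow  command
15   2  *   *   *    /usr/local/bin/sort_logshare.sh >> /var/log/sort_logshare.log 2>&1
```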

Is there anyone out there who has solved this already for their own server / data file collection and has such a bash script (C-source?) handy already that I can modify? and if not: helpful hints, please?

Note 1: Yes, the intelligent thing to do would be for the servers to dump their logs into a directory named after the server name, but that's a roll-out, a CAB, and other headaches like mobilising all the world-wide server admins...

muru
Fabby
  • @glenn-jackman My apologies and thanks. :+1: – Fabby Sep 14 '21 at 23:45
  • 1
    What would these subdirectories be called and what criteria should we use to identify which log files are you be moved to which directories? Have you considered that it would make more sense to have someone in to modify the logging itself to write directly to these new subdirectories? – Chris Davies Sep 15 '21 at 00:06
  • @roaima Thank you, clarified question: Directory name = name of server. Yes, thought of that; see Note 1 – Fabby Sep 15 '21 at 00:08
  • 1
    You don't seem to define the "server name" anywhere explicitly. Can I infer that it's CCLLLTTTFFFFNNN? – Chris Davies Sep 15 '21 at 00:13
  • Thanks again. Adapted again. I should go to sleep @roaima :zzz: – Fabby Sep 15 '21 at 00:17
  • 7
    I'd be more inclined to try to answer this if it didn't have the lame pr0n joke. This should be relatively easy to do with a shell, awk, or perl script, without needing openoffice. BTW, Don't Parse ls - in short, use find instead of ls. – cas Sep 15 '21 at 00:45
  • @muru I wouldn't have expected you of all people to have a SOHF... ;-) – Fabby Sep 15 '21 at 09:30
  • 1
    @Fabby I did wait till you got a couple of answers :P – muru Sep 15 '21 at 10:28
  • @cas In the mean time it's been removed by the powers that be but it's not a joke: it's reality... – Fabby Sep 15 '21 at 11:18

2 Answers


Creating directories for each server

find /var/mnt/logshare/ -maxdepth 1 -type f | cut -d' ' -f1 | sort -u |
while IFS= read -r server ; do
  mkdir -p -- "$server"
done

What we do:

  • search for files (and only files, -type f) in the logshare directory and not its subdirectories (-maxdepth 1)
  • server names are the first space-separated segment of the filename, so we cut that out
  • sort and keep unique entries only (sort -u)
  • for each hit, create a directory

Moving files accordingly

find /var/mnt/logshare -mindepth 1 -maxdepth 1 -type d |
while IFS= read -r server ; do
   mv -- "$server "* "$server"/
done
  • searches for directories in the logshare and no deeper (-mindepth 1 excludes the share itself)
  • directory names equal the first part of the logfile names
  • moves anything that is named like a server but longer (server name followed by a space) into that server's directory
FelixJN
  • I like this because it's very readable and splits the process in 2. Thank you! – Fabby Sep 15 '21 at 10:54
  • Guess what? You splitting it in 2 stages has helped me uncover some naming errors before going to stage 2 of moving all the files in the correct directory! Ping me again in a couple of days so I can give you an additional bounty! (might take a bit of time as I'm not online here all of the time) – Fabby Sep 15 '21 at 13:02

Try,

cd /path/to/files/ || exit
for f in *' '*; do                 # every file with a space in its name
    mkdir -p -- "${f%% *}" &&      # directory named after the part before the first space
      mv -- "$f" "${f%% *}/"
done
pLumo
  • I'm going to try this one too with set -x. – Fabby Sep 15 '21 at 10:57
  • Although I'm pretty sure your algorithm is faster than @FelixJN 's one, I'm going with his because splitting it in 2 stages has helped me uncover some naming errors! – Fabby Sep 15 '21 at 13:04