BASH script best practice

Question

I have just written a bash script and it works as I wanted. That is the script :

#!/usr/bin/env bash
DY=`date +%Y%m%d`

gunzip -c /var/log/cisco/cisco.log-$DY.gz > file.log
sleep 3
cat file.log | grep "Virtual device ath0 asks to queue packet" > file2.log
awk '{print $4}' file2.log > IP.log
sort IP.log | uniq > devices.log
wc -l devices.log
rm file.log file2.log IP.log devices.log

However, because I am new in bash I would ask if there is a better way to do that kind of script (still in bash environment). Any explanations will be very useful to improve my learning.

There's a school of thought that says if you understand it and it works, then it's good. To help guide the answers, here, what kind of "better" are you looking for? Faster? Simpler? One-liner? I can see that you're using several intermediate & temporary files; do you need them, or are you just interested in the line count of devices.log? — Jeff Schaller, Nov 04 '15 at 14:40
Hi Jeff, than you for your reply. I'm using temporary file because that is the only way that I know to do this kind of script. If there is another better way, please let me know (by the way I'm only interested in devices.log)
For other improvement I have no idea! Everything that can come from your experience, knowledge is well accepted.

How you would write that script? — Federi, Nov 04 '15 at 15:06
I have one version in an answer below; without knowing more about your environment, it's hard to say what I'd change. If you might want these device/IP counts for a different day, you could pass the date-string in as a parameter. If you needed other information from the intermediate log files, I'd save them. You might want to cache the information. And on! My main point is that you've solved the problem in a way that you understand; take heed the many suggestions below, including comments and error-checking. — Jeff Schaller, Nov 04 '15 at 15:18
I'm voting to close this question as off-topic because it belongs on Code Review... — jasonwryan, Nov 04 '15 at 15:55

score 4 · Answer 1 · edited May 23 '17 at 12:39

4

Use a commented header explaining what the script does and its usage
Use the POSIX shell (/bin/sh) for portability, often bash is not required for simple scripts
Use variables instead of hard-coded strings
Consider using the $(some_command) syntax instead of backticks
Don't cat into grep, instead use grep <pattern> <file>
Why the sleep?
Get rid of the temporary variables if you don't need the files, use pipes instead
sort | uniq can be replaced by sort -u
If you have to use temp files, consider cleaning up properly.

edited May 23 '17 at 12:39

Community

1

answered Nov 04 '15 at 14:45

Marco

33,548

To add to this, the uniq command is unnecessary; use the -u flag with sort. – chepner Nov 04 '15 at 15:04

score 2 · Answer 2 · answered Nov 04 '15 at 14:41

2

I've recently found it helpful to use Unofficial bash strict mode:

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'

This set of parameters really helps to reduce surprises from unset variables, among other things.

answered Nov 04 '15 at 14:41

bash might not be located in /bin/, the OP did the right thing and used /usr/bin/env bash. That's more portable. – Marco Nov 04 '15 at 14:42
@Marco - thanks for noting that. It appears that's the doctrine, but there are at least some reasons not to do that. See all the answers in: http://stackoverflow.com/questions/21612980/why-is-usr-bin-env-bash-superior-to-bin-bash, and this question has a good dissent on that: http://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my – Nov 04 '15 at 14:52
I fail to see good arguments for not using env except that it might not be located in /usr/bin/, but the some is true when hard coding the path for bash itself. – Marco Nov 04 '15 at 15:05
If you can, you should never edit IFS. It's evil and can cause unatended side effect – alexises Nov 04 '15 at 16:25
@marco There's only one reason to use #!/usr/bin/env - it saves you from the dreadful chore of editing the first line of some scripts. There are many good reasons not to use #!/usr/bin/env - makes it impossible to pass options to your interpreter, changes the process name so it can't be found with ps|grep or ps -C, it runs whichever interpreter it finds first in $PATH (which is as much a curse as a blessing), and more. – cas Nov 05 '15 at 00:12

score 2 · Accepted Answer · answered Nov 04 '15 at 14:47

2

Here is a variation of your script, as a "one-liner":

gunzip -c /var/log/cisco/cisco.log-$(date +%Y%m%d).gz | \
grep "Virtual device ath0 asks to queue packet" | \
awk '{print $4}' | sort | uniq | wc -l

It avoids creating any intermediate temporary files, which may be faster. If you had any need or use for those intermediate files, though, the one-liner is a worse direction.

One of the things I learned from reading enough well-written shell scripts was that the "grep | awk" sequence can often be combined. For your script, notice the grep command has been replaced:

gunzip -c /var/log/cisco/cisco.log-$(date +%Y%m%d).gz | \
awk '/Virtual device ath0 asks to queue packet/ { print $4 }' | \
sort | uniq | wc -l

answered Nov 04 '15 at 14:47

Jeff Schaller

67,283
35
116
255

1

The backslashes aren't necessary; a command can't end with a pipe, so the parser knows that more input is expected. – chepner Nov 04 '15 at 15:02
Good point @chepner; a one-liner might be entered interactively. If the OP wants the commands in a script, though, then the backslashes can help with readability. – Jeff Schaller Nov 04 '15 at 15:05
Fair enough; I find them "noisy" and less readable (preferring to indent the following lines slightly), but that's a certainly matter of opinion. – chepner Nov 04 '15 at 15:17
Agreed! I initially indented them and had one command per line, then changed my mind for no good reason :) Certainly reformat any code to help yourself read it. – Jeff Schaller Nov 04 '15 at 15:20
1

If all you need is a count of unique values you can modify the one-liner and have awk return the count gunzip -c /var/log/cisco/cisco.log-$(date +%Y%m%d).gz | awk '/Virtual device ath0 asks to queue packet/ !($4 in val) {val[$4]++; sum++}END{print sum}' – rcjohnson Nov 04 '15 at 19:33

BASH script best practice

3 Answers3