
I have a script with a function which is sourced in other scripts. I am trying to go line by line but the sed regex is too complicated.

    #!/usr/bin/env bash

    # This function will update the value associated with a key,
    # remove a comment from the beginning of the line,
    # or append the key value pair to the end of the file if the key is not found

    # To use this function in a script
    # source this script:
    # . lineinfile

    # To invoke the function:
    # lineinfile "key=value" "filename"
    # OR
    # lineinfile "key value" "filename"

    lineinfile() {
      [ -s $2 ] || echo "${1}" >> ${2}
      if [[ "$1" == *"="* ]]; then
        sed -i -e "/^#\?.*\(${1%%=*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
      elif [[ "$1" == *" "* ]]; then
        sed -i -e "/^#\?.*\(${1%% *}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
      elif [[ "$1" == *$'\t\t'* ]]; then
        sed -i -e "/^#\?.*\(${1%%$'\t\t'*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
      fi
    }

As I understand it, the first line of the function, [ -s $2 ] || echo "${1}" >> ${2}, checks whether the second positional argument is a file that exists and has a non-zero size, and then appends the contents of $1 to the end of the $2 file. Why is || used here?

I am really not sure what the if-elif blocks are testing for. What are *"="*, *" "* and *$'\t\t'* trying to match in the if conditions? Additionally, I have no idea what the sed commands are doing. The regex is complicated. Can anyone break down the sed commands for me?

Cruise5
  • The double quotes ask the shell to preprocess the contents, i.e. the sed commands. ${1%%=*} is a shell parameter expansion that strips everything from the first equal sign onward in argument 1's value, so if $1 is key=value, ${1%%=*} is key. Try running this with set -x in effect to see the actual commands being constructed/issued. – jthill Jun 04 '22 at 17:53
  • What's the use of || in the first line? Also what's the use of if block after the first line if the echo "${1}" >> ${2} executes before if block. – Cruise5 Jun 05 '22 at 02:20
  • Spend some quality time with man bash and man sed and info sed. You're asking very, very basic questions about fundamental syntax. Asking and answering such questions one man page sentence at a time is a wasteful use of everyone's time. – jthill Jun 05 '22 at 02:34
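To make the parameter expansion from the first comment concrete, here is a minimal sketch. The helper name key_of is hypothetical and not part of the original script; it only mirrors the three separators the function tests for, in the same order.

```shell
#!/usr/bin/env bash
# key_of is a hypothetical helper, not part of the original script.
# It prints the "key" part of key=value, "key value" or key<TAB><TAB>value.
key_of() {
  local arg=$1 tabs
  tabs=$(printf '\t\t')                             # two literal tab characters
  case $arg in
    *=*)        printf '%s\n' "${arg%%=*}" ;;        # drop the longest suffix that starts at the first "="
    *' '*)      printf '%s\n' "${arg%% *}" ;;        # same idea, first space
    *"$tabs"*)  printf '%s\n' "${arg%%"$tabs"*}" ;;  # same idea, first double tab
    *)          printf '%s\n' "$arg" ;;              # no separator: print the argument unchanged
  esac
}

key_of "key=value"
key_of "Port 22"
```

Because %% removes the longest matching suffix, an argument such as a=b=c yields a, not a=b.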

1 Answer
  1. The || runs the following command if the exit code of the previous command was non-zero (false/error).

    [ -s $2 ] || echo "${1}" >> ${2}
    

    is equivalent to:

    if ! [ -s $2 ] ; then echo "${1}" >> ${2} ; fi
    

    Either of these will append the first arg ($1) to the file ($2) if the file either doesn't exist or is empty. BTW, see the next point, 2, below about quoting.

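    BTW, the function could (and should) return at this point. There's no need to run sed on the file as it now contains the desired value. For example (using printf instead of echo - see "Why is printf better than echo?" - and with braces grouping the printf and the return, because || and && have equal precedence and associate left to right, so without the braces the return would also run whenever the [ -s ] test succeeded):

    [ -s "$2" ] || { printf '%s\n' "$1" >> "$2" && return ; }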
    

    or, better:

    if ! [ -s "$2" ] ; then printf '%s\n' "$1" >> "$2" ; return ; fi
    

    With the first form, the return only executes if the previous command (printf) succeeded. The second form always returns, whether printf succeeded or not. There's no good reason to make the return dependent on the printf succeeding (it's just a common idiom for "chaining" commands in this kind of short-hand if/then/fi construct). Most of the time, the printf will succeed but sometimes (e.g. permissions or disk full) it will fail. If it fails, the function should return anyway - the sed scripts will also fail so there's no point running them. BTW, return without an argument will return the exit code of the last command to be executed, so the caller will be able to detect success or failure.

  2. The author doesn't seem to understand what quoting is for in shell, or how it works, or that curly-braces, e.g. ${var}, are NOT a substitute for quoting, e.g. "$var". See "$VAR vs ${VAR} and to quote or not to quote" and "Why does my shell script choke on whitespace or other special characters?".

    The author consistently fails to quote $2 when it should be quoted (filenames can contain spaces, tabs, newlines, and shell metacharacters that can break a shell script if used unquoted).

  3. The three if/elif tests check whether the first argument ($1) contains an = symbol, a space, or two tabs. It runs one of three slightly different versions of a sed script depending on which one it finds.

    The sed scripts all check if their variant of "key" followed by either =, a space, or two tabs is in the file, optionally commented out with a #. If a match is found, it replaces it with the value of $1 and runs a loop to just read and output the rest of the file. I think the purpose here is to replace only the first occurrence of key=value.

    If the match wasn't found (and thus the loop and quit wasn't executed), it appends $1 to the end of the file.

  4. The author seems to be overthinking this (or perhaps underthinking it). This could be done with just one sed script if $1 were first split into key and value variables. i.e. first extract and "normalise" the data from $1 into a consistent form, then use it in just one sed script.

    Or just rewrite the whole thing in perl. It's a good choice when you want to do something that requires the strengths of both shell and sed (and it has a far better and more capable regex engine than most versions of sed). e.g. replace the if/elif/elif/fi part of the function with something like:

     perl -0777 -i -pe '
          # split on the first separator only; no capture group, so $val is the value
          BEGIN { $r = shift; ($key,$val) = split /=| |\t\t/, $r, 2 };
          s/\z/$r\n/ unless (s/^#?.*\b$key\b.*$/$r/m)' key=value filename
    

    This perl version works with all three variants (=, space, two tabs - the latter two need to be quoted). It slurps the entire file in at once (-0777 option), and tries to do a multi-line search and replace operation (/m regex modifier). If that operation fails, it appends the first arg (plus a newline) to the end of the file (\z). It also fixes a bug in the original, which fails to distinguish between, e.g., foo=123 and foobar=123. The \b word-boundary markers are used to do this. In sed, you'd use \< and \> to surround the key pattern.

    BTW, the X unless Y construct is just perl syntactic sugar for if not Y, then do X. It could have been written as if (! s/^#?.*\b$key\b.*$/$r/m) {s/\z/$r\n/} and would still work exactly the same.

  5. The function name lineinfile is very generic, but what it does is very specific. Worse, the name doesn't match or even hint at what the function actually does. This is generally considered to be bad practice.
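The "normalise first, then one sed script" idea from point 4, combined with the early return from point 1, might look roughly like the sketch below. It is only a sketch under stated assumptions: set_key is a made-up name, GNU sed is assumed (for -i, \?, \< and \>, and the one-line $a form), and the key and line are assumed to contain no regex metacharacters, @, & or backslashes.

```shell
#!/usr/bin/env bash
# set_key is a hypothetical name for illustration; assumes GNU sed.
set_key() {
  local line=$1 file=$2 key tabs
  tabs=$(printf '\t\t')

  # Missing or empty file: write the line and stop. A bare return passes
  # on printf's exit status, so the caller can still detect a failure.
  if ! [ -s "$file" ]; then
    printf '%s\n' "$line" >> "$file"
    return
  fi

  # Normalise: the key is everything before the first separator.
  case $line in
    *=*)        key=${line%%=*} ;;
    *' '*)      key=${line%% *} ;;
    *"$tabs"*)  key=${line%%"$tabs"*} ;;
    *)          return 2 ;;              # no recognised separator
  esac

  # One sed script for all three variants. \< and \> keep "foo" from
  # matching "foobar". The a/n/ba loop copies the rest of the file after the
  # first replacement, so the final "$a" only runs when nothing matched.
  sed -i -e "/^#\?.*\<${key}\>.*/{
s@@${line}@
:a
n
ba
}" -e "\$a${line}" "$file"
}

tmp=$(mktemp)
printf '%s\n' '#foo=1' 'foobar=2' > "$tmp"
set_key 'foo=9' "$tmp"   # uncomments and updates foo, leaves foobar alone
set_key 'baz=3' "$tmp"   # key not present, so the line is appended
cat "$tmp"
rm -f "$tmp"
```

Note that the pattern still matches the key anywhere on a line (for example inside another key's value), so a stricter pattern anchored at the start of the line may be preferable for real config files.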

cas
  • Thanks for the detailed breakdown. What does /{s@@${1}@;:a;n;ba;q} mean in sed -i -e "/^#\?.*\(${1%%=*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2} ? I couldn't find what a;n;ba;q means anywhere in sed cheatsheets. Is there any sed resource where I can look these up? – Cruise5 Jun 18 '22 at 01:54
  • man sed (or info sed if you are running GNU sed and have the info docs for sed installed - otherwise you can find the manual for GNU sed at https://www.gnu.org/software/sed/manual/sed.html). s@@${1}@ replaces the previous match (/^#\?..../) with the value of $1, the bash script's first arg - it's using @ as the delimiter instead of /. :a sets a label "a"; n prints the pattern space and reads the next line into it; ba branches (jumps aka "goto") to "a"; q quits sed (which is redundant at the end of a script). All together, they loop reading and printing the remaining lines in the file. – cas Jun 18 '22 at 02:09
  • personally, I don't think it's worth the effort to learn more than the most basic of sed operations - IMO anything requiring looping or branching or manipulating hold space is better done with perl or awk, and both of those are far easier to learn and more generally useful. Well....it's worth knowing how it works just so you can understand other people's complicated sed code but you're better off writing your own scripts in awk or perl. – cas Jun 18 '22 at 02:27
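For anyone who wants to watch the :a;n;ba loop from the comments in isolation, here is a minimal sketch. The function name first_only and the sample input are made up for illustration, and GNU sed is assumed (other seds may not print the last line when n hits the end of the input).

```shell
#!/usr/bin/env bash
# first_only is a hypothetical demo filter; assumes GNU sed.
# It replaces only the first line matching "foo"; the loop then prints and
# reads every remaining line without testing it against /foo/ again.
first_only() {
  sed -e '/foo/{
s//FIRST/
:a
n
ba
}'
}

printf 'foo\nbar\nfoo\n' | first_only
```

With GNU sed this prints FIRST, bar and foo on three lines: the second foo is left alone because the script never returns to the /foo/ address once the loop has started.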