
My bash script needs to read the variables from a text file that contains hundreds of variables, most of which conform to standard bash variable syntax (e.g., VAR1=VALUE1). However, a few lines in this file can be problematic, and I hope to find a simple solution to reading them into bash.

Here's what the file looks like:

#comment
VAR1=VALUE1
VAR_2=VALUE_2
...
VAR5=VALUE 5 (ASDF)
VAR6=VALUE 6 #a comment
VAR7=/path/with/slashes/and,commas
VAR8=2.1
VAR9=a name with spaces
VAR_10=true
...
VAR_N=VALUE_N

The rules about the file structure include:

  • one variable (with value) per line
  • the assignment (=) has no spaces around it
  • the variable name is in the first column (unless it is a comment line)
  • comments can follow the value (after #)
  • the values can include spaces, parens, slashes, commas, and other chars
  • the values can be floating point numbers (2.1), integers, true/false, or strings.
  • string values are not quoted, and they can be a thousand chars long or longer
  • the variable name contains only letters, numbers, and underscores (UPDATE: numbers are allowed too).

Most of the variables are of a type that would just allow me to source the file into my bash script. But those few problematic ones dictate a different solution. I'm not sure how to read them.

Sildoreth
MountainX
  • Can you just put the RHS into single quotes? – Ketan Feb 22 '14 at 02:05
  • @Ketan - how do you mean? I can't change the text file with the variables. – MountainX Feb 22 '14 at 02:15
  • You don't have to change the original text file, run it through sed and put the output in a new file and source it. – Ketan Feb 22 '14 at 02:17
  • @Ketan - that sounds like it should work... can you provide an answer with example code? The devil is often in the details, especially with bash. Thank you. – MountainX Feb 22 '14 at 02:20

2 Answers


While you can transform this file to be a shell snippet, it's tricky. You need to make sure that all shell special characters are properly quoted.

The easiest way to do that is to put single quotes around the value and replace single quotes inside the value by '\''. You can then put the result into a temporary file and source that file.

script=$(mktemp)
sed <"config-file" >"$script" \
  -e 's/[[:space:]]*#.*//' \
  -e '/^[A-Z_a-z][A-Z_a-z0-9]*=/ !d' \
  -e s/\'/\'\\\\\'\'/g \
  -e s/=/=\'/ -e s/\$/\'/
. "$script"
rm -f "$script"
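To make the quoting rule concrete, here is a minimal sketch of it applied to a single value. The helper name quote_value is hypothetical (it is not part of the answer's script), and it assumes a one-line value, which matches the file format in the question.

```shell
# quote_value is a hypothetical helper: it prints its argument wrapped in
# single quotes, with every embedded ' rewritten as '\''.
# Assumption: the value is a single line (sed works line by line).
quote_value() {
  printf '%s\n' "$1" | sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/"
}

value="it's a \$(test) with \"quotes\""
quoted=$(quote_value "$value")

# Evaluating the quoted form must give back the original text unchanged.
eval "copy=$quoted"
[ "$copy" = "$value" ] && echo "round trip OK"
```

If the round trip holds for values containing quotes, dollar signs, and backquotes, the generated file should be safe to source.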

I recommend doing the parsing directly in the shell instead. The complexity of the code is about the same, but there are two major benefits: you avoid the need for a temporary file, and you avoid the risk of getting the quoting wrong and executing part of a line as a shell snippet (something like dangerous='$(run me)'). You also get a better chance to validate the input and report errors.

while IFS= read -r line || [ -n "$line" ]; do
  line=${line%%#*}  # strip comment (if any)
  line=${line%"${line##*[![:space:]]}"}  # strip trailing whitespace
  case $line in
    *=*)
      var=${line%%=*}
      case $var in
        ''|[0-9]*|*[!A-Z_a-z0-9]*)
          echo "Warning: invalid variable name $var ignored" >&2
          continue;;
      esac
      if eval '[ -n "${'$var'+1}" ]'; then
        echo "Warning: variable $var already set, redefinition ignored" >&2
        continue
      fi
      line=${line#*=}
      eval $var='"$line"';;
  esac
done <"config-file"
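The final eval is safe because of how its argument is quoted: the shell expands $var before eval runs, but "$line" sits inside single quotes, so eval receives the text dangerous="$line" and expands the value once, as data. A small sketch with a harmless payload (the variable name and value here are made up for illustration):

```shell
var=dangerous
line='$(echo run me) `echo me too` "and quotes"'

# eval receives the text: dangerous="$line"
# The value is substituted as data and is never parsed as shell code.
eval $var='"$line"'

printf '%s\n' "$dangerous"
```

This prints the value exactly as it appeared in line, with the command substitutions left unexecuted.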
  • I'm using your recommended (2nd) method. Works great. Is there an easy way to check for a variable naming collision during this while-loop? With this approach it seems like that check might be appropriately incorporated. – MountainX Feb 23 '14 at 00:43
  • Regarding name collisions, I guess I could use eval IMPORTED_$var='"$line"' but I would rather first check for collisions and not change/prefix the variable names. – MountainX Feb 23 '14 at 00:53
  • @MountainX See my edit. – Gilles 'SO- stop being evil' Feb 23 '14 at 00:55
  • BTW, I made a mistake regarding the variable name convention. The standard letters, numbers and underscores are allowed. So the char class expression should be [!A-Z_a-z0-9]. – MountainX Feb 23 '14 at 00:57
  • @Gilles-re: name collisions. I guess the principle of your solution is that null +1 is still null. So if the result of that expression is not null, the variable name exists. Correct? – MountainX Feb 23 '14 at 01:03
  • @MountainX ${foo+1} expands to 1 if foo is set (to any value, including the empty string) and to the empty string if foo is unset. – Gilles 'SO- stop being evil' Feb 23 '14 at 01:10
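The behaviour of ${foo+1} described in the comment above can be checked with a short sketch. The helper name is_set is hypothetical; it assumes its argument has already been validated as a legal variable name, as the loop in the answer does before calling eval.

```shell
# is_set is a hypothetical helper. It takes a variable NAME, which must
# already be known to be a valid identifier because it is passed to eval.
is_set() {
  eval '[ -n "${'"$1"'+1}" ]'
}

unset foo
if is_set foo; then echo "foo is set"; else echo "foo is unset"; fi

foo=
if is_set foo; then echo "foo is set"; else echo "foo is unset"; fi

foo=bar
if is_set foo; then echo "foo is set"; else echo "foo is unset"; fi
```

The three checks report unset, set, and set: an empty value still counts as set, which is what makes this test suitable for detecting name collisions.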

Assuming your contents are in a file x.txt, you can use a sed expression:

sed -e 's/#.*//' -e 's/=.*$/="&"/g' -e 's/=//2' x.txt

The first expression removes everything from # to the end of the line, since that is a comment. The second expression wraps the = sign and everything after it in double quotes, which gives VAR1="=VALUE1". The third expression removes the extra (second) = sign.

The following puts the RHS in single quotes instead:

sed -e "s/#.*//" -e "s/=.*$/='&'/g" -e "s/=//2" x.txt
Ketan
  • The result is VAR1"=value1" – MountainX Feb 22 '14 at 02:33
  • Yep, realized that and made an update. See if it works now. – Ketan Feb 22 '14 at 02:34
  • This doesn't handle values containing " or '. – Gilles 'SO- stop being evil' Feb 22 '14 at 23:13
  • It also handles comments incorrectly (the leading quote is kept, the trailing quote is deleted together with the #). BTW, it's probably useful to delete the comment together with any preceding whitespace. – Uwe Feb 22 '14 at 23:27
  • @Gilles It does retain single and double quotes in the RHS. @Uwe I made changes to treat comments correctly. Everything after # is now removed first. – Ketan Feb 22 '14 at 23:39
  • @Ketan So you'd be happy with parsing a file containing do_not_run_this='$(rm -rf ~)'"$(rm -rf ~)"? (DO NOT TRY THIS!) – Gilles 'SO- stop being evil' Feb 22 '14 at 23:43
  • @Gilles After putting the line you mentioned in the file along with other lines OP mentioned and running it through the sed expression I get the following for your line: do_not_run_this=''$(rm -rf ~)'"$(rm -rf ~)"'. Not sure how this is not what OP might have wanted. – Ketan Feb 22 '14 at 23:49
  • This is definitely not what the OP wanted. The values are meant to be literal strings, not shell commands. If you try to parse the output as a shell script (which is what the question was about), this runs one of the rm -rf ~ commands (either the first or the second, depending on which of your snippets he chooses). (ping @MountainX, by the way: seriously, DO NOT RUN THIS! It's a gaping security hole.) – Gilles 'SO- stop being evil' Feb 22 '14 at 23:53
  • @Gilles - thanks for all the discussion. I am reading and learning... – MountainX Feb 23 '14 at 00:25
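The concern raised in these comments can be reproduced without any risk by swapping the destructive command for a harmless echo. This is only a sketch: the variable name demo and the payload are made up, and the sed command is the single-quote variant from the answer fed from a pipe instead of x.txt.

```shell
# Harmless stand-in for a hostile config line: the payload only echoes a word.
config_line="demo='\$(echo INJECTED)'"

# Wrap the value in single quotes WITHOUT escaping the quotes it contains.
naive=$(printf '%s\n' "$config_line" |
  sed -e "s/#.*//" -e "s/=.*$/='&'/g" -e "s/=//2")

printf 'generated line:       %s\n' "$naive"

# Evaluating the generated line is what sourcing the converted file would do.
eval "$naive"
printf 'value after sourcing: %s\n' "$demo"
```

The value ends up as INJECTED rather than the literal text '$(echo INJECTED)', which shows that the embedded command was executed. The quotes inside the value close the added quotes early, so the command substitution is left unquoted.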