124

Is there an easy way to substitute/evaluate environment variables in a file?

Like let's say I have a file config.xml that contains:

<property>
    <name>instanceId</name>
    <value>$INSTANCE_ID</value>
</property>
<property>
    <name>rootPath</name>
    <value>/services/$SERVICE_NAME</value>
</property>

...etc.

I want to replace $INSTANCE_ID in the file with the value of the INSTANCE_ID environment variable, $SERVICE_NAME with the value of the SERVICE_NAME env var.

I won't know a priori which environment vars are needed (or rather, I don't want to have to update the script if someone adds a new environment variable to the config file).

AdminBee
  • 22,803
  • 1
    When you do something with the file (cat, echo, source, …), the variable will be substituted by its value – Costas Jul 09 '16 at 14:30
  • 1
    Is the contents of this xml file up to you? If so, parameterized xslt offers another way to inject values and (unlike envsubst and its ilk) guarantees well formed xml as a result. – kojiro Jul 09 '16 at 17:28

13 Answers

169

You could use envsubst (part of GNU gettext):

envsubst < infile

will replace the environment variables in your file with their corresponding value. The variable names must consist solely of alphanumeric or underscore ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored.

Some alternatives to gettext envsubst that support ${VAR:-default} and extra features:
rust alternative
go alternative
node.js alternative


To replace only certain environment variables, see this question.
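
As a minimal sketch of limiting the substitution to certain variables: GNU envsubst accepts an optional SHELL-FORMAT argument, and only the variables referenced in that string are replaced. The function name render_selected and the sample values below are made up for illustration.

```shell
#!/bin/sh
# Sketch: substitute only INSTANCE_ID and SERVICE_NAME and leave every
# other $NAME in the input untouched. Requires GNU gettext's envsubst.
render_selected() {
    # The single quotes matter: envsubst must receive the literal text
    # '$INSTANCE_ID $SERVICE_NAME', not the expanded values.
    envsubst '$INSTANCE_ID $SERVICE_NAME'
}
```

With `export INSTANCE_ID=i-42 SERVICE_NAME=billing`, running `render_selected < config.xml` fills in those two placeholders, while a reference such as $HOME stays in the output as written.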

don_crissti
  • 82,805
  • 4
    ...except it's not installed by default in my docker image :'-( – Robert Fraser Jul 09 '16 at 14:57
  • 8
    That's good. Docker images should be lightweight and tailor made. Of course you could always add envsubst to it, though. – kojiro Jul 09 '16 at 17:26
  • Or go full container on it and put envsubst in a container all by itself. It's a common pattern and a way of life if you make use of an OS like Atomic Host, CoreOS, or RancherOS. Atomic specifically won't even let root mess with the file system or what is installed; you have to use a container. – Kuberchaun Aug 22 '16 at 01:26
  • 2
    Note that it won't replace "all" environment variables, only those whose name matches ^[[:alpha:]_][[:alnum:]_]*$ in the POSIX locale. – Stéphane Chazelas Jul 27 '17 at 12:26
  • Seems to be very succinct, however not necessarily correct with all substitution values. It does not seem to respect XML special characters. – EFraim Dec 19 '17 at 10:00
  • 1
    @EFraim The answer and the comments all say so. Only alphanumeric characters. – oligofren Sep 25 '20 at 12:00
54

This is not very nice but it works

( echo "cat <<EOF" ; cat config.xml ; echo EOF ) | sh

If it was in a shell script it would look like:

#! /bin/sh
cat <<EOF
<property>
    <name>instanceId</name>
    <value>$INSTANCE_ID</value>
</property>
EOF

Edit, second proposal:

eval "echo \"$(cat config.xml)\""

Edit, not strictly related to question, but in case of variables read from file:

(. .env && eval "echo \"$(cat config.xml)\"")
Paolo
  • 17,355
hschou
  • 2,910
  • 13
  • 15
  • The problem with this is that if the file contains a line with EOF, the remaining lines will be executed as commands by the shell. We could change the separator to something longer or more complicated, but there's still a theoretical possibility of colliding. And someone could deliberately make a file with the separator to execute commands. – ilkkachu Jul 09 '16 at 14:32
  • 1
    OK, try this: eval "echo \"$(cat config.xml)\"" – hschou Jul 09 '16 at 14:34
  • 3
    Try putting something like "; ls ;" inside the file and do that eval command again :) This is pretty much the same problem as with SQL injection attacks. You have to be really careful when mixing data with code (and that's what shell commands are), unless you're really, really sure nobody is trying to do anything to mess up your day. – ilkkachu Jul 09 '16 at 14:43
  • No. "; ls ;" won't do any harm. – hschou Jul 09 '16 at 16:32
  • 3
    @hschou I think ilkkachu meant \"; ls ;"`` (the comment formatting ate the backticks). But actually that should be just \ls`` here. The point is that the content of the file leads to arbitrary code execution and there's nothing you can do about it. – Gilles 'SO- stop being evil' Jul 09 '16 at 22:44
  • I did actually mean just embedding the command in double-quotes and semicolons to specifically work around the quoting in the eval "echo.." command. (Tried it, works for me) But I'll admit command substitution is even more evil and simple! – ilkkachu Jul 09 '16 at 22:59
  • 1
    If at all possible, everyone should go for envsubst. If, like me, you work on OSF1, AIX or something worse, you can go with the eval "echo.. approach, even though a $(ls) in config.xml could do harm. As for the harm: do you trust running ./configure as root? – hschou Jul 10 '16 at 20:17
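
The risk discussed in these comments can be seen with a harmless payload. The sketch below wraps the here-document approach from this answer in a function (the name render_with_heredoc is made up for illustration) so that it can be tried against a throwaway file.

```shell
#!/bin/sh
# Sketch: the here-document trick, wrapped in a function.
# WARNING: the template is interpreted by the shell, so any $(...) or
# `...` inside it is executed. Only use this on files you fully trust.
render_with_heredoc() {
    ( echo "cat <<EOF"; cat "$1"; echo EOF ) | sh
}
```

If the file contains the line `<value>$(echo injected)</value>`, the output is `<value>injected</value>`: the command inside the file was run. Replace `echo injected` with anything destructive and the problem is clear.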
15

If you happen to have Perl (but not gettext and envsubst) you can do the simple replacement with a short script:

$ export INSTANCE_ID=foo; export SERVICE_NAME=bar;
$ perl -pe 's/\$([_A-Z]+)/$ENV{$1}/g'  < config.xml
<property>
    <name>instanceId</name>
    <value>foo</value>
</property>
<property>
    <name>rootPath</name>
    <value>/services/bar</value>
</property>

I assumed the variable names will only have uppercase letters and underscores, but the first pattern should be easy to alter as needed. $ENV{...} references the environment Perl sees.

If you want to support the ${...} syntax or throw an error on unset variables, you'll need some more work. A close equivalent of gettext's envsubst would be:

perl -pe 's/\$(\{)?([a-zA-Z_]\w*)(?(1)\})/$ENV{$2}/g'

Though I feel that feeding variables like that via the process environment seems a bit iffy in general: you can't use arbitrary variables in the files (since they may have special meanings), and some of the values could possibly have at least semi-sensitive data in them.
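
As a rough sketch of the "throw an error on unset variables" case mentioned above: the replacement can be turned into a Perl expression with the /e flag, which dies when a referenced variable is missing from the environment. The function name render_strict is made up for illustration, and this is only one possible way to do it.

```shell
#!/bin/sh
# Sketch: like the envsubst-style one-liner, but abort on unset variables.
# Handles both $NAME and ${NAME}.
render_strict() {
    # /e evaluates the replacement as Perl code; die makes perl exit with
    # a non-zero status. Lines before the failing one are already printed.
    perl -pe 's/\$(\{)?([a-zA-Z_]\w*)(?(1)\})/exists $ENV{$2} ? $ENV{$2} : die "unset variable: $2\n"/ge'
}
```

Usage would be `render_strict < config.xml > config.out || echo "substitution failed"`. Because earlier lines may already have been written when the error occurs, the output file should be treated as invalid on failure.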

ilkkachu
  • 138,973
2

The command below replaces all occurrences of environment variables of the form $VAR in a file.

compgen -e | xargs -I @ sh -c 'printf "s|\$%q\>|%q|g\n" "@" "$@"' | sed -f /dev/stdin input.file > output.file

Note: sed syntax is slightly different on macOS: [[:>:]] to be used instead of \>.

Here's how it works:

  • compgen -e lists environment variable names without values, for example:
    HOME
    LANG
    PATH
    PWD
    ...
    
  • xargs redirects the output to the shell that renders the sed substitute commands via printf populating the environment variable values along the way:
    s|$HOME\>|/Users/sshymko|g
    s|$LANG\>|en_US.UTF-8|g
    s|$PATH\>|/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin|g
    s|$PWD\>|/Users/sshymko/Documents|g
    ...
    
  • sed modifies contents from a file using the substitute commands in the standard input

The substitute command format is s|\$%q\>|%q|g\n, where:

  • | — delimiter guaranteed to be escaped in environment variable name/value
  • %q — string placeholder of printf escaping special shell characters
  • \> — end of a word boundary to avoid a variable name prefix match
  • g — global replace, i.e. replace all occurrences
  • \n — line break separating sed substitute commands

Additionally:

  • @ — argument placeholder of xargs holding the environment variable name
  • $@ — dereferenced value of the environment variable
  • If anyone knows of a way to dereference environment variable name to its value without invoking the shell sh -c, that would be greatly appreciated. Experimented with printf without any success. – Sergii Shymko May 12 '20 at 05:52
  • (1) Rather than starting with just the names of the environment variables, and then trying to get the corresponding values, it might be easier to run env and get the names and the values together.  Warning: There are probably security implications to this.  (2) If I have HOME=/home/scott, then your script will turn $HOMEDRIVE into /home/scottDRIVE.  It may be possible to work around this by handling the variables in reverse order; e.g., ZEBRA first, USERNAME before USER, MONKEY in the middle, HOMEDRIVE before HOME, and AARDVARK at the end. … (Cont’d) – Scott - Слава Україні May 12 '20 at 06:50
  • (Cont’d) … (3) Your answer will fail to handle things like ${HOME}.  (4) Your answer will totally blow up if any environment variables contain = in their value.  (OK, I see that you acknowledged that as a constraint.)  (5) You can change -f /dev/stdin to -f -.  +1 for an innovative approach and a good explanation. – Scott - Слава Україні May 12 '20 at 06:50
  • @Scott Excellent review! Agreed on all points. Updated the answer to address most of the concerns. Thanks! (1) Starting from var names avoids parsing challenges, such as multiple =, line breaks, etc (2) Fixed by reverse sorting (3) Wasn't intending to handle ${VAR} format for simplicity (4) Implemented escaping of special characters; changed delimiter to | (5) Unfortunately, it doesn't work on macOS – Sergii Shymko May 12 '20 at 18:33
  • Good job. I believe that the \> fixes the prefix problem (HOME vs. HOMEDRIVE), so you don’t need the sort also.  Here’s another edge case: your answer will change \$LANG to \en_US.UTF-8, but the question doesn’t specify how such input should be handled. – Scott - Слава Україні May 14 '20 at 03:17
2

Continuing with the Perl and PHP examples, here is a Python oneliner:

python -c 'import os,sys; sys.stdout.write(os.path.expandvars(sys.stdin.read()))' < in.file > out.file

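Because not every environment has envsubst installed, unfortunately. Note that os.path.expandvars also handles ${VAR} and leaves references to unset variables unchanged.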

2

With zsh:

zmodload zsh/mapfile
set -o extendedglob
mapfile[config.xml]=${mapfile[config.xml]//(#m)\$[[:IDENT:]]##/${(e)MATCH}}

To also cover ${var}:

mapfile[config.xml]=${mapfile[config.xml]//(#m)\$([[:IDENT:]]##|{[[:IDENT:]]##})/${(e)MATCH}}

To cover all expansions, including `cmd`, $(( 1+1 )), $(cmd), $var[x], ${var#???}...:

mapfile[config.xml]=${(e)mapfile[config.xml]}
1

May I suggest my own script for this?

https://github.com/rydnr/set-square/blob/master/.templates/common-files/process-file.sh

#!/bin/bash /usr/local/bin/dry-wit
# Copyright 2016-today Automated Computing Machinery S.L.
# Distributed under the terms of the GNU General Public License v3

function usage() {
cat <<EOF
$SCRIPT_NAME -o|--output output input
$SCRIPT_NAME [-h|--help]
(c) 2016-today Automated Computing Machinery S.L.
    Distributed under the terms of the GNU General Public License v3

Processes a file, replacing any placeholders with the contents of the
environment variables, and stores the result in the specified output file.

Where:
    * input: the input file.
    * output: the output file.
Common flags:
    * -h | --help: Display this message.
    * -v: Increase the verbosity.
    * -vv: Increase the verbosity further.
    * -q | --quiet: Be silent.
EOF
}

# Requirements
function checkRequirements() {
  checkReq envsubst ENVSUBST_NOT_INSTALLED;
}

# Error messages
function defineErrors() {
  export INVALID_OPTION="Unrecognized option";
  export ENVSUBST_NOT_INSTALLED="envsubst is not installed";
  export NO_INPUT_FILE_SPECIFIED="The input file is mandatory";
  export NO_OUTPUT_FILE_SPECIFIED="The output file is mandatory";

  ERROR_MESSAGES=(\
    INVALID_OPTION \
    ENVSUBST_NOT_INSTALLED \
    NO_INPUT_FILE_SPECIFIED \
    NO_OUTPUT_FILE_SPECIFIED \
  );

  export ERROR_MESSAGES;
}

## Parses the input
## dry-wit hook
function parseInput() {

  local _flags=$(extractFlags $@);
  local _flagCount;
  local _currentCount;

  # Flags
  for _flag in ${_flags}; do
    _flagCount=$((_flagCount+1));
    case ${_flag} in
      -h | --help | -v | -vv | -q)
         shift;
         ;;
      -o | --output)
         shift;
         OUTPUT_FILE="${1}";
         shift;
         ;;
    esac
  done

  # Parameters
  if [[ -z ${INPUT_FILE} ]]; then
    INPUT_FILE="$1";
    shift;
  fi
}

## Checking input
## dry-wit hook
function checkInput() {

  local _flags=$(extractFlags $@);
  local _flagCount;
  local _currentCount;
  logDebug -n "Checking input";

  # Flags
  for _flag in ${_flags}; do
    _flagCount=$((_flagCount+1));
    case ${_flag} in
      -h | --help | -v | -vv | -q | --quiet)
         ;;
      -o | --output)
         ;;
      *) logDebugResult FAILURE "fail";
         exitWithErrorCode INVALID_OPTION ${_flag};
         ;;
    esac
  done

  if [[ -z ${INPUT_FILE} ]]; then
    logDebugResult FAILURE "fail";
    exitWithErrorCode NO_INPUT_FILE_SPECIFIED;
  fi

  if [[ -z ${OUTPUT_FILE} ]]; then
      logDebugResult FAILURE "fail";
      exitWithErrorCode NO_OUTPUT_FILE_SPECIFIED;
  fi
}

## Replaces any placeholders in given file.
## -> 1: The file to process.
## -> 2: The output file.
## <- 0 if the file is processed, 1 otherwise.
## <- RESULT: the path of the processed file.
function replace_placeholders() {
  local _file="${1}";
  local _output="${2}";
  local _rescode;
  local _env="$(IFS=" \t" env | awk -F'=' '{printf("%s=\"%s\" ", $1, $2);}')";
  local _envsubstDecl=$(echo -n "'"; IFS=" \t" env | cut -d'=' -f 1 | awk '{printf("${%s} ", $0);}'; echo -n "'";);

  echo "${_env} envsubst ${_envsubstDecl} < ${_file} > ${_output}" | sh;
  _rescode=$?;
  export RESULT="${_output}";
  return ${_rescode};
}

## Main logic
## dry-wit hook
function main() {
  replace_placeholders "${INPUT_FILE}" "${OUTPUT_FILE}";
}
# vim: syntax=sh ts=2 sw=2 sts=4 sr noet
Zanna
  • 3,571
1

Use cmake's configure_file function.

It copies an <input> file to an <output> file and substitutes variable values referenced as @VAR@ or ${VAR} in the input file content. Each variable reference will be replaced with the current value of the variable, or the empty string if the variable is not defined.

AdminBee
  • 22,803
tml
  • 11
1

To add to the discussion about Docker images above (unfortunately couldn't comment there).

The safest methods are envsubst and perl. In Docker, the problem is that perl is part of Debian-based images (including the slim variants) but not of Alpine ones. Adding envsubst to an image directly increases its size by 25 MB or 65 MB in Alpine- or Debian-based images respectively.

As a result, I ended up selecting between the two in docker-entrypoint directly. A caveat: in Alpine-based images, it will download envsubst directly from GitHub. You might want to keep a copy somewhere local to speed up the process, or bake it in during the build stage (just 2.5 MB that way).

        # do not add `envsubst` to the image! It saves 25/65M in Docker Alpine/Debian-based images correspondingly
        # `perl` is part of Debian but not Alpine-based images
        if [[ -n $(command -v perl) ]] ; then
            perl -pe 's/\$(\w+)/$ENV{$1}/g' <"${file}" >"/tmp/${filename}"
        else
            if [[ ! -f /tmp/envsubst ]] ; then
                wget -O /tmp/envsubst "https://github.com/a8m/envsubst/releases/latest/download/envsubst-$(uname -s)-$(uname -m)"
                chmod 500 /tmp/envsubst
            fi
            /tmp/envsubst -i "${file}" -o "/tmp/${filename}"
        fi
...
[[ -f /tmp/envsubst ]] && rm -f /tmp/envsubst
sfuerte
  • 111
  • The 25/65 delta is negligible, unless you are running these at a scale that would preclude performing these conditional checks, in shell, in the first place. For everyone else, apk add gettext is a far more durable approach than the above snippet, but I feel you on wanting that fine-grained control over the resulting artifact. I used to do the same, but it was beaten out of me by literally every lead/manager I've had :) Cheers – christian elsee Nov 03 '21 at 13:20
0

Similarly to the Perl answer, environment variable substitution can be delegated to the PHP CLI. Dependency on PHP may or may not be acceptable depending on the tech stack in use.

php -r 'echo preg_replace_callback("/\\$([a-z0-9_]+)/i", function ($matches) { return getenv($matches[1]); }, stream_get_contents(STDIN));' < input.file > output.file

You can go further and put it in a reusable script, for example, envsubst:

#!/usr/bin/env php
<?php

echo preg_replace_callback(
    '/\$(?<name>[a-z0-9_]+)/i',
    function ($matches) {
        return getenv($matches['name']);
    },
    file_get_contents('php://stdin')
);

The usage would be:

envsubst < input.file > output.file
0

This builds upon @don_crissti's answer, which is definitely correct but doesn't address a last-mile concern in the question, namely:

I want to replace $INSTANCE_ID in the file with the value of the INSTANCE_ID

Interpolating a file "in place" with envsubst is trickier than it would appear: the shell opens and truncates the redirection target before envsubst has read anything, so the template is already empty by the time it is read.

Because envsubst reads from STDIN and writes to STDOUT, we can't simply read, interpolate and redirect stdout to the same template file:

/ # </tmp/tmpl cat
hello $PATH
/ # </tmp/tmpl envsubst
hello /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
/ # </tmp/tmpl envsubst >/tmp/tmpl 
/ # </tmp/tmpl cat
/ # stat /tmp/tmpl 
  File: /tmp/tmpl
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file

There are a lot of ways to approach this, but they all boil down to: write the interpolated result to a buffer, or write to a different file. Because your example looks config and deploy related, I prefer a make-esque build and install workflow, where we write the interpolated results to a dist or build directory, and then work against that.

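# copy configuration/manifest/etc to the dist directory, then interpolate each template into it; not tested at all
# (assumes ./dist does not exist yet; inside a Makefile recipe, write $$1 instead of $1)
$ cp -rf path/to/configs ./dist
$ (cd path/to/configs && find -L . -type f -print0) \
    | xargs -0 -n1 -- sh -c '<"path/to/configs/$1" envsubst >"dist/$1"' _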
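
For a single file, the "write to a different file" option can be sketched as: render into a temporary file next to the template, then rename it over the original. The function name render_in_place is made up for illustration. Note that the result carries the temporary file's permissions (mktemp creates it with mode 0600).

```shell
#!/bin/sh
# Sketch: "in-place" envsubst via a temporary file in the same directory.
# The template is only replaced if envsubst succeeds.
render_in_place() {
    file=$1
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if envsubst < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"      # rename over the original
    else
        rm -f -- "$tmp"           # leave the original untouched on failure
        return 1
    fi
}
```

Usage would be `render_in_place /tmp/tmpl`. If the original permissions or inode have to be preserved, copying the temporary file's contents back with `cat "$tmp" > "$file"` is an alternative, at the cost of the update no longer being a single rename.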

Cheers from Linkoping ;)

0

Another way without envsubst could be:

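printenv | sed 's|=.*||' | while read -r envvarname; do sed -i "s|\${$envvarname}|${!envvarname}|g" /file/to/replace.txt; done

This bash loop replaces every occurrence of the string ${MYVAR} with the content of MYVAR in the file /file/to/replace.txt. It assumes that the values contain no |, & or \ characters and no newlines, since those are special to sed.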

KeKru
  • 101
0

When an environment variable does not exist, envsubst writes an empty string to the output, which may be a problem. The script below checks that all referenced variables exist before running envsubst.

FILE_TEMP="file_with_env_declarations.txt"

# Needs bash because of the < <(...) process substitution.
missing_vars=false
while read -r name; do
    # printenv exits non-zero when the variable is not in the environment
    if ! printenv "$name" > /dev/null; then
        echo "ERROR: The variable '$name' does not exist!"
        missing_vars=true
    fi
done < <(grep -Eo '\$\{[A-Za-z0-9_]+\}|\$[A-Za-z0-9_]+' "${FILE_TEMP}" | tr -d '${}')
if [ "${missing_vars}" = true ]; then
    echo "ERROR: There are variables in '${FILE_TEMP}' missing in the environment!"
    exit 1
fi

envsubst < "${FILE_TEMP}"