Use a real parser like perl's Config::Properties module. I would do the whole script in perl, but if you have to use bash, you could do something like:
typeset -A props
while IFS= read -rd '' key && IFS= read -rd '' value; do
  props[$key]=$value
done < <(
  perl -MConfig::Properties -l0 -e '
    $p = Config::Properties->new();
    $p->load(STDIN);
    print for $p->properties' < file.properties
)
(also works with zsh).
Implementing a full parser in bash would be a lot of work and mean re-inventing the wheel. You can implement a good subset with a simple while read loop though, as the read builtin expects an input format that is very similar to those properties files:
typeset -A props
while IFS=$':= \t' read key value; do
  [[ $key = [#!]* ]] || [[ $key = "" ]] || props[$key]=$value
done < file.properties
(also works with ksh93 and zsh, the two other Bourne-like shells supporting associative arrays).
That handles:
- prop = value
- prop: value
- prop value
- comments at the start of the line (! and # with optional leading blanks)
- backslash escaping (as in foo\:\:bar=value for keys containing delimiters, or foo=\ bar, or the password_with\\backslash-and=equals in your sample)
- line continuation with backslash
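The cases listed above can be checked with a small self-contained sketch. The sample file content and the temporary file are made up for illustration; the loop itself is the one shown earlier, and bash (or ksh93/zsh) is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical sample file exercising the supported cases.
tmp=$(mktemp)
cat > "$tmp" << 'EOF'
# a comment
  ! another comment, with leading blanks
prop1 = value1
prop2: value2
prop3 value3
foo\:\:bar=value4
lead=\ bar
long=first \
second
EOF

# Same loop as above: read splits on the delimiters and processes backslashes.
typeset -A props
while IFS=$':= \t' read key value; do
  [[ $key = [#!]* ]] || [[ $key = "" ]] || props[$key]=$value
done < "$tmp"
rm -f -- "$tmp"

# Print in a fixed order (associative array order is unspecified).
for k in prop1 prop2 prop3 'foo::bar' lead long; do
  printf '%s -> [%s]\n' "$k" "${props[$k]}"
done
```

The brackets in the output make the preserved leading space of the lead property visible.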
However, if we check against the specification:
- That doesn't handle \n, \r, \uXXXX... sequences.
- LF is the only recognised line delimiter (not CR nor CRLF).
- FF is not recognised as whitespace (we can't just add it to $IFS as, depending on the shell and version, \f will not necessarily be recognised as an IFS-whitespace character¹).
- For an input like foo: bar =, that stores bar in ${props[foo]} instead of bar = (foo: bar:baz: is OK though). That's only a problem when the value of the property contains one (unescaped) delimiter (: optionally surrounded by SPC/TAB characters, = optionally surrounded by SPC/TAB characters, or a sequence of one or more SPC/TAB characters) and it is at the end.
- It treats as comments lines that start with \! or \#. Only a problem for properties whose name starts with ! or #.
- In
prop=1\
 2\
 3
we get 1 2 3 instead of 123: the leading spaces are not ignored in the continuation lines as they should be.
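Several of those limitations can be reproduced with the sketch below. The input lines and key names (ok, nl, bang) are hypothetical and only there to trigger each case; the loop is the same as above, and bash is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical sample input, one entry per limitation being illustrated.
tmp=$(mktemp)
cat > "$tmp" << 'EOF'
foo: bar =
ok: bar:baz:
\!bang=value
nl=a\nb
prop=1\
 2\
 3
EOF

typeset -A props
while IFS=$':= \t' read key value; do
  [[ $key = [#!]* ]] || [[ $key = "" ]] || props[$key]=$value
done < "$tmp"
rm -f -- "$tmp"

# foo loses its trailing delimiter, nl loses its backslash,
# prop keeps the leading spaces of its continuation lines.
for k in foo ok nl prop; do
  printf '%s -> [%s]\n' "$k" "${props[$k]}"
done

# \!bang is unescaped to !bang by read, then mistaken for a comment.
bang='!bang'
if [[ -z ${props[$bang]+set} ]]; then
  echo 'the !bang property was dropped as a comment'
fi
```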
¹ IFS whitespace characters, per POSIX, are the characters classified as [:space:] in the locale (which generally includes \f but doesn't have to) and that happen to be in $IFS, though in ksh88 (on which the POSIX specification is based) and in most shells, that's still limited to SPC, TAB and NL. The only POSIX-compliant shell in that regard I found was yash. ksh93 and bash (since 5.0) also include other whitespace (such as CR, FF, VT...), but limited to the single-byte ones (beware that on some systems like Solaris, that includes the non-breaking space, which is single-byte in some locales).
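To see how a given shell classifies FF, a quick probe along these lines may help. It is only a sketch: the variable names are made up, and it relies on the fact that runs of IFS whitespace collapse into one separator while other delimiters each separate a field.

```shell
#!/usr/bin/env bash
# Split on FF only. If FF counts as IFS whitespace, the two FFs collapse
# and rest is just "y"; otherwise rest keeps the second FF.
IFS=$'\f' read -r first rest <<< $'x\f\fy'
if [[ $rest == y ]]; then
  verdict='FF is treated as IFS whitespace'
else
  verdict='FF is treated as a non-whitespace delimiter'
fi
echo "$verdict"
```

The result depends on the shell and its version, so no particular output is assumed here.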
augtool to retrieve whatever values you're interested in was another possible. IMHO putting some sort of third party parsing logic into your script is usually better than cobbling together yourself. You might be able to find something that seems to work for the data you've seen so far but it's generally safer to use logic someone else has vetted. You'd definitely want to limit the lenses you load if you were going to use augtool in a bash script or something. – Bratchley Jul 18 '16 at 15:48