Your criteria seem to be: match one or more uppercase letters or underscores contained between ${ and }.
gawk can do this on its own, or grep can simplify the pattern-matching part (but will need extra formatting afterwards).
GNU awk:
gawk -v 'RS=[$]{' -F '}' '$1 ~ /^[A-Z_]+$/ && !a[$1]++ {printf "%s=\n", $1}' FILE
- GNU awk accepts a regex for the record separator, so assigning RS=[$]{ splits the input FILE into records wherever the pattern ${ appears
- the field separator is set to }, so the first field of each record can be checked against your other criterion: nothing other than one or more of [A-Z_]
- the condition && !a[$1]++ removes duplicates: a[$1] is 0 the first time a name is seen, so the negation is true only once per name
- the printf statement appends an equals sign = to each name, to match your desired output
- also note: the first part of a file is always counted as the first record, even if it didn't begin with ${. This means that if your file began with [A-Z_]+} (unlikely), those uppercase letters/underscores would "match" and be printed on the first line of output
grep + formatting
grep is perhaps easier to understand (thanks to its -o / --only-matching option):
grep -o '${[A-Z_]\+}' FILE
- but this doesn't format the output; a pipe through sed could do that, e.g.:
grep -o '${[A-Z_]\+}' FILE | sed 's/${\(.*\)}/\1=/'
- this doesn't remove duplicates: pipe the output through sort -u to do that (which also sorts the names), or alternatively pipe once through awk, which handles both the formatting and the deduplication:
grep -o '${[A-Z_]\+}' FILE | awk -F '[{}]' '!a[$0]++{printf "%s=\n", $2}'
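The grep variants can be compared side by side with a sketch like the one below. The sample contents and variable names are made up, and LC_ALL=C is an added assumption to pin down what [A-Z] means in grep. Note that \+ in a basic regex is a GNU extension, so this assumes GNU grep.

```shell
#!/bin/sh
# Assumption: force the C locale so [A-Z] covers only ASCII uppercase letters.
LC_ALL=C
export LC_ALL

# Hypothetical sample input; the contents are an assumption for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
url=${BASE_URL}/api
key=${API_KEY} again ${BASE_URL}
skip ${lower} and ${Mixed_1}
EOF

# grep -o prints every match on its own line, duplicates included.
matches=$(grep -o '${[A-Z_]\+}' "$sample")
printf '%s\n' "$matches"

# sed strips the ${ } wrapper and appends "="; sort -u drops duplicates
# but also reorders the names alphabetically.
sorted=$(grep -o '${[A-Z_]\+}' "$sample" | sed 's/${\(.*\)}/\1=/' | sort -u)
printf '%s\n' "$sorted"

# awk does the formatting and deduplication in one step
# and keeps the names in first-seen order.
ordered=$(grep -o '${[A-Z_]\+}' "$sample" |
    awk -F '[{}]' '!a[$0]++{printf "%s=\n", $2}')
printf '%s\n' "$ordered"

rm -f "$sample"
```

The practical difference is ordering: the sort -u pipeline gives API_KEY= before BASE_URL=, while the awk pipeline keeps BASE_URL= first because it appeared first in the file.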