Extract marked strings from text file using Bash

Question

I have files in the following style. These are parametrised configuration files; the values between the # characters are replaced with real values from a database, depending on the environment.

ABC=#PARAMETER_1#:#PARAMETER_2#
SOMETHING_ELSE=#PARAMETER_1#
SOMETHING_NEW=#PARAMETER_2##PARAMETER_3#

I would like to extract from these files the values between the hash/pound (#) characters, so that I can easily identify the parameters required. There is no standard column width or anything like that; the only rule is that anything between two # characters is replaced with a value from the database.

This is the ideal cleaned, deduped output:

PARAMETER_1
PARAMETER_2
PARAMETER_3

I have seen this question, but the crucial difference is that there can be any number of variables on a particular line in my situation.

I have tagged this question with Bash, but it doesn't have to be Bash; Perl or similar would be fine. It just needs to run from the command line in Unix.

manatwork · Accepted Answer · 2012-06-11T12:07:46.003

5

As a first idea, GNU awk (the RT variable used here is a gawk extension):

awk -vRS='#[^#]+#' 'RT{gsub(/#/,"",RT);p[RT]=1}END{for(i in p)print i}' the_file

But the choice of tool may depend on the other operations you have to perform.

Explanation, as requested in the comments:

awk -vRS='#[^#]+#' '   # use /#[^#]+#/ as record separator
RT {   # record terminator not empty?
  gsub(/#/,"",RT)    # remove the # parameter delimiter markup
  p[RT]=1   # store it as key in array p
}
END {   # end of input?
  for (i in p) print i   # loop through array p and print each key
}' the_file

The essential part is the use of RT (record terminator) built-in variable:

   RT          The record terminator.  Gawk sets RT to the input text that
               matched the character or regular expression specified by
               RS.
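
Since RT is specific to gawk, here is a minimal sketch of the same extraction for other awk implementations, assuming a parameter never spans more than one line. It swaps the RS/RT technique for a match() loop over each line; the function name extract_params and the array name seen are made up for illustration.

```shell
# Hypothetical helper: list unique #NAME# parameters using only POSIX awk features.
extract_params() {
  awk '
  {
    line = $0
    # match() sets RSTART and RLENGTH for the leftmost #NAME# token in line
    while (match(line, /#[^#]+#/)) {
      name = substr(line, RSTART + 1, RLENGTH - 2)   # strip the two # delimiters
      if (!(name in seen)) {                         # print each name only once
        seen[name] = 1
        print name
      }
      line = substr(line, RSTART + RLENGTH)          # continue after this token
    }
  }' "$@"
}
```

Called as extract_params the_file, this prints each parameter once, in order of first appearance, whereas for (i in p) in the gawk version returns the keys in unspecified order.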

edited Jun 11 '12 at 12:07

answered Jun 11 '12 at 09:44

manatwork


That's amazing! Thanks - any links that explain how it works? – Rich Jun 11 '12 at 11:22

The only awk documentation I read is The GNU Awk User's Guide. – manatwork Jun 11 '12 at 12:08

Just by the way, just p[RT] is quite enough to create the list of keys, ie, it doesn't require =1, as it never needs to access any values (it's actually faster, too) ... and also by the way, very nice +1. – Peter.O Jun 11 '12 at 20:33
@Peter.O, you are right. Assigning a value there was just by reflex. – manatwork Jun 12 '12 at 05:54
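
A small sketch of the point made in the two comments above: merely referencing an array element is enough to create its key in awk, so no assignment is needed. The function name count_keys and the hard-coded keys are only for illustration.

```shell
# Hypothetical demo: bare references to array elements create the keys.
count_keys() {
  awk 'BEGIN {
    p["PARAMETER_1"]          # bare reference, no assignment
    p["PARAMETER_2"]
    p["PARAMETER_1"]          # referencing an existing key adds nothing new
    n = 0
    for (k in p) n++          # count the distinct keys now present in p
    print n
  }'
}
```

Running count_keys prints 2, so p[RT] in place of p[RT]=1 in the answer should yield the same list of keys.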
