1

With bash, I'm running this: declare -p | grep 'declare -- ' That prints whole lines. I want to print those same lines but exclude the match itself, i.e., can I do ... | grep pattern | sed 's/pattern//' as a single command? It would be like the opposite of the -o option.

My command outputs this:

...
declare -- MAX_CONNECTIONS_PER_SERVER="6"
declare -- OPTERR="1"
declare -- OSTYPE="linux-gnu"
...

But I want to output this:

...
MAX_CONNECTIONS_PER_SERVER="6"
OPTERR="1"
OSTYPE="linux-gnu"
...

Normally, I would just pipe this to sed, but coincidentally, I've wanted to do this twice today. I looked at the man page, but I didn't see any option that could do this.

Return only the portion of a line after a matching pattern is a very similar question. Perhaps it's even a duplicate. One could argue mine is a little bit more narrow: it's guaranteed that the pattern I'm grepping and the pattern I'm removing will be the same. I want to remove x. The question wants to remove .*x.

5 Answers

3

That should just be:

declare -p | sed -n 's/^declare -- //p'

But that kind of approach at parsing the output of declare -p is flawed.

What if there's a VAR=$'\ndeclare -- reboot #' in the environment for instance, and you feed that output to a shell for interpretation?

Also note that for variables that are declared but not assigned any value, bash prints declare -- VARNAME, so after declare reboot, the above code would output reboot as well.
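Both caveats can be reproduced without touching a live environment. The following is a minimal sketch that runs the same sed substitution on a hand-written sample. The sample text, the harmless echo standing in for reboot, and the names UNASSIGNED and stripped are assumptions made for illustration; real declare -p output differs between bash versions.

```bash
# Hypothetical text shaped like `declare -p` output (not taken from a real shell).
sample='declare -- VAR="
declare -- echo injected #"
declare -- UNASSIGNED
declare -- OPTERR="1"'

# Same substitution as above, applied to the sample instead of a live shell.
stripped=$(printf '%s\n' "$sample" | sed -n 's/^declare -- //p')
printf '%s\n' "$stripped"
```

The second output line is a bare command (echo injected #") and the third is a bare name (UNASSIGNED), which is why feeding this output back to a shell is risky.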

You could change it to:

declare -p | LC_ALL=C sed -n 's/^declare -- \(.*=\)/\1/p'

That restricts the output to assigned variables, but it still does not address the problem of variables that contain newlines (which is common, even if only for the $TERMCAP variable).

Instead you could use this kind of hack:

(
  export LC_ALL=C
  eval '
    declare()
      if [[ $1 = -- ]]; then
        printf "%s=%q\n" "${2%%=*}" "${2#*=}"
      fi'"
    $(declare -p | sed 's/[][()]/\\&/g')"
)

Here we evaluate the output of declare -p, after escaping the (, ), [ and ] characters used in the array and associative array representations, and after redefining declare as a function that prints its second argument if the first is --. Use at your own risk: the output of bash's declare -p has been known to be unsafe to evaluate in the past. The escaping added with sed may also introduce complications, especially with sed implementations that limit the length of the lines they accept.
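A different route, which is not the hack above but a swapped-in technique, is to avoid evaluating or line-parsing the full dump at all and to query one variable at a time. The sketch below assumes a bash with the compgen builtin available; print_plain_vars, demo_plain and demo_int are hypothetical names introduced only for this example.

```bash
#!/usr/bin/env bash
# Sketch only: list assigned variables that have no attributes, one per line.
# print_plain_vars is a hypothetical helper name, not part of bash.
print_plain_vars() {
  local name decl
  for name in $(compgen -v); do
    # Ask bash about one variable at a time instead of parsing the full dump.
    decl=$(declare -p "$name" 2>/dev/null) || continue
    # Keep only "declare -- NAME=..." (no attributes, value assigned).
    [[ $decl == "declare -- $name="* ]] || continue
    # %q quotes the value so that it fits on one line and can be read back.
    printf '%s=%q\n' "$name" "${!name}"
  done
}

demo_plain=$'two\nlines'   # scalar with an embedded newline
declare -i demo_int=3      # has the integer attribute, so it is skipped
print_plain_vars | grep '^demo_'
```

The last command should print the single line demo_plain=$'two\nlines'. Note that the function's own locals (name and decl) also show up in its full output, since they are plain assigned variables while it runs.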


In zsh, you could do:

typeset ${(k)parameters[(R)scalar]}

That prints the definitions of variables that are scalar and don't have any attribute (not even special or tied), as seems to be your intention.

Note however that it doesn't work if called within a function (in which case it would declare all those variables as local instead) or if the typesetsilent option is enabled.

Another approach which would work around that and give you more control over how the value is quoted would be:

() for 1 do print -r -- $1=${(Pq+)1}; done ${(k)parameters[(R)scalar]}

q+ gives a quoting style similar to the one used by typeset. Use qq for a safer quoting style if you intend to use that output in a shell (that may not be zsh, or not from the same version, or not in the same locale or OS/libc).

  • What if there's a VAR=$'\ndeclare -- reboot; #' Seems like this would be some type of security injection, but I don't think I have enough knowledge to understand what would execute that reboot command. – Daniel Kaplan Mar 25 '22 at 08:21
  • 1
    @DanielKaplan, sed or grep -P on the output of declare -p in that environment would output a reboot #" line, which if fed to a shell (which may be what you ultimately want to do with those variable assignments) would cause a reboot. – Stéphane Chazelas Mar 25 '22 at 08:24
  • Ahh. In that case I had enough knowledge. I thought that alone would've rebooted my machine. – Daniel Kaplan Mar 25 '22 at 08:25
  • Somewhat annoyingly, Bash prints array members quoted, e.g. a=($'new\nline'); declare -p a gives declare -a a=([0]=$'new\nline'). If it only did that to scalars too, parsing the output would be at least somewhat safer. – ilkkachu Mar 30 '22 at 10:20
  • @ilkkachu I feel like Alice (through the looking glass) — I’m seeing essentially the opposite in Bash 4.1.17.  After aaa1=$'abc\ndef'; aaa2=($'ghi\njkl'), if I do declare | grep -A1 aaa or set | grep -A1 aaa, I get the scalar quoted: aaa1=$'abc\ndef'aaa2=([0]="ghijkl").  Meanwhile, declare -p | grep -A1 aaa (or declare -p aaa1 aaa2) gives raw newlines (i.e., multi-line output) for both: declare -- aaa1="abcdef"declare -a aaa2='([0]="ghijkl")'. – G-Man Says 'Reinstate Monica' Apr 01 '22 at 01:54
  • @G-ManSays'ReinstateMonica', ok, so it's also different wrt. -p vs no option too, sigh. I get the array members quoted in newer versions anyway. – ilkkachu Apr 01 '22 at 12:54
2

If available, use grep -P:

declare -p | grep -Po 'declare -- \K.*'

Note that your approach will generally not work well, because variables can contain newlines, which grep will cut off, leaving you with syntax errors.

See e.g.:

declare -- IFS="    
"
pLumo
  • 22,565
  • FYI, that works on bash but not zsh. – Daniel Kaplan Mar 25 '22 at 06:50
  • I guess your zsh does not have pgrep, it should work the same. – pLumo Mar 25 '22 at 06:53
  • Hrm... it does have pgrep. I think the reason it doesn't work is that declare -p has different output from bash's. In hindsight, I shouldn't have even made my comment, as technically, declare -p was only used as an example input. – Daniel Kaplan Mar 29 '22 at 09:37
1

A. sed and grep

The shell displays the values of some variables on several lines because they contain embedded newlines.

grep and sed are designed to search for patterns within a single line, processing the input line by line (the newline character is used as a hard-coded delimiter).


B. Using awk

Awk can select a pattern on a line but also apply rules conditionally.

1. Select the matching lines

  • Display lines starting with the shell words declare --
/^declare --/
  • Display lines that don't begin with the word declare
!/^declare/

The two preceding rules display 1. the variables that don't have attributes and 2. the lines that continue a multi-line value.

We can use a sample input to show an overview of the pattern matching.

declare -- HOSTNAME="retro-
host"
declare -a GROUPS=()
declare -x GREETINGS="Hello
World!"
declare -i HISTCMD
declare -- PROMPT_COMMAND="printf \"\\033]0;%s@%s:%s\\007\" \"\${USER}\" \"\${HOSTNAME%%.*}\" \"\${PWD/#\$HOME/\\~}\""

The value of the variable HOSTNAME is successfully displayed because the two rules match consecutively: one rule matches in one line and the other in the next line. However, we also see that the GREETINGS variable value is partially displayed. Indeed, although the line does not start with the pattern declare --, the subsequent line (a substring of the value of the variable, c.f. World!") is displayed because the line (or "record") matches the second rule !/^declare/.

declare -- HOSTNAME="retro-
host"
World!"
declare -- PROMPT_COMMAND="printf \"\\033]0;%s@%s:%s\\007\" \"\${USER}\" \"\${HOSTNAME%%.*}\" \"\${PWD/#\$HOME/\\~}\""

Since the lines of a multi-line value are consecutive, only the lines that directly follow a matching declare -- line should be displayed.
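The filtering shown above can be reproduced with a short sketch. It assumes a shortened version of the sample input (the PROMPT_COMMAND line is left out to keep the quoting simple); two_rules and filtered are hypothetical names used only here.

```bash
# Shortened sample shaped like the `declare -p` excerpt above.
sample='declare -- HOSTNAME="retro-
host"
declare -a GROUPS=()
declare -x GREETINGS="Hello
World!"
declare -i HISTCMD'

# Apply the two selection rules exactly as stated, with no state tracking.
two_rules() {
  awk '
    /^declare --/ { print }
    !/^declare/   { print }
  '
}

filtered=$(printf '%s\n' "$sample" | two_rules)
printf '%s\n' "$filtered"
```

The stray World!" line appears in the result because the second rule has no way of knowing which declaration the line belongs to.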

2. Check the value of the shell variable

Before looking at the code, it helps to present the algorithm it uses.

First of all, you have to know that Awk checks all the rules each time it reads a record (by default, the current line). However, the program must also take into account what happened on the previous record. In other words, it needs to know its state while it processes the next line: which shell variable is being parsed? For this we define a "boolean" variable named scan.
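The role of the scan flag can be sketched in a reduced form. This is not the program developed below: it is a simplified illustration that assumes double-quoted scalar values only and treats any line ending in " as the end of a value, so it ignores arrays, escaped quotes, and values containing lines that start with declare. The names scan_demo and kept are hypothetical.

```bash
sample='declare -- HOSTNAME="retro-
host"
declare -a GROUPS=()
declare -x GREETINGS="Hello
World!"
declare -- OPTERR="1"'

scan_demo() {
  awk '
    # A wanted declaration: print it and remember whether its value is still open.
    /^declare -- / { print; scan = ($0 !~ /"$/); next }
    # Any other declaration: whatever follows does not belong to a wanted variable.
    /^declare /    { scan = 0; next }
    # A continuation line: print it only while a wanted value is still open.
    scan           { print; if ($0 ~ /"$/) scan = 0 }
  '
}

kept=$(printf '%s\n' "$sample" | scan_demo)
printf '%s\n' "$kept"
```

With the flag in place, the World!" line is no longer printed, because the GREETINGS declaration resets scan to 0 before that line is read.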

The code is undoubtedly complex. It is an ordered sequence of mutually exclusive conditional instructions (like the logical XOR). For more information, read the section "Why this code is so convoluted?".

A. The variable has no value

if ($3 !~ /=/) {
    scan = 0
    print $3
}

If the third field doesn't contain an equal sign, it is not a shell variable assignment; it is only a variable name.

B. The variable has a value

The value of a variable can contain special characters which are escaped using double quotes.

There are two types of variable assignment: the declaration of a scalar variable and the declaration of an array variable.

# A common shell variable
VARIABLE="VALUE"
# A Bash array
VARIABLE=(VALUE)

Anyway, the value is delimited by a pair of specific characters, either " and " or ( and ). If the corresponding delimiters are on the same line, then the value of the variable does not contain newline characters. Otherwise the next line must be displayed up to the corresponding delimiter.

The opening delimiter is at a specific position stored in the beg variable while the closing delimiter must be searched (using the end variable).

match($3, /=./)
beg = substr($3, RSTART + 1, 1)
match($0, /.$/)
end = substr($0, RSTART, 1)

Note: match() is a built-in Awk function that reports where the pattern matched: it sets the predefined variable RSTART to the position where the match begins and RLENGTH to its length. Here we save the character just after the equal sign in beg and the last character of the record in end.
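To see what beg and end contain in practice, the four statements above can be run on a few sample lines. This is only an illustration; the function name delimiters and the variable pairs are hypothetical, and the input lines are hand-written rather than real declare -p output.

```bash
# Print the opening delimiter (beg) and the last character (end) of each line.
delimiters() {
  awk '{
    match($3, /=./)                  # locate "=" and the character after it
    beg = substr($3, RSTART + 1, 1)  # character just after the equal sign
    match($0, /.$/)                  # locate the last character of the record
    end = substr($0, RSTART, 1)
    print beg, end
  }'
}

pairs=$(printf '%s\n' \
  'declare -- OPTERR="1"' \
  'declare -a GROUPS=()' \
  'declare -- HOSTNAME="retro-' | delimiters)
printf '%s\n' "$pairs"
```

The first two lines yield matching pairs (" " and ( )), while the third yields " followed by -, which signals that the value continues on the next line.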

a. The value contains no embedded newlines

The characters beg and end are paired delimiters.

if (match($0, /[[:alpha:]_][[:alnum:]_]*=[("].*[^\\][")]$/)) {
    scan = 0
    if (beg == "(") {

If the second and third tests are false, scan stays at 0, which tells the program not to scan the next line. Indeed, in Awk the value 0 is equivalent to the truth value "false".

In concrete terms, this means that lines containing substrings that are not associated with the selected patterns (wanted variables) are not displayed. If we take the example of the section "1. Select the matching lines", this prevents the substring World!", which is part of the value of GREETINGS (declare -x), from being displayed.

b. The value contains embedded newlines

if (beg == "(") {
    if (end != ")") {
        scan = 1
    }
}

This conditional instruction indicates that the value is not yet correctly delimited. Therefore, the closing delimiter is necessarily located on another line (a subsequent line that directly follows the current one). In this case, scan = 1 tells the program to scan the next line: scan becomes "true" in a logical test.

else {
    scan = 0
    if (match($0, /[[:alpha:]_][[:alnum:]_]*=.*$/)) {
        scan = 1
    }
}

Strictly speaking, assigning 0 to scan here is not functionally useful, but it reminds the reader that this variable is 0 by default. In fact, we have the same "sequence" in the corresponding if part, but there the assignment scan = 0 is really useful (see the section "The value contains no embedded newlines").

C. Display the next (next...) line

;; TBD

Why this code is so convoluted?

This code is convoluted because it checks the various combinations of the delimiters. This is not complicated, nonetheless the tests are specific.

The Awk program

/^declare --/ {
    # Display shell variables with no value
    if ($3 !~ /=/) {
        scan = 0
        print $3
    }
    else {
        # Check if the value spread on several lines
        match($3, /=./)
        beg = substr($3, RSTART + 1, 1)
        match($0, /.$/)
        end = substr($0, RSTART, 1)
        if (match($0, /[[:alpha:]_][[:alnum:]_]*=[("].*[^\\][")]$/)) {
            scan = 0
            if (beg == "(") {
                if (end != ")") {
                    scan = 1
                }
            }
            else if (beg == "\"") {
                if (end != "\"") {
                    scan = 1
                }
            }
        }
        else {
            scan = 0
            if (match($0, /[[:alpha:]_][[:alnum:]_]*=.*$/)) {
                scan = 1
            }
        }
        print substr($0, RSTART, RLENGTH)
    }
}

Display the multi-line value of a matching pattern

!/^declare/ && scan {
    # Check if this is the last substring of the variable value
    if ($0 ~ /[^\\][")]$/ || $0 ~ /\\[")]$/ || $0 ~ /^[")]$/) {
        match($0, /.$/)
        end = substr($0, RSTART, 1)
        if (end == ")") {
            if (beg == "(") {
                scan = 0
            }
        }
        else if (end == "\"") {
            if (beg == "\"") {
                scan = 0
            }
        }
    }
    print $0
}

gilaro
  • 56
  • This answer might be better appreciated if you (1) explained more what it is doing.  (2) Showed the input that goes with the output you show.  In particular, the output for FOO= is so ugly, I have no idea whether it’s correct.  (3) Shortened the examples.  AFAICS, you have five cases: (a) no data, (b) scalar, all on one line, (c) scalar with embedded newlines; i.e., multi-line, (d) array, all on one line, and (e) array with embedded newlines.  OK, maybe it becomes nine if you count backslash cases separately.  But why do you need >25 identifiers in your example?  … (Cont’d) – G-Man Says 'Reinstate Monica' Apr 04 '22 at 08:03
  • (Cont’d) …  If you are really handling that many test cases, please explain how they differ and what makes them noteworthy / test-worthy.  (3f) Explained what those things with no value (no =) are.  (4) Explained why you need to use LC_ALL=C. … … … … … … … … … … … … … … … … … … … … … … … … … Also, it might help if you wrote a function to do the beg-end test, and maybe coded it if ((beg == "(" && end == ")") || (beg == "\"" && end == "\"")) return(1) for brevity (without loss of clarity, IMO).  Or at least explain why your current beg-end code is so convoluted. – G-Man Says 'Reinstate Monica' Apr 04 '22 at 08:03
  • @G-ManSays'ReinstateMonica': I would have to simplify things but I would have liked to start from the beginning using lettered programming (except I don't know how to do that). – gilaro Apr 04 '22 at 18:35
1

You can do it easily with Perl:

declare -p | perl -lne'/^declare -- (.*)/ && print $1'

If you use ack, you can do this:

declare -p | ack '^declare -- (.*)' --output='$1'
-2
declare -p |awk -F "--" '/declare --/{print $NF}'

output

 MAX_CONNECTIONS_PER_SERVER="6"
 OPTERR="1"
 OSTYPE="linux-gnu"
  • 1
    Try this with any variable whose value contains --: TEST=hello--world for example. (Variables containing options are somewhat common.) – Stephen Kitt Mar 31 '22 at 07:57
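The failure mode described in the comment above can be reproduced on a single hand-written line. The sketch also shows one possible alternative, which is an assumption on my part rather than something proposed in the thread: anchor the match and cut the fixed 11-character prefix with substr() instead of splitting on --. The variable names are hypothetical, and like the other line-based approaches it still does not handle multi-line values.

```bash
line='declare -- TEST="hello--world"'

# Splitting on "--" keeps only what follows the last "--" in the line.
split_on_dashes=$(printf '%s\n' "$line" | awk -F "--" '/declare --/{print $NF}')

# Cutting the fixed prefix "declare -- " (11 characters) keeps the whole assignment.
strip_prefix=$(printf '%s\n' "$line" | awk '/^declare -- /{print substr($0, 12)}')

printf '%s\n' "$split_on_dashes" "$strip_prefix"
```

The first result is truncated to world", while the second keeps TEST="hello--world" intact.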