grep: How to find Closing Bracket?

Question

One part of /usr/share/X11/xkb/symbols/us starts with xkb_symbols "dvorak" { and ends with the closing curly bracket };. I want to find the line number of that closing bracket.

partial alphanumeric_keys
xkb_symbols "dvorak" {

    name[Group1]= "English (Dvorak)";

    key <TLDE> { [       grave, asciitilde, dead_grave, dead_tilde      ] };

    key <AE01> { [          1,  exclam          ]       };
    key <AE02> { [          2,  at              ]       };
    key <AE03> { [          3,  numbersign      ]       };
    key <AE04> { [          4,  dollar          ]       };
    key <AE05> { [          5,  percent         ]       };
    key <AE06> { [          6,  asciicircum, dead_circumflex, dead_circumflex ] };
    key <AE07> { [          7,  ampersand       ]       };
    key <AE08> { [          8,  asterisk        ]       };
    key <AE09> { [          9,  parenleft,  dead_grave] };
    key <AE10> { [          0,  parenright      ]       };
    key <AE11> { [ bracketleft, braceleft       ]       };
    key <AE12> { [ bracketright, braceright,  dead_tilde] };

    key <AD01> { [  apostrophe, quotedbl, dead_acute, dead_diaeresis    ] };
    key <AD02> { [      comma,  less,   dead_cedilla, dead_caron        ] };
    key <AD03> { [      period, greater, dead_abovedot, periodcentered  ] };
    key <AD04> { [          p,  P               ]       };
    key <AD05> { [          y,  Y               ]       };
    key <AD06> { [          f,  F               ]       };
    key <AD07> { [          g,  G               ]       };
    key <AD08> { [          c,  C               ]       };
    key <AD09> { [          r,  R               ]       };
    key <AD10> { [          l,  L               ]       };
    key <AD11> { [      slash,  question        ]       };
    key <AD12> { [      equal,  plus            ]       };

    key <AC01> { [          a,  A, adiaeresis, Adiaeresis ]     };
    key <AC02> { [          o,  O               ]       };
    key <AC03> { [          e,  E               ]       };
    key <AC04> { [          u,  U               ]       };
    key <AC05> { [          i,  I               ]       };
    key <AC06> { [          d,  D               ]       };
    key <AC07> { [          h,  H               ]       };
    key <AC08> { [          t,  T               ]       };
    key <AC09> { [          n,  N               ]       };
    key <AC10> { [          s,  S               ]       };
    key <AC11> { [      minus,  underscore      ]       };

    key <AB01> { [   semicolon, colon, dead_ogonek, dead_doubleacute ] };
    key <AB02> { [          q,  Q               ]       };
    key <AB03> { [          j,  J               ]       };
    key <AB04> { [          k,  K               ]       };
    key <AB05> { [          x,  X               ]       };
    key <AB06> { [          b,  B               ]       };
    key <AB07> { [          m,  M               ]       };
    key <AB08> { [          w,  W               ]       };
    key <AB09> { [          v,  V               ]       };
    key <AB10> { [          z,  Z               ]       };

    key <BKSL> { [  backslash,  bar             ]       };
};

I can find the line number where the block starts (192) and save it to a file:

grep -n 'xkb_symbols "dvorak"' /usr/share/X11/xkb/symbols/us | cut -d ":" -f1 > /tmp/lineNumberStartEnvironment

I tried the following, but the output is blank:

# http://unix.stackexchange.com/a/147664/16920
grep -zPo 'pin\(ABC\) (\{([^{}]++|(?1))*\})' /usr/share/X11/xkb/symbols/us

Pseudocode

Go to the line number stored in the file /tmp/lineNumberStartEnvironment.
Find the closing bracket of the block that starts at that line.
- This should work on the excerpt shown above and also on the complete file /usr/share/X11/xkb/symbols/us.
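
The pseudocode above can be sketched roughly as follows. This is only an illustration: the sample file and the variable names are made up, and it assumes the block always closes with }; in the first column, as in the excerpt.

```shell
# Hypothetical stand-in for /usr/share/X11/xkb/symbols/us.
sample=$(mktemp)
cat > "$sample" <<'EOF'
partial alphanumeric_keys
xkb_symbols "basic" {
    key <AE01> { [ 1, exclam ] };
};

partial alphanumeric_keys
xkb_symbols "dvorak" {
    key <AE01> { [ 1, exclam ] };
};
EOF

# Step 1: line number where the "dvorak" block starts (same grep as above).
start=$(grep -n 'xkb_symbols "dvorak"' "$sample" | cut -d ":" -f1)

# Step 2: first line at or after $start that begins with "};".
end=$(awk -v start="$start" 'NR >= start && /^};/ { print NR; exit }' "$sample")

echo "start = $start, end = $end"
rm -f "$sample"
```

Starting the search at the known line number avoids matching the }; of an earlier block in the same file.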

Attempt for heredoc until next line [cas, Kusalananda]

I tried this, although I am not sure what to use as the here-document delimiter; with -n the output is blank too:

sed -n -f - /usr/share/X11/xkb/symbols/us <<END_SED | cut -f1
/xkb_symbols "dvorak" {/,/^};/{
        /xkb_symbols "dvorak" {/=
        /^};/=
}
END_SED

but the output is blank.

Systems: Ubuntu 16.04
Grep: 2.25

So you're looking for the line number of the line that has just the closing brace for the xkb_symbols "dvorak" block, is that right? — Eric Renouf, Jun 17 '16 at 13:15
Yes. I am looking for the line number of the closing bracket, as you say. — Léo Léopold Hertz 준영, Jun 17 '16 at 13:22
Need details on your question:
1) Do you need to find the line number of the closing bracket "};" which belongs to the opening bracket for line starting with "xkb_symbols" ?
2) Will your file be always at the same level of indentation as the one you posted - i.e - will the closing bracket always be at the first column of indentation? What other possible contents of the input file can there be? This will help in providing you a generic bash shell solution to your problem. — gawkface, Jun 17 '16 at 13:17
What do you want to do with the closing bracket? I would use vim . . . da{ to cut whatever is in the brackets. — Law29, Jun 17 '16 at 20:12

Kusalananda · Accepted Answer · 2016-06-19T07:01:22.270

5

This sed script prints the line number of the line matching /^};/ in the range of lines from /xkb_symbols "dvorak" {/ to the next /^};/ (which will be the same }; as the one we get the line number for):

/xkb_symbols "dvorak" {/,/^};/{
        /^};/=
}

If you need both the start and end line numbers, use this script instead (saved here as tiny_script.sed):

/xkb_symbols "dvorak" {/,/^};/{
        /xkb_symbols "dvorak" {/=
        /^};/=
}

$ sed -n -f tiny_script.sed /usr/share/X11/xkb/symbols/us
192
248
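
To try the script without the real symbols file, here is a small self-contained sketch. The sample file below is invented for illustration, so its line numbers (8 and 12) differ from the 192 and 248 shown above; the sed script itself is the same, only passed inline instead of via -f.

```shell
# Hypothetical stand-in for the symbols file, with two stanzas.
sample=$(mktemp)
cat > "$sample" <<'EOF'
partial alphanumeric_keys
xkb_symbols "basic" {
    name[Group1]= "English (US)";
    key <AE01> { [ 1, exclam ] };
};

partial alphanumeric_keys
xkb_symbols "dvorak" {
    name[Group1]= "English (Dvorak)";
    key <AE01> { [ 1, exclam ] };
    key <AE02> { [ 2, at ] };
};
EOF

# Within the range from the "dvorak" header to the next line starting
# with "};", print the line number (=) of the header and of the "};" line.
numbers=$(sed -n '
/xkb_symbols "dvorak" {/,/^};/{
    /xkb_symbols "dvorak" {/=
    /^};/=
}' "$sample")

printf '%s\n' "$numbers"
rm -f "$sample"
```

The }; of the "basic" stanza on line 5 is not reported, because the range only opens at the "dvorak" header.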

Alternatively:

$ sed -n -f - /usr/share/X11/xkb/symbols/us <<END_SED
/xkb_symbols "dvorak" {/,/^};/{
        /xkb_symbols "dvorak" {/=
        /^};/=
}
END_SED

EDIT: To get these two numbers in a variable, assuming you're using Bash:

pos=( $( sed -n -f - /usr/share/X11/xkb/symbols/us <<END_SED
        /xkb_symbols "dvorak" {/,/^};/{
                /xkb_symbols "dvorak" {/=
                /^};/=
        }
END_SED
) )

echo "start = " ${pos[0]}
echo "end   = " ${pos[1]}
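
If Bash arrays are not available, or you want to use the numbers straight away, a plain sh variant could look like the sketch below. The sample file and the variable names start, end and block are assumptions made for the example.

```shell
# Hypothetical stand-in for the symbols file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
xkb_symbols "basic" {
    key <AE01> { [ 1, exclam ] };
};

xkb_symbols "dvorak" {
    key <AE01> { [ 1, exclam ] };
    key <AE02> { [ 2, at ] };
};
EOF

# Split the two numbers printed by sed into the positional parameters.
set -- $(sed -n '
/xkb_symbols "dvorak" {/,/^};/{
    /xkb_symbols "dvorak" {/=
    /^};/=
}' "$sample")
start=$1
end=$2

# Use the numbers as a sed address range to print only the block itself.
block=$(sed -n "${start},${end}p" "$sample")
printf '%s\n' "$block"
rm -f "$sample"
```

The unquoted command substitution after set -- is deliberate: word splitting is what separates the two line numbers.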

Also, hi! Another Dvorak user!

edited Jun 19 '16 at 07:01

answered Jun 17 '16 at 13:21

@Masi Rather than using line numbers, the sed script looks for the start of the "dvorak" section. – Kusalananda Jun 17 '16 at 13:29
@Masi, put all your code for that question in a convenient script, and feed the sed script to sed via a here-document in the script. – Kusalananda Jun 17 '16 at 13:36
@Masi see modified answer – Kusalananda Jun 17 '16 at 13:40
sed ... >outputfile <<END_SED, is that what you're after? – Kusalananda Jun 17 '16 at 14:59
Or sed ... <<END_SED | somepipe – Kusalananda Jun 17 '16 at 15:02
Too many >? I don't understand. Ah, you are typing it interactively? The > you're probably seeing is the secondary prompt which is used when a single command is carried over on multiple lines. The command will be executed as soon as you type END_SED after the sed script. – Kusalananda Jun 17 '16 at 15:15
If you want to use a heredoc AND pipe the output to another command, the pipe has to be on the same line as the command taking stdin from the heredoc. e.g. sed ... <<END_SED | somepipe args .... The heredoc doesn't begin until the next line. – cas Jun 18 '16 at 02:22
@Masi He means that the contents of the here-doc, i.e. that which is passed to sed as its script in this case, do not start until the line after the one containing <<END_SED, which means that to pipe the output from sed, you just add the pipe after (and on the same line as) <<END_SED. [oh, you withdrew that comment, oh well] – Kusalananda Jun 19 '16 at 06:28
@Masi I'm tempted to give up on this. It's perfectly simple. sed -n -f - <<END_SED | whatever-pipe-command(s)-you-want followed by the contents of the sed script on the next few lines (either the first script that I wrote for the end line number, or the other one that gives you both start and end line), followed by END_SED. I don't see what the problem is anymore. – Kusalananda Jun 19 '16 at 06:34
@Masi, I finally understood what it was you wanted (I hope). See updated answer. – Kusalananda Jun 19 '16 at 07:02
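
The here-document-plus-pipe form discussed in these comments can be sketched as follows. It assumes GNU sed, which accepts -f - to read the script from standard input as in the answer; the sample file and the use of tail to keep only the closing line number are choices made for this example.

```shell
# Hypothetical stand-in for the symbols file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
xkb_symbols "basic" {
    key <AE01> { [ 1, exclam ] };
};

xkb_symbols "dvorak" {
    key <AE01> { [ 1, exclam ] };
    key <AE02> { [ 2, at ] };
};
EOF

# The pipe sits on the same line as <<'END_SED'; the here-document body
# (the sed script) begins on the next line and ends at END_SED.
end_line=$(sed -n -f - "$sample" <<'END_SED' | tail -n 1
/xkb_symbols "dvorak" {/,/^};/{
    /xkb_symbols "dvorak" {/=
    /^};/=
}
END_SED
)

echo "closing line = $end_line"
rm -f "$sample"
```

Quoting the delimiter ('END_SED') keeps the shell from expanding anything inside the sed script.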

cas · Answer 2 · 2016-06-17T14:35:14.547

4

#! /usr/bin/awk -f

/"dvorak"/ {dvorak++};

/{/ && dvorak {b++} ;

/}/ && dvorak {b--} ;

dvorak && b == 0 && NR > 1 {
    print NR;
    exit
}

$ ./find-dvorak.awk /usr/share/X11/xkb/symbols/us
248

This uses a counter (b) which gets incremented every time it sees an open-curly-bracket { and decremented whenever it sees a close-curly-bracket }. It also uses a flag variable (dvorak) to know if it is inside the "dvorak" stanza or not.

Once the script is inside the stanza and b has returned to 0 (and the line number is greater than one), it prints the line number and exits.

BUGS: This does not account for commented-out brackets or those embedded in strings.
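
One possible way to soften the first of these limitations is sketched below. It is not part of the script above: it assumes comments start with // and run to the end of the line, it counts every brace on a line rather than at most one, and it still ignores braces (or //) inside strings. The sample file and the variable names are invented for the example.

```shell
# Hypothetical sample with a commented-out, unbalanced brace on line 9.
sample=$(mktemp)
cat > "$sample" <<'EOF'
partial alphanumeric_keys
xkb_symbols "basic" {
    key <AE01> { [ 1, exclam ] };
};

partial alphanumeric_keys
xkb_symbols "dvorak" {
    name[Group1]= "English (Dvorak)";
    // key <AE03> { [ 3, numbersign ]
    key <AE01> { [ 1, exclam ] };
};
EOF

close_line=$(awk -v search=dvorak '
# Start tracking at the header line of the requested stanza.
!found && $0 ~ ("xkb_symbols[ \t]+\"" search "\"") { found = 1 }
found {
    line = $0
    sub(/\/\/.*/, "", line)            # drop a trailing // comment
    opened = gsub(/[{]/, "{", line)    # number of { left on this line
    closed = gsub(/[}]/, "}", line)    # number of } left on this line
    depth += opened - closed
    if (opened) seen = 1               # the stanza has opened at least once
    if (seen && depth == 0) { print NR; exit }
}' "$sample")

echo "closing line = $close_line"
rm -f "$sample"
```

Without the sub() call, the unmatched { in the comment on line 9 would leave the counter above zero and nothing would be printed.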

If you want the line numbers of the opening AND closing brackets:

#! /usr/bin/awk -f

/"dvorak"/ {dvorak++};

/{/ && dvorak {
    b++;
    if (!first++) {
        print NR
    }
} ;

/}/ && dvorak {b--} ;

dvorak && b == 0 && NR > 1 {
    print NR;
    exit
}

$ ./find-dvorak2.awk /usr/share/X11/xkb/symbols/us
192
248

Here's a version that allows you to search for any xkb_symbols stanza:

#! /usr/bin/awk -f

match($0,"xkb_symbols.*\""search"\"")  {found++};

/{/ && found {
    b++;
    if (!first++) {
        print NR
    }
} ;

/}/ && found {b--} ;

found && b == 0 && NR > 1 {
    print NR;
    exit
}

$ ./find-xkb_symbols.awk -v search=dvorak-intl /usr/share/X11/xkb/symbols/us
255
314

edited Jun 17 '16 at 14:35

answered Jun 17 '16 at 13:22

There's no reason why you can't do it with sed. I think you've misinterpreted @Kusalananda's answer (there is no > before the END_SED, it's the end of a heredoc, not a redirection). For simple things like this, awk vs sed is often a matter of personal preference. I like the verbose readability and procedural style of awk. It's easier to read and understand months later. Other times, whether I use sed or awk (or perl) depends on which one I happen to think of first, or which one seems more fitting for the job. – cas Jun 17 '16 at 14:05
also, it would be trivially easy to modify either of the above scripts to search for any stanza, just add -v search=dvorak-intl to the command line (before any filenames) when running the script, change the /"dvorak"/ {dvorak++}; line to match($0,"\""search"\"") {found++}; and change all other occurrences of the dvorak variable to found. – cas Jun 17 '16 at 14:13
i wouldn't use this script to insert a line. its job is to find a line number. I'd put that line number into a variable (e.g. $l), and do something like printf "%s\n" "${l}i" 'level3(ralt_switch' w | ed filename – cas Jun 17 '16 at 14:20
you want to insert a line before line 69684? use 69684i in the printf ... | ed ... example above. if you're using GNU ed, see info ed or pinfo ed for full docs (the GNU ed man page is not much more than a referral to the info docs) – cas Jun 17 '16 at 14:29
One final comment. change both print NR lines to print NR": "$0; to print the matching lines as well as the line numbers. – cas Jun 17 '16 at 14:33
for info about editing files in-place with printf and ed: http://unix.stackexchange.com/questions/287593/overwrite-specific-line-in-file-2-with-content-of-file-1/287618#287618 or http://unix.stackexchange.com/questions/281492/how-to-add-a-new-text-line-at-the-first-line-of-a-file/281501#281501 BTW I forgot the . before the w in my comment above. my mistake, it's required to terminate the insert. – cas Jun 17 '16 at 14:40
I did not understand your printf ... ed example, and I cannot tell what output you expect, e.g. when you run it on /usr/share/X11/xkb/symbols/us. What is your expected output? – Léo Léopold Hertz 준영 Jun 17 '16 at 14:54
The printf | ed is just a way to make a scripted edit of a file. It's unrelated (in fact, irrelevant) to your actual question. The two links I posted (and many others searchable on this site) explain what the printf | ed does and how it works. If you have a specific question about it, then post a new question. It's 1am here, so I'm going to get some sleep...if nobody's answered your new question tomorrow, i'll probably get to it. – cas Jun 17 '16 at 15:03
