How to extract a word that occurs between two keywords on a line?

Question

Suppose I have the code shown below.

module dut#(parameter type tp =int, tp x = 12 ) (int r , reg [7:0] rg);
endmodule
module mid (int r, reg [7:0] rg);
endmodule
module bin (int z, logix s);
endmodule
module med;
endmodule

I want to extract the words dut, mid, bin and med, characterized by being the words after the keyword module and before the symbol #, ( or ;, whichever comes first.

I want to accomplish this using only a csh script. Which regex can be used for that purpose?

So you also want med in module med; according to the answer you accepted? – αғsнιη, Jan 21 '21 at 10:33
Yes, I also want to include module med, because this adds another condition, i.e. the semicolon (;). – Aakash, Jan 21 '21 at 11:03

score 2 · Answer 1 · answered Jan 20 '21 at 11:11

Using (gnu)grep:

 grep -Po 'module +\K\w+' file
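
As a rough sketch of what this does: -o prints only the matched text, -P enables Perl-compatible regular expressions, and \K drops everything matched so far (here "module" and the spaces) from the reported match, so only the \w+ name remains. The snippet assumes GNU grep built with PCRE support and is written for a POSIX shell purely for illustration; the sample variable is a made-up stand-in for the file.

```shell
# Hypothetical two-line stand-in for the question's file.
sample='module dut#(parameter type tp =int, tp x = 12 ) (int r , reg [7:0] rg);
module med;'

# \K discards "module" and the spaces from the match; -o prints what is left.
names=$(printf '%s\n' "$sample" | grep -Po 'module +\K\w+')
printf '%s\n' "$names"
```

Note that this pattern is not anchored to the start of the line, so it would also pick up "module " appearing later on a line.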

JJoao

AdminBee · Answer 2 · 2021-01-21T11:07:27.517

Another solution using sed:

$ sed -nE 's/^module +([^ (#;]+) *[#(;].*$/\1/p' filename
dut
mid
bin
med

This will extract the module name by replacing (s) the entire line with the expression found in the parentheses.

Currently, it looks for lines that start with "module" (^module), followed by one or more spaces ( +), and then by a string of one or more characters that are not a space, (, # or ;. This string is placed in a "capture group" because its specification [^ (#;]+ is enclosed in parentheses ( ... ). For a line to be considered a match, the regular expression then requires zero or more spaces ( *), then one of #, ( or ; ([#(;]), and then any number of any characters up to the end of the line (.*$).
If a match is found, the replacement is printed (p); the -n option ensures that lines without a match are not printed by default.
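
The walk-through above can be tried out with a small sketch like the following. It is written for a POSIX shell only for illustration; extract_modules is a hypothetical helper name, and sample is a copy of the question's input held in a variable instead of a file.

```shell
# Hypothetical helper: reads module declarations on stdin, prints the names.
extract_modules() {
    # -n suppresses default output; p prints only lines where s/// matched.
    sed -nE 's/^module +([^ (#;]+) *[#(;].*$/\1/p'
}

# Copy of the input from the question.
sample='module dut#(parameter type tp =int, tp x = 12 ) (int r , reg [7:0] rg);
endmodule
module mid (int r, reg [7:0] rg);
endmodule
module bin (int z, logix s);
endmodule
module med;
endmodule'

names=$(printf '%s\n' "$sample" | extract_modules)
printf '%s\n' "$names"
```

The endmodule lines are skipped because of the ^ anchor, and module med; is picked up because ; is part of the terminating character class.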

If you want to learn more about regular expressions, take a look here e.g.

score 0 · Accepted Answer · answered Jan 20 '21 at 11:09

I'm not sure what you mean by "using only a csh script". Can you use standard programs? If yes, then:

The simplest solution for me is to use grep and awk.

grep "module \w*" -o filename | awk '{print $2}'
dut
mid
bin
med

Alex Baranowski

If you cannot use grep or awk for some reason, then I will try to find another solution. – Alex Baranowski Jan 20 '21 at 11:10
Alex, I can't extract a word directly by passing dut, mid or bin to grep. It must be generalized because the names may differ. The only fixed things in the test cases are module, # and (, so I want the text between module and # or (. – Aakash Jan 20 '21 at 11:31
So it works as expected. – Alex Baranowski Jan 20 '21 at 11:32

score 0 · Answer 4 · answered Jan 21 '21 at 10:28

With awk:

$ awk -F'[ \t#(;]+' '/^module/{ print $2}' infile
dut
mid
bin
med

This prints the second field of every line that starts with module, where the field separator is any run (+) of space, tab (\t), #, ( or ; characters.
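
To see how that separator splits a line, here is a minimal sketch (POSIX shell, for illustration only) that prints the first three fields of the first declaration from the question, joined with | characters; the variable names line and fields are made up for the example.

```shell
# First declaration from the question, used as a one-line test input.
line='module dut#(parameter type tp =int, tp x = 12 ) (int r , reg [7:0] rg);'

# With FS set to runs of space, tab, #, ( or ;, the keyword is $1 and the name is $2.
fields=$(printf '%s\n' "$line" | awk -F'[ \t#(;]+' '{ print $1 "|" $2 "|" $3 }')
printf '%s\n' "$fields"
```

Because "#(" between dut and parameter counts as a single separator run, the module name always lands in $2 regardless of which terminator follows it.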

αғsнιη
