I would like to extract the commands from arbitrary shell scripts. I've used morbig (hat tip to Michael Homer for the suggestion!) to generate a JSON file from a shell script.
As an example, this shell script:
#!/bin/sh
echo hi
false || echo something
true && echo something
results in the following JSON:
[
"Program_LineBreak_CompleteCommands_LineBreak",
[ "LineBreak_Empty" ],
[
"CompleteCommands_CompleteCommands_NewlineList_CompleteCommand",
[
"CompleteCommands_CompleteCommands_NewlineList_CompleteCommand",
[
"CompleteCommands_CompleteCommand",
[
"CompleteCommand_CList",
[
"CList_AndOr",
[
"AndOr_Pipeline",
[
"Pipeline_PipeSequence",
[
"PipeSequence_Command",
[
"Command_SimpleCommand",
[
"SimpleCommand_CmdName_CmdSuffix",
[
"CmdName_Word",
[ "Word", "echo", [ [ "WordName", "echo" ] ] ]
],
[
"CmdSuffix_Word",
[ "Word", "hi", [ [ "WordName", "hi" ] ] ]
]
]
]
]
]
]
]
]
],
[ "NewLineList_NewLine" ],
[
"CompleteCommand_CList",
[
"CList_AndOr",
[
"AndOr_AndOr_OrIf_LineBreak_Pipeline",
[
"AndOr_Pipeline",
[
"Pipeline_PipeSequence",
[
"PipeSequence_Command",
[
"Command_SimpleCommand",
[
"SimpleCommand_CmdName",
[
"CmdName_Word",
[ "Word", "false", [ [ "WordName", "false" ] ] ]
]
]
]
]
]
],
[ "LineBreak_Empty" ],
[
"Pipeline_PipeSequence",
[
"PipeSequence_Command",
[
"Command_SimpleCommand",
[
"SimpleCommand_CmdName_CmdSuffix",
[
"CmdName_Word",
[ "Word", "echo", [ [ "WordName", "echo" ] ] ]
],
[
"CmdSuffix_Word",
[
"Word",
"something",
[ [ "WordName", "something" ] ]
]
]
]
]
]
]
]
]
]
],
[ "NewLineList_NewLine" ],
[
"CompleteCommand_CList",
[
"CList_AndOr",
[
"AndOr_AndOr_AndIf_LineBreak_Pipeline",
[
"AndOr_Pipeline",
[
"Pipeline_PipeSequence",
[
"PipeSequence_Command",
[
"Command_SimpleCommand",
[
"SimpleCommand_CmdName",
[
"CmdName_Word",
[ "Word", "true", [ [ "WordName", "true" ] ] ]
]
]
]
]
]
],
[ "LineBreak_Empty" ],
[
"Pipeline_PipeSequence",
[
"PipeSequence_Command",
[
"Command_SimpleCommand",
[
"SimpleCommand_CmdName_CmdSuffix",
[
"CmdName_Word",
[ "Word", "echo", [ [ "WordName", "echo" ] ] ]
],
[
"CmdSuffix_Word",
[ "Word", "something", [ [ "WordName", "something" ] ] ]
]
]
]
]
]
]
]
]
],
[ "LineBreak_Empty" ]
]
I would like to see output along the lines of:
echo
false
echo
true
echo
... ignoring for now any parameters, options, and arguments to the base commands. The order of the outputted commands does not matter. Bonus points if it's easy to make them unique before being output (saving a | sort -u afterwards).
I've gotten as far as:
< simple.json jq flatten | grep -A2 CmdName_Word
but this feels like the wrong approach. I want to tell jq to give me the word that follows "Word" that follows "CmdName_Word", but I don't know how to do that.
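To make the goal concrete, here is a minimal sketch of that traversal written in Python rather than jq, under the assumption that every node in the JSON is an array whose first element is its tag, as in the output above. The helper name `command_names` and the `sample` fragment are hypothetical: the fragment is hand-trimmed from the tree shown earlier (intermediate nodes removed), not real morbig output.

```python
import json


def command_names(node):
    """Yield the word stored under every "CmdName_Word" node, depth first."""
    if not isinstance(node, list):
        return  # strings (tags, words) have no children to visit
    if len(node) >= 2 and node[0] == "CmdName_Word":
        word = node[1]  # assumed shape: ["Word", "<text>", [...]]
        if isinstance(word, list) and len(word) >= 2 and word[0] == "Word":
            yield word[1]
    for child in node:
        yield from command_names(child)


# Hand-trimmed fragment for illustration only (not actual morbig output).
sample = json.loads("""
[ "AndOr_AndOr_OrIf_LineBreak_Pipeline",
  [ "SimpleCommand_CmdName",
    [ "CmdName_Word", [ "Word", "false", [ [ "WordName", "false" ] ] ] ] ],
  [ "LineBreak_Empty" ],
  [ "SimpleCommand_CmdName_CmdSuffix",
    [ "CmdName_Word", [ "Word", "echo", [ [ "WordName", "echo" ] ] ] ],
    [ "CmdSuffix_Word",
      [ "Word", "something", [ [ "WordName", "something" ] ] ] ] ] ]
""")

names = list(command_names(sample))  # in document order, duplicates kept
unique_names = sorted(set(names))    # de-duplicated, like a trailing sort -u
print("\n".join(unique_names))
```

Words under "CmdSuffix_Word" are skipped because only the "CmdName_Word" tag is matched, which is the behaviour I'm after; an equivalent jq filter would need the same "match the tag, then take the second element of the child" logic.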
If you'd like to reproduce these steps locally (extracted from https://github.com/colis-anr/morbig):
(install docker per your OS)
docker pull colisanr/morbig:latest
define a shell function for ease of use:
morbig () {
    D=$(cd "$(dirname "$1")"; pwd)
    B=$(basename "$1")
    docker run \
        -v "$D":/mnt \
        colisanr/morbig:latest --as simple /mnt/"$B"
}
ensure the directory that contains the shell script is writable by UID 1000 (the docker container runs as user "opam" inside the container, which has UID 1000).
morbig your-shell-script-here.sh
the resulting your-shell-script-here.sh.sjson JSON will be in the same directory as the shell script.
jq command from your answer (piped to sort -u) extracted 166 unique commands. This toolchain has been a huge help to this part of my project. – Jeff Schaller Apr 08 '22 at 19:58