Input----
System:root,bin,user,saaa
Displayed output----
System,root
System,bin
System,user
System,says
How can I get this output?
I would recommend that you use perl, but since you specified shell scripting...
Step 1: split your line into two parts, based on the : character. Use the cut command or the ${parameter#word} and ${parameter%word} constructs.
Step 2: split the second part of your line into multiple parts, based on the , character. Use the awk command; it should tell you how many pieces you will have. (I'm not an awk expert, so I'm not sure that this will work the way I envision it.)
Step 3: cycle through the various parts you get from Step 2, attach them to the first part from Step 1, and print.
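The three steps above can be sketched roughly as follows. This is only an illustration: the function name split_line is made up for the example, and it assumes the first part contains no backslashes (awk -v interprets escape sequences in the assigned value).

```shell
# Hypothetical helper: prints "first,part" for each comma-separated part.
split_line() {
  line=$1
  # Step 1: split on the first ':' with cut.
  first=$(printf '%s\n' "$line" | cut -d: -f1)
  rest=$(printf '%s\n' "$line" | cut -d: -f2-)
  # Step 2: awk's split() returns the number of pieces (n).
  # Step 3: loop over the pieces and prefix each one with the first part.
  printf '%s\n' "$rest" |
    awk -v first="$first" '{
      n = split($0, parts, ",")
      for (i = 1; i <= n; i++) print first "," parts[i]
    }'
}

split_line 'System:root,bin,user,saaa'
# System,root
# System,bin
# System,user
# System,saaa
```

The ${parameter%%word} and ${parameter#word} expansions could replace the two cut calls and avoid spawning extra processes.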
If we can assume that your input lines contain exactly one colon (:), that no comma (,) may appear before the colon and that neither commas nor colons may be part of the extracted substrings (not even escaped), then a simple awk script may be enough:
$ printf '%s\n' 'System:one,two,three' |
awk -v FS=':|,' '{ for (i=2;i<=NF;i++) { print $1","$i } }'
Output:
System,one
System,two
System,three
The field separator FS is an extended regular expression that will split on every character that is : or ,.
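To see how such a separator numbers the fields, a small sketch may help. The function name show_fields is invented for this example, and it uses the bracket expression [,:], which should behave the same as ':|,' here.

```shell
# Hypothetical helper: prints each field with its index, splitting on ':' or ','.
show_fields() {
  printf '%s\n' "$1" |
    awk -F '[,:]' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

show_fields 'System:one,two,three'
# 1: System
# 2: one
# 3: two
# 4: three
```

Because both characters are treated alike, an input such as 'a,b:c' would yield the fields a, b and c, which is why the approach relies on no comma appearing before the colon.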
If, instead, you want to take everything up to the first colon (possibly including commas) as the first output field and split the remainder of each input line at every comma (assuming that no comma, not even an escaped one, is meant to be preserved as part of a substring), you can resort to shell features, as suggested in hymie's answer:
$ printf '%s\n' 'System:one,two,three' |
while IFS= read -r rem; do        # IFS= to preserve blank characters
  first=${rem%%:*}                # Remove from the first ':' on
  rem=${rem#"$first"}             # Remove first from the beginning of rem
  rem=${rem#:}                    # Strip the remaining ':' at rem's beginning
  while test "$rem"; do           # Exit when rem is empty
    second=${rem%%,*}             # Remove from the first ',' on
    rem=${rem#"$second"}          # Remove second from the beginning of rem
    rem=${rem#,}                  # Strip the remaining ',' at rem's beginning
    printf '%s\n' "$first,$second"
  done
done
Just make sure you understand the caveats of using shell loops to process text.
Alternatively, with GNU sed:
$ printf '%s\n' 'System:one,two,three' |
sed -n '
:l1
s/^[^:]*:\n//g;
t l2;
s/^\([^:]\{1,\}\):\([^,\n]\{1,\}\)*,\{0,1\}\(.*\)$/\1:\3\n\1,\2/;
t l1;
q;
:l2 p;
'
Here, branching (t) to a label (l1) is used to process each line of input with a loop. One at a time, the substrings between the first : and the first following , are appended as new lines to the pattern space, concatenated after the substring that comes before the first :. When there is no further substring to extract, what remains of the original string is removed, the pattern space is printed and the program exits.
(With GNU sed version >= 4.6 you can look at what is happening, step by step, by invoking it with the --debug option.)
Note that the use of \n inside bracket expressions to match (here, negate a match) a <newline> character is non-standard: POSIX states that <backslash> shall lose any special meaning in that context.
Is System a fixed string? Or may it be anything? Is the first column in your desired output intended to be any word before :? Is it correct to state that you are trying to repeat the word that sits before : on each output line, followed by just one of the words separated by commas in your input? Are those always comma-separated? – fra-san May 07 '19 at 18:28
Did you intend to change saaa to says? If it's a typo please fix it. – Chris Davies May 07 '19 at 19:55