csplit & paste
Use csplit to break one file into multiple files at a pattern. Then use paste to join the new files together.
awk 'NF' column.txt | csplit --suppress-matched -s -z -f INTERIM -n 4 - '/start newset/' '{*}' ; paste INTERIM* | expand -t 6,13 ; rm -f INTERIM*
The same code, reformatted for clarity:
awk 'NF' column.txt | \
csplit --suppress-matched -s -z -f INTERIM -n 4 - '/start newset/' '{*}' ;
paste INTERIM* | \
expand -t 6,13 ;
rm -f INTERIM*
Description:
awk 'NF' column.txt
    Remove empty lines. Otherwise, empty lines in the input file would place extra column separators in the output.
csplit
    --suppress-matched
        Don't include lines containing the splitting pattern in the output.
    -s
        Don't show summary information about the output files.
    -z
        Don't produce empty output files (i.e., when two adjacent lines of the input file contain the splitting pattern).
    -f INTERIM
        Filenames of the split files begin with this string.
    -n 4
        Filenames of the split files end with a number containing this many digits.
    -
        Take input from STDIN, since we're first running the input file through awk.
    '/start newset/'
        Split the input file at the first line containing this regular expression.
    '{*}'
        Keep splitting the input file on every additional line containing that regular expression.
paste INTERIM*
    Join the interim files.
expand -t 6,13
    Adjust the column spacing between the joined files (e.g., start the second file at column 6 and the third file at column 13).
rm -f INTERIM*
    Delete the interim files.
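To see what csplit produces before paste joins anything, the split step can be run on its own. This is only a sketch: it assumes GNU coreutils (for --suppress-matched), and the scratch directory `workdir` and the `summary` variable are names made up for the illustration.

```shell
#!/bin/sh
# Sketch: inspect the interim files that csplit creates.
# Assumes GNU coreutils; workdir and summary are hypothetical names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the answer.
printf '%s\n' '1 1.1' '2 4.0' '3 3.2' 'start newset' \
              '1 2.2' '2 6.1' '3 10.3' '4 2.1' 'start newset' \
              '1 18.2' '2 4.3' > column.txt

# Split only; do not join yet.
awk 'NF' column.txt |
  csplit --suppress-matched -s -z -f INTERIM -n 4 - '/start newset/' '{*}'

# One file per set, each named INTERIM plus a four-digit number.
summary=$(for f in INTERIM*; do
  printf '%s: %s lines\n' "$f" "$(awk 'END { print NR }' "$f")"
done)
printf '%s\n' "$summary"
# INTERIM0000: 3 lines
# INTERIM0001: 4 lines
# INTERIM0002: 2 lines

rm -f INTERIM*
```

Because the shell expands INTERIM* in sorted order, the zero-padded numbers keep the sets in their original order when paste reads them.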
Example input file column.txt:
1 1.1
2 4.0
3 3.2
start newset
1 2.2
2 6.1
3 10.3
4 2.1
start newset
1 18.2
2 4.3
Example output:
1 1.1 1 2.2  1 18.2
2 4.0 2 6.1  2 4.3
3 3.2 3 10.3
      4 2.1
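The last row of the output holds only a value from the second set, because paste keeps emitting the field separators for files that have already run out of lines. A minimal sketch of that behaviour, using two hypothetical files (short.txt and long.txt) and GNU cat -A to make the tabs visible:

```shell
#!/bin/sh
# Sketch: how paste handles files of unequal length.
# short.txt, long.txt, and workdir are hypothetical names; cat -A assumes GNU coreutils.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' a b   > short.txt
printf '%s\n' 1 2 3 > long.txt

# cat -A shows each tab as ^I and each end of line as $.
shown=$(paste short.txt long.txt | cat -A)
printf '%s\n' "$shown"
# a^I1$
# b^I2$
# ^I3$
```

The empty first field on the third line is what expand later turns into leading spaces, which keeps the remaining value under its own column.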
It's a little more complicated if the lines of the input file and the final output are indented.
Example input file column.txt:
  1 1.1
  2 4.0
  3 3.2
  start newset
  1 2.2
  2 6.1
  3 10.3
  4 2.1
  start newset
  1 18.2
  2 4.3
- Change awk 'NF' to awk 'NF { sub(/^ +/,"",$0) ; print $0 }' to remove the indentation before further processing.
- Change expand -t 6,13 to awk '{ print "  " $0 }' | expand -t 8,15 to indent the output.
Example output:
  1 1.1 1 2.2  1 18.2
  2 4.0 2 6.1  2 4.3
  3 3.2 3 10.3
        4 2.1
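Putting both changes together, the indented variant might look like the sketch below. It assumes GNU coreutils and a two-space indent; the indent width, the scratch directory `workdir`, and the `result` variable are assumptions made for the illustration, not part of the original command.

```shell
#!/bin/sh
# Sketch of the indented variant. Assumes GNU coreutils and a two-space indent.
# workdir and result are hypothetical names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Indented sample input: two leading spaces on every line (an assumption).
printf '  %s\n' '1 1.1' '2 4.0' '3 3.2' 'start newset' \
                '1 2.2' '2 6.1' '3 10.3' '4 2.1' 'start newset' \
                '1 18.2' '2 4.3' > column.txt

# Strip the indentation and drop empty lines, then split at each marker line.
awk 'NF { sub(/^ +/, "", $0); print $0 }' column.txt |
  csplit --suppress-matched -s -z -f INTERIM -n 4 - '/start newset/' '{*}'

# Join the sets, re-indent by two spaces, and shift the tab stops by two as well.
result=$(paste INTERIM* | awk '{ print "  " $0 }' | expand -t 8,15)
rm -f INTERIM*

printf '%s\n' "$result"
```

The tab stops move from 6,13 to 8,15 because every line is now two characters longer at the front.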
paste command result to a file and open with a text editor, that will display correctly – αғsнιη May 20 '18 at 12:01
paste tmp-*| column -s $'\t' -tn. The flagged question is exactly the answer to your question – αғsнιη May 20 '18 at 12:15