I've collected data with 3 fields. The 3rd field of each record should be on a single line, but in some records it is broken across two lines. This is the data I'm getting:

$ cat file
1234  1234  dei_1/3,dei_2/3,dei_9/0,
dei_10/0,dei_8/4
2345  2345  dei_8/9,dei_5/6,dei_4/9
4244  4244  dei_0/9,dei_4/6,dei_4/1
4235  4235  dei_0/9,dei_4/6,dei_4/,de
i_9/7,dei_1/3,dei_2/3,dei_9/0

Expected Result:

1234  1234  dei_1/3,dei_2/3,dei_9/0,dei_10/0,dei_8/4
2345  2345  dei_8/9,dei_5/6,dei_4/9
4244  4244  dei_0/9,dei_4/6,dei_4/1
4235  4235  dei_0/9,dei_4/6,dei_4/,dei_9/7,dei_1/3,dei_2/3,dei_9/0

Code I have so far:

while read file; do if [[ $file == 1 ]]; then echo -n; fi; done 
chaos
    Can you show how you collect and print the data? That's what needs to get fixed. – choroba May 29 '18 at 07:25
  • Similar, but not quite the same: https://unix.stackexchange.com/questions/429314/how-to-merge-lines-broken-by-newlines-inside-a-double-quoted-field – Kusalananda May 29 '18 at 09:29

3 Answers

The following script joins any line that doesn't start with two numbers to the previous line:

$ awk -v ORS="" '$1~/^[0-9]+$/ && $2~/^[0-9]+$/ && NR>1{printf "\n"}1' file
1234  1234  dei_1/3,dei_2/3,dei_9/0,dei_10/0,dei_8/4
2345  2345  dei_8/9,dei_5/6,dei_4/9
4244  4244  dei_0/9,dei_4/6,dei_4/1
4235  4235  dei_0/9,dei_4/6,dei_4/,dei_9/7,dei_1/3,dei_2/3,dei_9/0

This relies on ORS (the output record separator) being set to an empty string, so each line is printed without a trailing newline. A newline is printed only before a line whose first two fields are numbers (and only if it isn't the first line).
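
Because the newline is printed before each new record rather than after it, the output of the command above does not end with a newline. A minimal sketch of one way to handle that, assuming a trailing newline is wanted: the same rule with an added END block. The function name `join_records` is hypothetical and only wraps the command for illustration.

```shell
# Hypothetical wrapper around the awk command above, with an END rule added.
join_records() {
    awk -v ORS="" '
        # A line whose first two fields are numeric starts a new record,
        # so close the previous record first (except before line 1).
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && NR > 1 { printf "\n" }
        1                             # print the line; ORS is empty
        END { if (NR) printf "\n" }   # terminate the final record
    ' "$@"
}
```

It can be called as `join_records file`, or used at the end of a pipeline, since awk reads standard input when no file is given.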

oliv

Short sed approach:

sed -E 'N; s/\n([^[:space:]]*,[^[:space:]]+)/\1/' file

The output:

1234  1234  dei_1/3,dei_2/3,dei_9/0,dei_10/0,dei_8/4
2345  2345  dei_8/9,dei_5/6,dei_4/9
4244  4244  dei_0/9,dei_4/6,dei_4/1
4235  4235  dei_0/9,dei_4/6,dei_4/,dei_9/7,dei_1/3,dei_2/3,dei_9/0
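
Note that `N` makes sed read the input two lines at a time, so the command above joins a continuation only when it happens to be the second line of a pair, as it is in the sample data. A sketch of a variant that does not depend on pairing, under the assumption that a continuation line never starts with a digit (the function name `join_broken` is hypothetical):

```shell
# Hypothetical helper: append every line that does not start with a digit
# to the line before it.
join_broken() {
    # :a      loop label
    # $!N     append the next input line, unless this is the last line
    # s///    drop the newline when the appended line starts with a non-digit
    # ta      if a join happened, loop to pull in the following line too
    # P; D    print and delete the first line, then rerun on the remainder
    sed -e ':a' -e '$!N' -e 's/\n\([^0-9]\)/\1/' -e 'ta' -e 'P' -e 'D' "$@"
}
```

Used as `join_broken file`, this also copes with a record that is split more than once.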

A couple of awk approaches:

Store the most recent line that starts with a digit, and append the current line to it if the current line does not start with a digit:

awk '
    /^[[:digit:]]/ {if (prev) print prev; prev=$0; next} 
    {prev = prev $0} 
    END {if (prev) print prev}
' file

Reverse the file. If a line starts with a non-digit, save it, read the next line, and append the saved line to it. Then reverse the result. This assumes a record is split at most once:

tac file | awk '/^[^[:digit:]]/ {this = $0; getline; $0 = $0 this} 1' | tac
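
If a record might be split more than once, the same reversed-file idea can be adapted by collecting continuation lines until the record's first line shows up. This is only a sketch, assuming `tac` is available (it is part of GNU coreutils) and that continuation lines never start with a digit; `join_reversed` and `tail` are hypothetical names.

```shell
# Hypothetical helper: reversed-file approach that tolerates several splits.
join_reversed() {
    tac "$@" | awk '
        # In reversed order, continuation pieces arrive last-first,
        # so each new piece is put in front of those already collected.
        /^[^[:digit:]]/ { tail = $0 tail; next }
        # A line starting with a digit is the head of the record.
        { print $0 tail; tail = "" }
    ' | tac
}
```

It is called the same way as the pipeline above, for example `join_reversed file`.
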
glenn jackman