I have an input file like the one below, where some quoted NOTES fields contain embedded newlines:

ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next 
month 
visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"new visiting patient"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"transferred
back 
to 
hospital"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"

Below is my desired output, with each record on a single line:

ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next month visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"new visiting patient"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"transferred back to hospital"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"

Please help!

Chris Davies
s v

1 Answer

Using GNU awk for multi-char RS and RT:

$ awk -v RS='([^~]+~){4}[^~]+\n' '{print gensub(/[[:space:]]+/," ","g",RT)}' file
ID~NAME~CREATED_DATE~NOTES~LAST_MODIFIED_DATE
"12345"~"abc"~"9/7/2022 10:05:18 AM"~"new patiant"~"9/7/2022 11:52:18 AM"
"25451"~"bdc"~"11/7/2022 10:05:18 AM"~"next month visit"~"11/7/2022 10:05:18 AM"
"45522"~"xyz"~"1/8/2022 11:05:18 AM"~"new visiting patient"~"1/8/2022 11:05:18 AM"
"52447"~"pqr"~"5/5/2022 10:05:18 AM"~"transferred back to hospital"~"5/5/2022 10:05:18 AM"
"24541"~"rds"~"4/5/2022 05:05:18 AM"~"new patient"~"4/5/2022 05:05:18 AM"
Ed Morton
  • Hi Ed, thanks for the quick reply. I tried your code but got this error: awk: 0602-553 Function gensub is not defined. The input line number is 1. The source line number is 1. – s v Oct 05 '22 at 06:31
  • As I said, "Using GNU awk....". You aren't using GNU awk. – Ed Morton Oct 05 '22 at 12:04
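
For an awk without gensub, multi-char RS, or RT (the 0602-553 error above suggests a non-GNU awk), a rough portable sketch is below. It is not the method from the answer: instead of matching whole records with RS, it accumulates physical lines until the pending record contains four "~" separators, then collapses runs of spaces and tabs. It makes the same assumption as the answer's regex, namely that "~" never appears inside a field. The function name join_records is made up for illustration.

```shell
# join_records: hypothetical helper name, not part of any tool.
# Assumes 5 fields per record (4 "~" separators) and no "~" inside a field.
join_records() {
    awk '
    {
        # Append this physical line to the pending record, separated by a space.
        if (rec == "")
            rec = $0
        else
            rec = rec " " $0

        # Count the "~" separators on a copy so rec itself is not modified.
        tmp = rec
        n = gsub(/~/, "", tmp)

        # Four separators means the record is complete: tidy it and print it.
        if (n >= 4) {
            gsub(/[ \t]+/, " ", rec)   # collapse runs of spaces/tabs
            print rec
            rec = ""
        }
    }
    # Print any incomplete trailing record rather than silently dropping it.
    END { if (rec != "") print rec }
    ' "$@"
}
```

Called as join_records file > file.joined, this should print one line per record. It uses only gsub and string concatenation, which POSIX awk provides, but it has not been tried on the asker's system.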