Command for converting rows to CSV file

Question

I have a file in the following format, with a leading space before each non-empty line:

 "Western Overseas",
 "Western Overseas",
 "^",
 "--",
 "^",
 "--",
 "--",
 null,
 24995,
 9977,
 "CR",

 "Western Refrigeration Private Limited",
 "Western Refrigeration Private Limited",
 "[ICRA]A",
 "--",
 "[ICRA]A1",
 "--",
 "Stable",
 null,
 14951,
 2346,
 "CR",

I would like to convert it to a CSV file with format:

 "Western Overseas","Western Overseas","^","--","^","--","--",null,24995,9977,"CR"
 "Western Refrigeration Private Limited","Western Refrigeration Private Limited","[ICRA]A","--","[ICRA]A1","--","Stable",null,14951,2346,"CR"

I'm trying to use tr, but I'm having trouble: it either prints all output on one line or seems to replace newlines with a double newline. Any help is appreciated.

(1) Edited to reflect two lines instead of 3. (4) There are no leading spaces on the empty line, only at the beginning of lines with actual text. — user362513, Jul 16 '19 at 07:47
Please provide the first few lines of a hexdump from the actual file. Something like xxd test.txt | head -n 12. — Kamil Maciorowski, Jul 16 '19 at 08:41
00000010: 6173 222c 0a20 2257 6573 7465 726e 204f  as",. "Western O
00000020: 7665 7273 6561 7322 2c0a 2022 5e22 2c0a  verseas",. "^",.
— user362513, Jul 16 '19 at 08:48
awk '$1=$1' RS=',\n\n' infile if you don't mind last comma for last line. — αғsнιη, Jul 16 '19 at 13:17

score 5 · Answer 1 · answered Jul 16 '19 at 10:53

An awk solution is

awk '{if(NF){gsub(/^ |,$/,""); printf "%s%s", c, $0; c=","}else{printf "\n"; c=""}};END{printf "\n"}'

expanded with comments:

{
    if(NF) { # if the line isn't empty
        gsub(/^ |,$/,""); # remove the first space and last comma
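        printf "%s%s", c, $0; # print the separator and the line (without a newline); an explicit format keeps a % in the data from being read as a format specifier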
        c="," # set c to add a comma for the next field
    } else {
        printf "\n"; # empty line, output a newline
        c="" # don't print a comma for the next entry
    }
};
END {
    printf "\n" # finish off with a newline
}
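
To see what the awk program does, here is a minimal sketch that runs it on a small made-up input. The variable names (sample, result) and the sample data are illustrative assumptions, not taken from the question's file. It uses the explicit "%s%s" format so that a % in the data cannot be misread as a format specifier.

```shell
# Hypothetical sample: two records separated by an empty line,
# each field on its own line with a leading space and a trailing comma.
sample=$(printf ' "a",\n "b",\n 1,\n\n "c",\n null,\n 2,\n')

# Join the fields of each record with commas; an empty line ends a record.
result=$(printf '%s\n' "$sample" | awk '
{
    if (NF) {                      # non-empty line: one field
        gsub(/^ |,$/, "")          # drop the leading space and trailing comma
        printf "%s%s", c, $0       # separator (empty for the first field), then the field
        c = ","
    } else {                       # empty line: finish the current record
        printf "\n"
        c = ""
    }
}
END { printf "\n" }                # terminate the last record
')

printf '%s\n' "$result"
```

Unlike the output format requested in the question, the result has no leading space on each line, because the gsub removes it from every field.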

score 1 · Answer 2 · Kamil Maciorowski · 2019-07-16T09:08:23.367

<file sed '
   :start
   s/\n$//
   t
   s/\n //
   N
   b start
  ' | sed 's/,$//'

The first sed loops (:start, b start) and appends lines to its pattern space (N) until a newline at the very end of the pattern space is found and deleted (s/\n$//). That trailing newline means an empty line was just read, so sed leaves the loop (t) and prints the joined record. On each iteration, any other newline, together with the space that follows it, is removed to concatenate the lines (s/\n //).

The second sed removes trailing commas.

answered Jul 16 '19 at 08:02 · edited Jul 16 '19 at 09:08 · Kamil Maciorowski

This returns the same output as my input, without any commas. – user362513 Jul 16 '19 at 08:09
I used <test.txt where that is the file in question, and made no other changes to your command. – user362513 Jul 16 '19 at 08:29
I'm currently on MacOS, not quite sure what the sed differences might be between us. I repeated the test by copying the text as you did and have the same result. – user362513 Jul 16 '19 at 08:47
@user362513 My sed is GNU sed, I guess yours is not. Please try the current version of my code. Any difference? – Kamil Maciorowski Jul 16 '19 at 09:10
There appears to be a '^M' carriage return in my actual file. Do you have any idea how to get around that? – user362513 Jul 16 '19 at 09:54
@user362513 https://unix.stackexchange.com/a/32020/108618 – Kamil Maciorowski Jul 16 '19 at 10:42
@user362513 Oh, I deliberately asked for the output of xxd to rule out this possibility. You gave me one with proper UNIX line endings (0a) and no ^M (would be 0d). Strange. – Kamil Maciorowski Jul 16 '19 at 10:47
I was able to fix it by changing to ^M using your code. All works now! – user362513 Jul 16 '19 at 10:55
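
For files with the carriage returns mentioned in the comments above, one possible approach is to strip them with tr before joining the lines. This is a sketch, not the exact fix the asker applied; the sample data and variable names are made up for illustration, and it uses the awk approach from Answer 1 because that does not depend on which sed implementation is installed.

```shell
# Hypothetical sample with CRLF line endings (each line ends in \r\n).
crlf_sample=$(printf ' "a",\r\n "b",\r\n\r\n "c",\r\n 2,\r\n')

# tr -d '\r' deletes every carriage return, so the "empty" separator lines
# really are empty and the trailing commas really are at the end of the line.
result_crlf=$(printf '%s\n' "$crlf_sample" | tr -d '\r' | awk '
{
    if (NF) { gsub(/^ |,$/, ""); printf "%s%s", c, $0; c = "," }
    else    { printf "\n"; c = "" }
}
END { printf "\n" }
')

printf '%s\n' "$result_crlf"
```

Without the tr step, a line that looks empty would still contain a carriage return, so patterns anchored at the end of the line, such as ,$ or \n$, would not match as intended.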
