awk-Printing column value without new line and adding comma

Question

input.txt

 EN1
 EN2
 EN3
 EN4
 EN5

output

EN1,EN2,EN3,EN4,EN5

I have tried awk, but it is not printing the commas:

awk 'BEGIN { OFS = ","} { printf $1}' input.txt

I am using GNU Awk 4.0.0.

Strange, ORS should separate the line (records) with commas, it's hard to believe v 4.0 of gawk would change that .. but it looks that way based on your experience. — Levon, Aug 03 '12 at 22:05

Levon · Accepted Answer · 2012-08-03T21:57:30.457

8

awk 'BEGIN{ORS=","}1' input.txt

yields this:

EN1,EN2,EN3,EN4,EN5,

so it is printing with commas (I'm not sure I understand the comment in your post about this not happening), though I suspect the trailing comma is a problem.

Tested with GNU Awk 3.1.7

edited Aug 03 '12 at 21:57

answered Aug 03 '12 at 21:32

Levon


It is printing without comma eg. EN1EN2EN3EN4EN5 – jack Aug 03 '12 at 21:47
what version of awk are you using? This is odd, since I pasted this in from the console. – Levon Aug 03 '12 at 21:49
The question has used OFS, this answer uses ORS .. that explains the difference in behaviour... printf doesn't produce an ORS – Peter.O Aug 03 '12 at 23:58
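To make the distinction in the comment above concrete, here is a small sketch. The file name `demo.txt` and its contents are made up for illustration. It assumes standard awk behaviour: OFS is placed between the comma-separated arguments of print, ORS is appended after each print, and printf uses neither.

```shell
# Hypothetical sample file, created only for this demonstration.
printf 'EN1 a\nEN2 b\nEN3 c\n' > demo.txt

# OFS is inserted between the comma-separated arguments of print.
ofs_out=$(awk 'BEGIN { OFS = "," } { print $1, $2 }' demo.txt)

# ORS is appended after every print, so records are joined by commas
# (including a trailing comma after the last record).
ors_out=$(awk 'BEGIN { ORS = "," } { print $1 }' demo.txt)

# printf ignores both OFS and ORS, so the fields simply run together.
printf_out=$(awk 'BEGIN { OFS = ","; ORS = "," } { printf "%s", $1 }' demo.txt)

echo "$ofs_out"
echo "$ors_out"
echo "$printf_out"
rm -f demo.txt
```

This is why the command in the question prints EN1EN2EN3EN4EN5: it sets OFS but then calls printf with a single argument.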

score 5 · Answer 2 · answered Dec 09 '14 at 16:43

I know, old topic, but I could not resist - here's yet another short and simple way to do this:

$ paste -sd, input.txt
EN1,EN2,EN3,EN4,EN5
$

Works on Linux and Solaris, maybe even on other platforms.

Ralph Kirchner


score 4 · Answer 3 · edited Aug 03 '12 at 23:02

You can use tr in this situation.

tr '\n' ',' <input.txt

This replaces the final newline with a comma as well. To avoid this on Linux, if you know that the input file ends with a newline:

<input.txt head -c -1 | tr '\n' ,

Add ; echo if you want the output to be terminated by a newline.
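As a rough sketch of the two steps combined (the file names `demo.txt` and `joined.txt` are made up for illustration, and a negative count for `head -c` is assumed to be available, as in GNU coreutils):

```shell
# Hypothetical sample file, created only for this demonstration.
printf 'EN1\nEN2\nEN3\n' > demo.txt

# head -c -1 drops the final byte, here the last newline, so tr has no
# trailing newline to turn into a comma. The echo then terminates the
# joined line with a single newline.
{ head -c -1 demo.txt | tr '\n' ','; echo; } > joined.txt

joined_line=$(cat joined.txt)        # the joined text
newline_count=$(wc -l < joined.txt)  # number of newline characters written
echo "$joined_line"
rm -f demo.txt joined.txt
```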

Alternatively, you can get the shell to remove a trailing comma if there is one.

columns=$(<input.txt tr '\n' ',')
echo "${columns%,}"

Gilles 'SO- stop being evil'

answered Aug 03 '12 at 21:55

rush

Oddly enough, this gives me an "extra operand" error message, but tr '\n' ',' < input.txt works fine, but also with a trailing comma. – Levon Aug 03 '12 at 21:59
@Levon See my edit – Gilles 'SO- stop being evil' Aug 03 '12 at 23:02
@Gilles Neat .. works nice (don't come often across a command that starts with <) – Levon Aug 07 '12 at 15:09

cas · Answer 4 · 2012-08-04T01:27:40.397

There's also xargs and sed:

$ xargs <input.txt | sed -e 's/ /,/g'
EN1,EN2,EN3,EN4,EN5

An advantage here is that there is no trailing comma to get rid of.

xargs combines the input lines, and sed replaces all spaces with commas. I use this routinely to construct regular expressions (replace spaces with |) and quick sums to pipe into bc (replace spaces with +).

(FYI xargs defaults to echo as the command if none is provided)
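The two routine uses mentioned above might look roughly like this. The file names and sample data are made up for illustration, and the sum is evaluated with shell arithmetic instead of bc so that the sketch has no extra dependency.

```shell
# Hypothetical sample data, created only for this demonstration.
printf '10\n20\n30\n' > numbers.txt
printf 'foo\nbar\nbaz\n' > words.txt

# Join the numbers with "+" and evaluate the resulting expression.
expression=$(xargs < numbers.txt | sed -e 's/ /+/g')
total=$(( expression ))

# Join the words with "|" to build an extended regular expression.
pattern=$(xargs < words.txt | sed -e 's/ /|/g')
matches=$(printf 'a foo\nno match\nbaz b\n' | grep -E "$pattern")

echo "$expression = $total"
echo "$pattern"
echo "$matches"
rm -f numbers.txt words.txt
```

Piping `$expression` into bc instead of using `$(( ))` gives the same total and also handles non-integer values.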

NOTE: This only works if the input file is as described (one field per line, no spaces). If there are more fields and/or spaces in the input you can use awk or sed to pre-process the input. For example, with input like this:

EN1 foo bar
EN2 bar foo
EN3 baz quux
EN4 abc def
EN5 hij klm

Here awk is used to extract only the first field:

$ awk '{print $1}' input.txt | xargs | sed -e 's/ /,/g'
EN1,EN2,EN3,EN4,EN5

In this second (sed) example, spaces in the original input are replaced with some other string (chosen as unlikely to be in the original input), then fed into xargs. sed then replaces the spaces added by xargs, and then restores the strings from the input:

$ sed -e 's/ /--space--/g' input.txt | xargs | sed -e 's/ /,/g' -e 's/--space--/ /g'
EN1 foo bar,EN2 bar foo,EN3 baz quux,EN4 abc def,EN5 hij klm

Now for some gratuitous op-ed commentary:

One of the most useful pieces of knowledge about unix text processing tools is that you can and should think of data as being almost infinitely malleable - you can transform it into whatever form you need either to provide input to another process or to produce the output you want or both.

This is part of the reason why unix people tend to hate proprietary data formats - it's not just a philosophical disapproval or a wish to avoid vendor lock-in, it's also the very pragmatic fact that they make it difficult for us to manipulate and use our data in ways that weren't foreseen by the software's developers.

It won't be a problem in most cases, and it is certainly very useful for what you have mentioned, but it does output a trailing \n, and if xargs needs to invoke echo multiple times due to command-line args limits, sed will introduce more spurious \ns; one for each extra call ...(+1 BTW) — Peter.O, Aug 04 '12 at 19:58
PS.. I just noticed that xargs echo -n avoids the \n issue... but sed can hit a memory limit (it did in my rather large scale tests), so it should be fine if you aren't dealing with gigabyte+ long command lines ;) — Peter.O, Aug 04 '12 at 20:20
yeah, well, xargs will split them long before command lines get to a gigabyte :). I've occasionally run into problems when generating huge command lines from find /really/stupidly/long/path/.../ | xargs but that's more an issue with the command i'm feeding it into than with xargs (e.g. 'du -sh' may generate multiple total lines which then need to be added), and is solvable with a suitable wrapper script or post-processing. — cas, Aug 05 '12 at 10:06

score 2 · Answer 5 · answered Mar 12 '16 at 05:40

A solution that doesn't print a comma at the end of the line:

awk '{printf("%s", NR == 1 ? $0 : ","$0);} END {printf("\n");}' file

Explanation

When the first line is seen (NR == 1), only the line itself is printed; otherwise a comma followed by the line is passed to printf.

This solution uses AWK's ternary operator ?:, that is:

NR == 1 ? $0 : ","$0

If the NR variable is equal to 1, then it sends the first line as the argument to printf; else it sends a comma concatenated with the current line.
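For comparison, here is a sketch of the ternary approach next to a common variation that keeps the separator in a variable. The variable name `sep` and the file `demo.txt` are made up for illustration, and both commands print `$1` rather than `$0` so that only the first column is joined.

```shell
# Hypothetical sample file, created only for this demonstration.
printf 'EN1 foo\nEN2 bar\nEN3 baz\n' > demo.txt

# Ternary form: every field except the first is prefixed with a comma.
ternary_out=$(awk '{ printf "%s", (NR == 1 ? $1 : "," $1) } END { printf "\n" }' demo.txt)

# Separator-variable form: sep is empty for the first record and "," afterwards.
sep_out=$(awk '{ printf "%s%s", sep, $1; sep = "," } END { printf "\n" }' demo.txt)

echo "$ternary_out"
echo "$sep_out"
rm -f demo.txt
```

Both forms avoid the trailing comma because the comma is written before a field, never after it.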

Peter.O · Answer 6 · 2012-08-04T19:37:24.647

1

perl -pe '(eof)?s/\s+$//:s/\s+$/,/' input.txt

output: no trailing \n

EN1,EN2,EN3,EN4,EN5

edited Aug 04 '12 at 19:37

answered Aug 04 '12 at 00:13

Peter.O

score 0 · Answer 7 · answered Apr 21 '19 at 23:05

sed -z 's/\n/,/g' input.txt

The -z option (GNU sed 4.2.2 or later) makes sed use the zero byte as the end-of-record character instead of the newline, so this whole input file is treated as a single line. The command searches for \n (the newline character) and replaces it with a comma. The g flag makes the search-and-replace global.

It converts the final newline character to a comma too, but it's easy to convert the final comma back:

sed -z 's/\n/,/g' input.txt | sed 's/,$/\n/'

The $ symbol marks the end of the line, so it replaces the final comma with a newline character.

Note that if your input file does contain any zero-bytes (often used in binary files to terminate strings) then this will see them as end of record markers. The example above shouldn't be affected, but in some situations it can give unexpected results.
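Because -z makes the whole file a single record, both substitutions can also be done in one sed invocation. This is a sketch assuming GNU sed and an input without zero bytes; `demo.txt` is a made-up file name.

```shell
# Hypothetical sample file, created only for this demonstration.
printf 'EN1\nEN2\nEN3\n' > demo.txt

# First turn every newline into a comma, then turn the final comma
# (at the end of the single record) back into a newline.
joined=$(sed -z 's/\n/,/g; s/,$/\n/' demo.txt)

echo "$joined"
rm -f demo.txt
```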
