
I want to take the first 500 characters of the first column and leave the rest of the columns as they are, using awk.

This is what I am trying to do:

awk '{print ${$1:1:500} $2,$NF;}' doc.txt

But I get a syntax error:

awk: {print ${$1:1:500} $2,$NF;}
awk:         ^ syntax error
awk: {print ${$1:1:500} $2,$NF;}
awk:                           ^ syntax error
Raj

2 Answers


You're trying to use (invalid) bash substring-expansion syntax inside awk, but awk isn't bash. Here's one way to do what you want:

awk '{ $1 = substr($1, 1, 500) } 1'

The trailing 1 is a pattern that is always true, so awk applies its default action and prints the line; substr() is the call that actually extracts the substring. From the documentation:

substr(string, start, length)

This returns a length-character-long substring of string, starting at character number start. The first character of a string is character number one.

For example, substr("washington", 5, 3) returns "ing". If length is not present, this function returns the whole suffix of string that begins at character number start. For example, substr("washington", 5) returns "ington". This is also the case if length is greater than the number of characters remaining in the string, counting from character number start.
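
The three cases described in that excerpt can be checked directly from a shell. This is only a small sketch; the function name substr_examples is made up for the illustration and is not part of awk:

```shell
# substr_examples is a hypothetical wrapper, used only for this demo.
substr_examples() {
    awk 'BEGIN {
        print substr("washington", 5, 3)    # 3 characters from position 5
        print substr("washington", 5)       # no length: the rest of the string
        print substr("washington", 5, 100)  # length past the end: also the rest
    }'
}

substr_examples
# prints:
# ing
# ington
# ington
```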

If you need to keep the field separator, set OFS appropriately.
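
As a rough sketch of what "appropriately" could look like for tab-separated input: set the input separator with -F and copy it into OFS before any record is read. The helper name truncate_tsv and the sample line are assumptions made up for this example:

```shell
# truncate_tsv is a hypothetical helper: it keeps the first N characters of
# the first tab-separated field and leaves the other fields untouched.
truncate_tsv() {
    # -F sets the input separator; OFS = FS makes awk rejoin the fields
    # with a tab instead of the default single space.
    awk -F '\t' -v n="$1" 'BEGIN { OFS = FS } { $1 = substr($1, 1, n) } 1'
}

# Made-up sample line with three tab-separated fields.
printf 'abcdefgh\tsecond field\tthird\n' | truncate_tsv 3
```

Assigning to $1 forces awk to rebuild the record, and the rebuilt record is joined with OFS, which is why OFS has to be set before the first line is processed.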

Chris Down
  • Thanks. How can I retain my original delimiter, which is a tab character? Currently awk is using a space as the delimiter. – Raj Dec 30 '13 at 08:04
  • Got it: awk -F $'\t' 'BEGIN {OFS = FS} { $1 = substr($1, 1, 500) } 1' doc.txt – Raj Dec 30 '13 at 08:07

Here you go:

awk '{$1 = substr($1, 1, 500)} 1'

Unfortunately, this has the drawback of collapsing the whitespace between the fields. For example, consider this:

$ ls -l | awk '{print}'
total 88
-rw-r--r-- 1 jack jack     8 Jun 19  2013 qunit-1.11.0.css
-rw-r--r-- 1 jack jack 56908 Jun 19  2013 qunit-1.11.0.js
-rw-r--r-- 1 jack jack  4306 Dec 29 09:16 test1.html
-rw-r--r-- 1 jack jack  5476 Dec  7 08:09 test1.js

If I use my answer to keep only the first 3 characters of the first field, I get:

$ ls -l | awk '{$1 = substr($1, 1, 3)} 1'
tot 88
-rw 1 jack jack 8 Jun 19 2013 qunit-1.11.0.css
-rw 1 jack jack 56908 Jun 19 2013 qunit-1.11.0.js
-rw 1 jack jack 4306 Dec 29 09:16 test1.html
-rw 1 jack jack 5476 Dec 7 08:09 test1.js

The original whitespace between all fields is replaced with a single space, because assigning to $1 makes awk rebuild the line using the output field separator. If that's a problem for you, you can try something like this:

$ ls -l | awk '{$0 = substr($1, 1, 3) substr($0, length($1) + 1)} 1'
tot 88
-rw 1 jack jack     8 Jun 19  2013 qunit-1.11.0.css
-rw 1 jack jack 56908 Jun 19  2013 qunit-1.11.0.js
-rw 1 jack jack  4306 Dec 29 09:16 test1.html
-rw 1 jack jack  5476 Dec  7 08:09 test1.js
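
One caveat: substr($0, length($1) + 1) assumes the line starts with the first field. If a line might begin with blanks, a possible variation is to locate the field with index() first. This is only a sketch; the function name keep_prefix and the sample line are made up for the illustration:

```shell
# keep_prefix is a hypothetical helper: it keeps the first N characters of
# the first field and preserves all other whitespace, leading blanks included.
keep_prefix() {
    awk -v n="$1" '{
        start = index($0, $1)    # position where the first field begins
        $0 = substr($0, 1, start - 1) substr($1, 1, n) substr($0, start + length($1))
    } 1'
}

# Made-up sample line with two leading spaces and uneven spacing.
printf '  -rw-r--r--  1 jack     8\n' | keep_prefix 3
```
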
janos
  • 11,341