
I have a file like this:

ID  A56
DS  /A56
DS  AGE 56

And I'd like to print the whole line only if the second column starts with a capital letter.

Expected output:

ID  A56
DS  AGE 56

What I've tried so far:
awk '$2 ~ /[A-Z]/ {print $0}' file
This prints every line, because a capital letter is found somewhere within the second column of each one.

awk '$2 /[A-Z]/' file
Gets a syntax error.

dovah

2 Answers


You must anchor the regex with ^ so that it only matches at the start of the string:

$ awk '$2 ~ /^[[:upper:]]/' file
ID  A56
DS  AGE 56
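
To see why the anchor matters, here is a minimal sketch that recreates the sample file from the question in a temporary location and runs the unanchored and anchored patterns side by side. The variable names (`sample`, `unanchored`, `anchored`) are just for illustration, and it assumes an awk that understands POSIX character classes such as `[[:upper:]]`.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'ID  A56' 'DS  /A56' 'DS  AGE 56' > "$sample"

# Unanchored: an uppercase letter anywhere in field 2 is enough, so "/A56" passes.
unanchored=$(awk '$2 ~ /[[:upper:]]/' "$sample")

# Anchored: ^ requires the uppercase letter to be the first character of field 2.
anchored=$(awk '$2 ~ /^[[:upper:]]/' "$sample")

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
rm -f "$sample"
```

The unanchored run keeps the `DS  /A56` line, while the anchored one drops it.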
cuonglm

You could use awk as @cuonglm suggested, or

  1. GNU grep

    grep -P '^[^\s]+\s+[A-Z]' file 
    
  2. Perl

    perl -lane 'print if $F[1]=~/^[A-Z]/' file
    
  3. GNU sed

    sed -rn '/^\S+\s+[A-Z]/p' file
    
  4. shell (assumes a recent version of ksh93, zsh or bash)

    while read -r a b; do 
        [[ $b =~ ^[A-Z] ]] && printf "%s %s\n" "$a" "$b"; 
    done < file 
    
cuonglm
terdon
  • That assumes GNU grep, GNU sed and for the last one, recent versions of ksh93 zsh or bash and that the file doesn't contain backslash characters. Except for the perl one what [A-Z] matches depends on the locale and doesn't make much sense except in the C locale. – Stéphane Chazelas Jul 25 '14 at 11:49
  • @StéphaneChazelas so -P is a GNU extension? OK. Why does the [A-Z] not make sense? Presumably, the OP would want whatever is defined as a capital letter in their locale right? I added -r for the backslashes. – terdon Jul 25 '14 at 11:56
  • Backslashes are still an issue with Unix-conformant echos. Yes, -P is a GNU extension, though it's found as well in some BSDs that have forked or rewritten their GNU grep. See there for [A-Z]. Also note that \s is [[:space:]], not [[:blank:]]. – Stéphane Chazelas Jul 25 '14 at 12:03
  • @StéphaneChazelas I see, thanks. I switched to printf then, just in case. – terdon Jul 25 '14 at 12:09
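
Following up on the portability points raised in these comments, a version that sticks to POSIX features might look like the sketch below. It assumes the columns are separated by spaces or tabs (`[[:blank:]]`) and uses `[[:upper:]]` so the result follows the locale's definition of an uppercase letter. The temporary file and the variable names are illustrative only.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'ID  A56' 'DS  /A56' 'DS  AGE 56' > "$sample"

# POSIX ERE: first field (non-blanks), then blanks, then an uppercase letter.
with_grep=$(grep -E '^[^[:blank:]]+[[:blank:]]+[[:upper:]]' "$sample")

# Same pattern for POSIX sed, written as a BRE with \{1,\} instead of GNU -r and +.
with_sed=$(sed -n '/^[^[:blank:]]\{1,\}[[:blank:]]\{1,\}[[:upper:]]/p' "$sample")

printf 'grep:\n%s\n\nsed:\n%s\n' "$with_grep" "$with_sed"
rm -f "$sample"
```

Unlike the awk version, both patterns expect the first column to start at the beginning of the line, so lines with leading blanks would not match.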