I would like to extract the last and first columns of the following file, which I will call filename.

A   0.400       1.0  1.0      3.0     0.0     1.00  2.0     5.0     1.0   1.00   0.0   0.0  gs100_bs050_rcrs100_rarcinf_core_0400mpc3_df
B   0.400       1.0  1.0      3.0     0.0     0.25  2.0     5.0     1.0   0.25   0.0   1.0  gs100_bs050_rcrs025_rarc100_core_0400mpc3_df
C   0.03021516  4.0  1.0      4.0     0.0     1.75  2.0     5.0     1.0   1.75  -0.5  -0.5  data_c_rh4_rs175_gs10_ra0_b05n_10k

I tried awk '{print $NF, " ", $1}' filename, expecting to obtain

gs100_bs050_rcrs100_rarcinf_core_0400mpc3_df A
gs100_bs050_rcrs025_rarc100_core_0400mpc3_df B
data_c_rh4_rs175_gs10_ra0_b05n_10k C

But I obtained instead

A0_bs050_rcrs100_rarcinf_core_0400mpc3_df
B0_bs050_rcrs025_rarc100_core_0400mpc3_df
C_c_rh4_rs175_gs10_ra0_b05n_10k

on Mac (gawk v 5.0.0) and the same with 3 extra spaces at the beginning of each line on Linux/Ubuntu (mawk 1.3.3). I also tried awk '{printf("%s %s\n", $NF, $1)}' filename, but got

 A100_bs050_rcrs100_rarcinf_core_0400mpc3_df
 B100_bs050_rcrs025_rarc100_core_0400mpc3_df
 Cta_c_rh4_rs175_gs10_ra0_b05n_10k

on Mac. Since the longest string in the last column is 45 characters long, I then tried
awk '{printf("%-50s %s\n", $NF, $1)}' filename, but this returned

     As050_rcrs100_rarcinf_core_0400mpc3_df
      Bs050_rcrs025_rarc100_core_0400mpc3_df
                Cgs10_ra0_b05n_10k

My example fits in the awk limits of 1024 characters per field and 100 fields. Could this be a bug in awk?

  • Your file is a DOS text file, with carriage-return characters before each newline (i.e. part of the $NF value). Outputting a carriage-return returns the cursor to the start of the line. Convert your data to Unix text with dos2unix. – Kusalananda Aug 30 '20 at 17:01
  • Interesting comment! I had no idea that the file I was working on was in DOS format, as I was not its creator. Indeed, file filename returns ASCII text, with CRLF line terminators, while running file on the output of dos2unix returns plain ASCII text. When I viewed filename with emacs I saw nothing special. So dos2unix saved the day. Or simply: tr '\r' ' ' < filename > newfilename. Thank you for solving the issue! – gammarayon Aug 30 '20 at 20:51
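
As a minimal sketch of the diagnosis in the comments, the carriage return can also be stripped inside awk itself, so the file does not need converting first. The file name sample_crlf.txt and its shortened contents are made up for illustration; only the CRLF line endings mirror the real file.

```shell
# Build a small sample with DOS (CRLF) line endings.
# sample_crlf.txt and its contents are hypothetical stand-ins for filename.
printf 'A 0.400 gs100_df\r\nB 0.400 gs025_df\r\n' > sample_crlf.txt

# Remove the trailing carriage return from each record before printing.
# Modifying $0 makes awk re-split the fields, so $NF no longer ends in \r.
result=$(awk '{ sub(/\r$/, ""); print $NF, $1 }' sample_crlf.txt)
printf '%s\n' "$result"
```

With the carriage return gone, the first column is no longer printed over the start of the last one. Running tr -d '\r' < filename before awk should have the same effect.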