sed: extracting data from the selected column

Question

I have a log file arranged in the following format:

# This file was created Thu Dec 17 16:01:26 2020
# Created by:
#                      :-) GROMACS - gmx gyrate, 2019.3 (-:
# 
# Executable:   /usr/local/bin/../Cellar/gromacs/2019.3/bin/gmx
# Data prefix:  /usr/local/bin/../Cellar/gromacs/2019.3
# Working dir:  /Users/gleb/Desktop/DO/unity_or_separation
# Command line:
#   gmx gyrate -f /Users/gleb/Desktop/DO/unity_or_separation/storage/7000_cne_lig177/1AllBoxes_7000_cne_lig177.xtc -s /Users/gleb/Desktop/DO/unity_or_separation/storage/7000_cne_lig177/lig_1AllBoxes_7000_cne_lig177.pdb -o /Users/gleb/Desktop/DO/unity_or_separation/storage/7000_cne_lig177/RG/RG_1AllBoxes_7000_cne_lig177.xvg
# gmx gyrate is part of G R O M A C S:
#
# God Rules Over Mankind, Animals, Cosmos and Such
#
@    title "Radius of gyration (total and around axes)"
@    xaxis  label "Time (ps)"
@    yaxis  label "Rg (nm)"
@TYPE xy
@ view 0.15, 0.15, 0.75, 0.85
@ legend on
@ legend box on
@ legend loctype view
@ legend 0.78, 0.8
@ legend length 2
@ s0 legend "Rg"
@ s1 legend "Rg\sX\N"
@ s2 legend "Rg\sY\N"
@ s3 legend "Rg\sZ\N"
         1    0.535827    0.476343    0.375777    0.453993
         2    0.509863    0.450424    0.333084    0.453975
         3     0.51779    0.374447     0.44955    0.440349
         4    0.535215    0.392331    0.442183    0.472716
         5    0.542371    0.468222    0.383178     0.47146
         6     0.49479    0.340223     0.42002     0.44437
         7    0.495905    0.370873    0.445952    0.394239
         8    0.518463    0.424257    0.400878    0.443746

From this data I need to omit all comment lines (those starting with # or @), take only the second column from the multi-column table at the bottom, and multiply the values by 10:

#this is a second column after conversion
5.4
5.1
5.2
5.4
5.4
4.9
5.0
5.2

I can do it by combining sed + awk:

sed -i '' -e '/^[#@]/d' "${storage}"/"${experiment}"/RG/RG_${pdb_name}.xvg
awk '-F ' '{ printf("%.1f\n", $2*10) }' "${storage}"/"${experiment}"/RG/RG_${pdb_name}.xvg > "${storage}"/"${experiment}"/RG/RG_${pdb_name}..xvg

Is it possible to do all of these steps using only sed (the first command), thus avoiding the creation of a new file (as results from the awk step)?

Please always mention your operating system. The specific implementations of tools like sed and awk differ across systems and we need to know what you are using to answer you well. — terdon, Dec 17 '20 at 16:02

Quasímodo · Accepted Answer · 2020-12-19T11:27:27.330


Sed is not made for arithmetic. You could try clumsy workarounds, but Awk is better in that regard:

awk '!/^[#@]/{printf("%.1f\n",$2*10)}' file

With GNU Awk, add -i inplace to edit the file in place. If you don't have GNU Awk, you can use sponge:

awk '!/^[#@]/{printf("%.1f\n",$2*10)}' file | sponge file

or use good old overwriting (it's what happens under the hood anyway):

awk '!/^[#@]/{printf("%.1f\n",$2*10)}' file > newfile &&
mv newfile file
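To check the filter before touching a real file, here is a minimal sketch that runs the same awk program on a two-row sample in the layout from the question. The function name rg_column and the temporary sample file are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): prints column 2
# multiplied by 10, with one decimal, skipping lines that start with # or @.
rg_column() {
    awk '!/^[#@]/ { printf("%.1f\n", $2 * 10) }' "$1"
}

# Small sample in the same layout as the .xvg file from the question.
sample=$(mktemp "${TMPDIR:-/tmp}/rg.XXXXXX") || exit 1
cat > "$sample" <<'EOF'
# gmx gyrate is part of G R O M A C S:
@    title "Radius of gyration (total and around axes)"
         1    0.535827    0.476343    0.375777    0.453993
         2    0.509863    0.450424    0.333084    0.453975
EOF

result=$(rg_column "$sample")
printf '%s\n' "$result"   # prints 5.4 and 5.1, one per line
rm -f "$sample"
```

The leading spaces in the data rows do not matter, because awk's default field splitting ignores leading blanks.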

edited Dec 19 '20 at 11:27

answered Dec 17 '20 at 15:55


So in my case, the combination of sed + awk could be replaced by awk -i inplace '!/^[#@]/{printf("%.1f\n",$2*10)}' file.xvg, assuming that I would like to edit the existing file and not create a new one? – Hot JAMS Dec 17 '20 at 16:01
Yes, with GNU Awk. If not with GNU awk, you could always redirect to a new file and then overwrite the old file by the new one (it is the standard procedure). Or use sponge from moreutils package. – Quasímodo Dec 17 '20 at 16:03
I've just tried it and it works very well! Could you please explain the difference between GNU and non-GNU Awk? I am using macOS, writing scripts in bash, and I always use commands like sed -i '' -e '/^[#@]/d' to edit files in place. Thanks again – Hot JAMS Dec 17 '20 at 16:29
@HotJAMS: macOS doesn't ship with GNU awk; it ships with an old-ish BSD version of awk. Apple's latest upgrade for awk was in Oct, 2007 according to this document. OTOH, GNU awk is currently maintained. The good news is you don't have to use Apple's antique awk - MacPorts has a recent gawk available. You can then wonder why the world's largest corporation ships 13-year-old software with their pricey computers. – Seamus Dec 17 '20 at 23:12
@HotJAMS: Oh - I failed to mention the documentation - very important stuff for users! I learn best by example, and GNU awk has documentation to support learn-by-example. Apple has man awk - not overly helpful. Finally - see this – Seamus Dec 17 '20 at 23:33
Yep, I've just checked: there is a gawk on my Mac as well and, as I already said, awk -i inplace '!/^[#@]/{printf("%.1f\n",$2*10)}' also works very well on the same machine! – Hot JAMS Dec 18 '20 at 09:15
@HotJAMS One of the differences is exactly that: -i inplace is only available in GNU Awk (as far as I know). GNU Awk has many more extensions that will only be found there. The basic operations described by POSIX must behave the same in all Awks, so awk '!/^[#@]/{printf("%.1f\n",$2*10)}' file has to work with any Awk. – Quasímodo Dec 18 '20 at 10:49
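
For a script that has to run on machines with and without GNU Awk, the two approaches from this thread can be combined. The following is only a sketch under stated assumptions: rg_inplace is a hypothetical name, it assumes that GNU Awk, when installed, is reachable as gawk and is recent enough to provide the inplace extension (4.1 or later), and it falls back to the temporary-file approach otherwise.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): rewrites "$1" in place.
rg_inplace() {
    file=$1
    prog='!/^[#@]/ { printf("%.1f\n", $2 * 10) }'

    # Try GNU Awk's inplace extension first (assumes gawk 4.1 or newer).
    if command -v gawk >/dev/null 2>&1 &&
       gawk -i inplace "$prog" "$file" 2>/dev/null; then
        return 0
    fi

    # Portable fallback: write to a temporary file, then replace the original.
    tmp=$(mktemp "${TMPDIR:-/tmp}/rg.XXXXXX") || return 1
    if awk "$prog" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demonstration on a throwaway file with two comment lines and two data rows.
demo=$(mktemp "${TMPDIR:-/tmp}/rg.XXXXXX") || exit 1
printf '%s\n' '# comment' '@ legend on' \
    '   3     0.51779    0.374447' \
    '   6     0.49479    0.340223' > "$demo"
rg_inplace "$demo"
edited=$(cat "$demo")
printf '%s\n' "$edited"   # prints 5.2 and 4.9, one per line
rm -f "$demo"
```

Note that the fallback replaces the file with the one created by mktemp, so the original file's permissions are not carried over.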
