5

This follows on from this question: copy and replace column using unix.

I have tried to build a solution that works from just this file:

20070101 10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
20070102 10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
20070103 10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
20070104 10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

And get the output:

01/01/2007  10.2317   79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007  10.2317   79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007  10.2317   79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007  10.2317   79.1638   8.1  26.0  19.6   0.0  15.5  94.1

Without other files involved.

So I want to replace the first column with the converted date.

I got the date from the original file with:

$ awk '{print $1}' filedate.txt
20070101
20070102
20070103
20070104

Then I did the date conversion with:

for i in $(awk '{print $1}' filedate.txt); do date -d "$i"  +%d/%m/%Y; done
01/01/2007
02/01/2007
03/01/2007
04/01/2007

But I haven't been able to replace the first column of the file with the already-converted date values. I tried an awk substitution (awk '{$1=$dt}1'):

for i in $(awk '{print $1}' filedate.txt); do dt=$(date -d "$i" +%d/%m/%Y) && awk '{$1=$dt}1' filedate.txt; done

But as there is a loop involved, the output is not the desired one.
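
As far as I can tell, part of the problem is that $dt sits inside the single-quoted awk program, so the shell never expands it; passing the value in with -v at least makes it visible to awk, although the whole file is still printed once per iteration of the loop:

for i in $(awk '{print $1}' filedate.txt); do
    dt=$(date -d "$i" +%d/%m/%Y)
    awk -v dt="$dt" '{ $1 = dt } 1' filedate.txt
done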

How can I achieve this with awk? Is it possible to do the same with sed?

EDIT

In the comments on the other question I saw the following way to do it with sed:

 sed 's,^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\),\3/\2/\1,'

But I'm still curious about how to do it using the date command.

6 Answers

2

If the transformation you want is simply a reordering of the existing information, why not do

awk '{ $1=sprintf("%02i/%02i/%04i",
     substr($1, 7, 2), substr($1, 5, 2), substr($1, 1, 4)) }1' file

We are lifting substrings from the first field and reassembling them into a new value for the first field, then printing the entire input line as usual. (The lone 1 after the closing brace is a standard Awk idiom for unconditional printing.)
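
One side effect of assigning to $1 is that Awk rebuilds the record with OFS (a single space by default), so the original column padding is collapsed. If you want to keep the spacing of the remaining fields, a sketch using printf instead of reassigning the field (assuming the lines have no leading whitespace) could be:

awk '{ printf "%s/%s/%s%s\n",
     substr($1, 7, 2), substr($1, 5, 2), substr($1, 1, 4),
     substr($0, length($1) + 1) }' file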

You'll probably regret converting perfectly good machine-readable dates to "human readable" very soon, though.

2

Let's use your already existing loop (which I haven't really looked at but which seems to do its job):

for i in $(awk '{print $1}' filedate.txt); do date -d "$i"  +%d/%m/%Y; done

And then tuck my answer to that other question onto it, with slight modification:

for i in $(awk '{print $1}' filedate.txt); do date -d "$i"  +%d/%m/%Y; done |
paste - <( cut -d ' ' -f 2- filedate.txt )

Result:

01/01/2007      10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007      10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007      10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007      10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

Shorter, without the loop:

date -f <( cut -d ' ' -f 1 filedate.txt ) +"%d/%m/%Y" |
paste - <( cut -d ' ' -f 2- filedate.txt )

Without the pipe:

paste <( date -f <( cut -d ' ' -f 1 filedate.txt ) +"%d/%m/%Y" ) \
      <( cut -d ' ' -f 2- filedate.txt )

All of these examples obviously require bash, ksh, or any other shell that understands process substitution. GNU date is also required.

See my answer to that other question for an explanation of how this works.
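
If you'd rather not have the tab that paste inserts by default between the date and the rest of the line, paste's -d option should do it, for example:

date -f <( cut -d ' ' -f 1 filedate.txt ) +"%d/%m/%Y" |
paste -d ' ' - <( cut -d ' ' -f 2- filedate.txt )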

Kusalananda
  • 333,661
2

If you have GNU awk (gawk) you can replace a column with command output, using the cmd | getline var form of getline to shell out to the date command:

gawk '{"date +%d/%m/%Y -d" $1 | getline $1} 1' file

However, if you just want to change the date format of a column, you can do that natively using the internal mktime and strftime functions:

gawk '{
  d = sprintf("%d %02d %02d 0 0 0", substr($1,1,4), substr($1,5,2), substr($1,7,2));
  t = mktime(d);
  $1 = strftime("%d/%m/%Y", t);
  } 1' file

although in this case you could do the required conversion using simple string manipulation (which should work in any flavor of awk):

$ mawk '{$1 = sprintf("%02d/%02d/%02d", substr($1,7,2), substr($1,5,2), substr($1,1,4))} 1' file
01/01/2007 10.2317 79.1638 6.0 26.7 20.9 0.8 14.0 98.6
02/01/2007 10.2317 79.1638 5.6 26.5 20.8 1.9 13.6 98.0
03/01/2007 10.2317 79.1638 7.5 27.7 20.8 0.1 15.8 96.4
04/01/2007 10.2317 79.1638 8.1 26.0 19.6 0.0 15.5 94.1
steeldriver
  • 81,074
2
sed 's| *\(....\)\(..\)\(..\)|\3/\2/\1|' < in > out
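
Here \1 captures the year, \2 the month and \3 the day, and they are written back as day/month/year. If your sed supports extended regular expressions (-E in GNU and BSD sed), roughly the same thing can be written as:

sed -E 's|^ *([0-9]{4})([0-9]{2})([0-9]{2})|\3/\2/\1|' < in > out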
mikeserv
  • 58,310
2

GNU sed's eval modifier to the s command will allow you to use date to convert the dates as required:

sed -r 's|(\S+)(.*)|date -d \1 "+%d/%m/%Y \2"|e'

The substituted expression is a well-formed date command. The e modifier causes it to be executed for each line, and the pattern space (and thus the output) to be replaced with the output of each date command.
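
To make that concrete, the pattern space for the first input line becomes (roughly) the following command just before the e modifier runs it; the rest of the line rides along inside date's format string and is printed back out verbatim because it contains no % sequences:

date -d 20070101 "+%d/%m/%Y  10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6"

One thing to keep in mind is that a literal % anywhere in the data would also be interpreted by date as a format sequence.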

1

Although it may not be best practice, I would go for a different solution, i.e. using bash parameter expansion, like this:

gv@debi64:$ a="20070101"; b="${a: -2:2}/${a: -4:2}/${a: 0:4}";echo $b
01/01/2007

I could then read the file and, using something like sed -i "s/$a/$b/g", achieve what you need:

gv@debi64:$ cat a.txt
20070101 10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
20070102 10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
20070103 10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
20070104 10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

gv@debi64:$ while IFS=" " read -r df rest;do ndf="${df: -2:2}/${df: -4:2}/${df: 0:4} "; sed -i "s#$df#$ndf#g" a.txt;done <a.txt

gv@debi64:$ cat a.txt
01/01/2007  10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007  10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007  10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007  10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1
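
For comparison, the same parameter expansion can be used in a single pass that writes to a new file instead of running sed -i once per line; a minimal sketch (b.txt is just a placeholder output name):

while IFS=' ' read -r df rest; do
    printf '%s  %s\n' "${df: -2:2}/${df: -4:2}/${df:0:4}" "$rest"
done < a.txt > b.txt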