4

I want to copy a first column from a file(.txt) and replace it with first column of second file in text delimited using Unix code.

input file format file 1

01/01/2007
02/01/2007
03/01/2007
04/01/2007

file 2

20070101 10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
20070102 10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
20070103 10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
20070104 10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

output required

01/01/2007  10.2317   79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007  10.2317   79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007  10.2317   79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007  10.2317   79.1638   8.1  26.0  19.6   0.0  15.5  94.1
Jeff Schaller
  • 67,283
  • 35
  • 116
  • 255
  • Is this tab-delimited data? – Kusalananda Jan 18 '17 at 10:01
  • There are already 2 questions about this topic: http://unix.stackexchange.com/questions/145648/merging-columns-from-two-separate-files - http://unix.stackexchange.com/questions/16443/combine-text-files-column-wise - None of them helped you? –  Jan 18 '17 at 10:15
  • @nwildner They are similar in nature, but neither involves removing one column. – Kusalananda Jan 18 '17 at 10:23
  • 2
    Do you really need to past from file 1, or do you really want to just changed the date format in column 1 of line 2 (e.g. with sed 's,^\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\),\3/\2/\1,' or similar)? – Toby Speight Jan 18 '17 at 11:45

3 Answers3

3
$ cut -d' ' -f 2- file2 | paste file1 -

The cut will remove the first column from file2 (-f 2- means "output fields (columns) 2 onwards only"). The result of that will be passed to paste which will put the contents of file1 in the first column. The - in the paste command is a placeholder for whatever is delivered on standard input (the pipe from cut in this case).

Step by step:

$ cut -d' ' -f 2- file2
10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

$ cut -d' ' -f 2- file2 | paste file1 -
01/01/2007      10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007      10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007      10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007      10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

The cut command is expecting tab-delimited input, but since I copied and pasted from your question, it's space-delimited. If the original data is actually tab-delimited, remove the -d' ' from the cut command.

The paste command will add a tab between column 1 and 2 by default. If you want a space instead, use paste -d' ' file1 -.

In another question, it was asked how to use the already existing date in file2 here, and do away with the first file completely.

I ended up with

$ paste <( date -f <( cut -d ' ' -f 1 file2 ) +"%d/%m/%Y" ) \
        <( cut -d ' ' -f 2- file2 )
01/01/2007      10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6
02/01/2007      10.2317  79.1638   5.6  26.5  20.8   1.9  13.6  98.0
03/01/2007      10.2317  79.1638   7.5  27.7  20.8   0.1  15.8  96.4
04/01/2007      10.2317  79.1638   8.1  26.0  19.6   0.0  15.5  94.1

Note, this requires a shell that understands process substitution (<( ... )), like bash or ksh, and it also requires GNU's implementation of date.

A bit of explanation may be in order:

The process substitution <( ... ) more or less creates a temporary file containing the output of the command within the parenthesis (actually a FIFO under /dev/fd). So the whole command will go through two steps substitutions:

paste <( date -f output_of_cut1 +"%d/%m/%Y" ) \
      output_from_cut2

date -f filename will read the dates in file filename and format each of them according to the given format string.

Then:

paste output_from_date output_from_cut2

which will paste together the columns with the output from date as the first column and the output of the second cut as the other columns.

Kusalananda
  • 333,661
1

Depending on awk version, you can transform first column with:

awk '{$1=substr($0,7,2)"/"substr($0,5,2)"/"substr($0,1,4); print $0}'  FILE2.csv

example:

echo "20070101 10.2317  79.1638   6.0  26.7  20.9   0.8  14.0  98.6"  | awk '{$1=substr($0,7,2)"/"substr($0,5,2)"/"substr($0,1,4); print $0}'

returns:

 01/01/2007 10.2317 79.1638 6.0 26.7 20.9 0.8 14.0 98.6

Notes:

First part redefine the value of the first columns

 $1=substr($0,7,2)"/"substr($0,5,2)"/"substr($0,1,4)

and print $0 prints entire line with the first column redefined

valentin
  • 131
1

You can use awk for this:

awk 'NR==FNR{a[++i]=$1;next}{$1=a[++k]; print}' file1 file2

NR==FNR is a pattern meaning: {a[++i]=$1;next} is executed on each record(i.e. line) in file1.

a[++i]=$1;next means: save the first field of the record(i.e. line) in the array with the index of the record-number(i.e. the line-number of file1), then read in the next record

$1=a[++k]; print means: copy the content of the array with indx k into the first field of the current record (i.e. line) (which is always the k-th record (i.e. line) of file2).

FloHe
  • 860