Replace values in fifth column

Question

I have multiple lines of text like this:

Name 1:10:34 date short_id 10 
Name 1:10:45 date short_id 10
Name 1:20:54 date short_id 20
Name 1:30:43 date short_id 30
Name 1:40:43 date short_id 40
Name 1:40:13 date short_id 40
Name 1:20:01 date short_id 20
Name 1:10:01 date short_id 10

I want to replace the values in the fifth column, but when I use sed 's/\b10\b/user1/g' the second column also changes.

The output should look like this:

Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1

--- update ---

In the real data there is no user1; the fifth column should become an actual name. The second column is just a time and has nothing to do with the name.

Something like this:

Name 1:10:34 date short_id John
Name 1:10:45 date short_id John
Name 1:20:54 date short_id Robert
Name 1:30:43 date short_id Jennifer
Name 1:40:43 date short_id Mary
Name 1:40:13 date short_id Mary
Name 1:20:01 date short_id Robert
Name 1:10:01 date short_id John

how can that sed command generate a user2 ? can you edit the OP with a exact copy/paste of the sed command used ? — Archemar, Jun 20 '22 at 13:26
Please format the sed command you tried as code as well to make it clear what you have tried. — Bram, Jun 20 '22 at 13:27
You managed to confuse me. (-: Do you plan to write a substitute for each user id/name or is there a file with some »translation table« which id is which user? — Philippos, Jun 20 '22 at 16:24

steve · Answer 1 · 2022-06-21T07:00:12.753

Two more awk solutions.

$ awk '{$5="user"substr($5,1,1)}1' file
Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1
$

$ awk '{$5="user"$5/10}1' file
Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1
$

score 3 · Answer 2 · answered Jun 20 '22 at 16:41

Okay, unfortunately, sed doesn't know that 10 is John. So if you don't want to write a separate substitution for each user, you could use a second file, users.txt, with a translation table like this:

10 John
20 Robert
30 Jennifer
40 Mary

Then you feed both files into this sed command:

sed '/^[0-9]* [[:alnum:]]*$/{H;d;};G;s/ \([0-9]*\)\n.*\n\1 \([[:alnum:]]*\)/ \2/;P;d' users.txt yourfile.log

The translation table is read first and stored in the hold space. For every line of the remaining input, the hold space with the translations is appended and the replacement is performed if possible. Then P prints the pattern space only up to the first newline, so no mess is printed if a user ID is unknown.
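For comparison, the same translation-table idea can be sketched in awk instead of sed (a different technique from the command above, not a rewrite of it). The temporary directory and the ID 50 are made up for the demonstration; users.txt and yourfile.log are the file names used above.

```shell
# Hypothetical demo files in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' '10 John' '20 Robert' '30 Jennifer' '40 Mary' > "$tmp/users.txt"
printf '%s\n' \
  'Name 1:10:34 date short_id 10' \
  'Name 1:20:54 date short_id 20' \
  'Name 1:50:00 date short_id 50' > "$tmp/yourfile.log"

# While reading the first file (NR==FNR), remember id -> name.
# For the second file, replace field 5 only when the id is in the table.
result=$(awk 'NR==FNR { name[$1] = $2; next }
              ($5 in name) { $5 = name[$5] }
              1' "$tmp/users.txt" "$tmp/yourfile.log")

printf '%s\n' "$result"
rm -rf "$tmp"
```

As with the sed version, a line whose ID is not in the table (50 here) is printed unchanged.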

If you are interested in a more detailed explanation, see How to perform replacements defined in one file on another file or feel free to ask.

Forrest Vigor · Answer 3 · 2022-06-21T01:00:14.253

2

awk 'gsub(/0/, "", $5) { print $1, $2, $3, $4, "user" $5 }' INPUT

gsub removes every "0" from the 5th column. awk then prints columns 1-4, followed by the word "user" concatenated with the 5th column (there is no "," between "user" and $5, so no space is inserted between them).
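One thing to be aware of: because gsub is used as the pattern, a line is only printed when at least one "0" was removed. A small sketch of the difference, using a made-up second line whose ID (5) contains no zero:

```shell
# Second line is hypothetical: its fifth column has no "0" to remove
input='Name 1:10:34 date short_id 10
Name 1:05:00 date short_id 5'

# gsub() as the pattern: lines where nothing was replaced are not printed
as_pattern=$(printf '%s\n' "$input" |
  awk 'gsub(/0/, "", $5) { print $1, $2, $3, $4, "user" $5 }')

# gsub() inside the action: every line is printed
in_action=$(printf '%s\n' "$input" |
  awk '{ gsub(/0/, "", $5); print $1, $2, $3, $4, "user" $5 }')

printf 'as pattern:\n%s\n\nin action:\n%s\n' "$as_pattern" "$in_action"
```

For the sample data in the question both forms give the same result, since every ID there contains a zero.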

edited Jun 21 '22 at 01:00

answered Jun 20 '22 at 14:14


Welcome to Unix & Linux! Brevity is acceptable, but fuller explanations are better. – Kusalananda Jun 20 '22 at 14:16
Thanks for your reminder. I had added explanation in my answer. – Forrest Vigor Jun 21 '22 at 01:03
As I understand the question, userx is just a placeholder for actual user names. – Philippos Jun 22 '22 at 07:29

Bram · Answer 4 · 2022-06-22T15:08:09.137

1

If all you want to change is the last column, you don't need the g (global) modifier, and you will want to add the $ anchor to limit the match to the end of the line.

I.e.: sed 's/\b10$/user1/'

That is assuming you'll want to change only the 10.

If the 10, 20 etc. match directly to user 1,2 etc. something like the following may work:

sed 's/\b\([1-4]\)0$/user\1/'
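A quick way to check that the end-of-line anchor leaves the time column alone is to run it on two of the sample lines. This sketch matches a leading space instead of \b (which is a GNU extension), and it assumes there is no trailing whitespace after the ID:

```shell
# Two sample lines from the question; the time column also contains "10" and "20"
input='Name 1:10:34 date short_id 10
Name 1:20:54 date short_id 20'

# $ anchors the match at the end of the line, so only the last column changes
anchored=$(printf '%s\n' "$input" | sed 's/ \([1-4]\)0$/ user\1/')

printf '%s\n' "$anchored"
```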

edited Jun 22 '22 at 15:08

answered Jun 20 '22 at 13:30


Note that 10\b$ is the same as 10$, and that you added the g modifiers in even though saying they're not needed :-) – Kusalananda Jun 20 '22 at 13:39

score 1 · Answer 5 · edited Jun 20 '22 at 15:50

1

Using sed

sed 's/\(.*\) \(.\).*/\1 user\2/' input_file
Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1

edited Jun 20 '22 at 15:50

schrodingerscatcuriosity


answered Jun 20 '22 at 13:32

sseLtaH


Ed Morton · Answer 6 · 2022-06-20T14:04:00.820

Just use awk:

$ awk '!($NF in users){ users[$NF]="user"(++cnt) } { $NF=users[$NF] } 1' file
Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1

The above assumes you want each user to get a unique ID based on the order they appear in the input. If that's not what you need then edit your question to clarify your requirements.
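To make the "order they appear" behaviour concrete, here is a small sketch with the sample lines reordered (the reordering is only for illustration). The ID 30 appears first, so it becomes user1 rather than user3:

```shell
# Reordered sample lines: ID 30 is seen before ID 10
input='Name 1:30:43 date short_id 30
Name 1:10:34 date short_id 10
Name 1:30:01 date short_id 30'

# Each new ID gets the next counter value; repeated IDs reuse their label
numbered=$(printf '%s\n' "$input" |
  awk '!($NF in users) { users[$NF] = "user" (++cnt) } { $NF = users[$NF] } 1')

printf '%s\n' "$numbered"
```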

jubilatious1 · Answer 7 · 2022-06-21T05:40:15.727

Using Raku (formerly known as Perl_6)

raku -ne 'given .words -> $w {put "$w.[0..3] ", (S/ (\d+) /user{$0.substr(0,1)}/ with $w.[4]) };'

OR

raku -ne 'put .[0..3], " ", (S/ (\d+) /user{$0.substr(0,1)}/ with .[4]) given .words;'

The Raku code here breaks lines into whitespace-separated words. The first 4 columns (index .[0..3]) are output. Then the digits in the 5th column are matched and substituted with the word user followed by the first digit.

[A nicety of the with .[4] clause in the column 5 substitution is that it tolerates missing values in that column].

Sample Input:

Name 1:10:34 date short_id 10 
Name 1:10:45 date short_id 10
Name 1:20:54 date short_id 20
Name 1:30:43 date short_id 30
Name 1:40:43 date short_id 40
Name 1:40:13 date short_id 40
Name 1:20:01 date short_id 20
Name 1:10:01 date short_id 10

Sample Output:

Name 1:10:34 date short_id user1
Name 1:10:45 date short_id user1
Name 1:20:54 date short_id user2
Name 1:30:43 date short_id user3
Name 1:40:43 date short_id user4
Name 1:40:13 date short_id user4
Name 1:20:01 date short_id user2
Name 1:10:01 date short_id user1

https://raku.org
