
Suppose I have two text files: active-users.txt and all-users.txt

active-users.txt contains only numeric userIDs.

all-users.txt contains userIDs, and additional info fields.

What I need to do is to create a third text file which will contain the complete line of information for every userID in active-users.txt...

I've tried the following, in a bash script and on the command line:

for i in $(< active-users.txt)
do
    grep $i all-users.txt >> active-user-info.txt
done

The broken bit that's driving me bonkers is that the active-user-info.txt output file always contains the entire contents of all-users.txt, and I'd expect it to contain only the lines that include the userIDs in active-users.txt.

What am I missing?

Stephen Kitt
  • This doesn't explain what's wrong with your attempt, but grep -f active-users.txt all-users.txt should do what you want... (I can turn this into an answer if that's good enough for you!) – Stephen Kitt Oct 11 '16 at 16:09
  • Actually your attempt works for me... But you'll match portions of ids. – Stephen Kitt Oct 11 '16 at 16:13
  • I've tried grep -f as you describe, and that also results in dumping the entire contents of all-users.txt to stdout, or to a file. – D Mooney Oct 11 '16 at 16:29
  • That means that the identifiers in active-users.txt do match all the lines in all-user-info.txt. Note that there's no structure here, so 1 in active-users.txt would match 1 anywhere in lines of all-user-info.txt: it would match 1 or 10 as an identifier, but also 1 in the extra information. – Stephen Kitt Oct 11 '16 at 16:39
  • Stephen - I've tried grep -f, and also tried using the -w to prevent portion matching... With just -f, I get the same result - output is the entire contents of the master file. When I add the -w, the output is null. – D Mooney Oct 11 '16 at 16:41
  • For clarity - each 'userID' is actually a cellphone SIM IMSI - a 15 digit number. The 'additional info' in all-user-info includes PIN#'s, Auth keys, etc. as well as some text strings. – D Mooney Oct 11 '16 at 16:44
  • Also, the IMSIs are, in all but a few cases, sequential, with the first 11 digits identical and the last four digits incrementing from 0010 to 5010 (decimal).

    Are you saying that any portion of userID can match, unless I use the -w flag?

    – D Mooney Oct 11 '16 at 16:47
  • No, the user id would have to match in its entirety, but it can match against anything in the lines, not just the user id. See the suggested duplicate for more discussion (-Fwf all told). – Stephen Kitt Oct 11 '16 at 17:04

2 Answers

Assuming active-users.txt has no blank lines:

grep -f active-users.txt all-users.txt > active-users-info.txt

If active-users.txt has one or more blank lines in it:

grep '.' active-users.txt | grep -f - all-users.txt > active-users-info.txt
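
The blank-line case matters because a blank line in a pattern file is an empty pattern, and an empty pattern matches every line, which would produce the "whole file comes back" symptom described in the question. Here is a minimal sketch of that, using made-up sample data (the short IDs, names, and temporary directory are assumptions for illustration, not the real IMSI files). It also adds -F and -w, as suggested in the comments, so IDs are treated as fixed strings and only match as whole words:

```shell
# Hypothetical sample data; the real files would hold 15-digit IMSIs.
tmp=$(mktemp -d)
printf '%s\n' 1001 '' 1003 > "$tmp/active-users.txt"
printf '%s\n' '1001 alice' '1002 bob' '1003 carol' '11001 dave' > "$tmp/all-users.txt"

# The blank line acts as an empty pattern, so every line is counted here.
grep -f "$tmp/active-users.txt" "$tmp/all-users.txt" | wc -l

# Drop blank lines first, then match IDs as fixed strings (-F) and whole words (-w).
grep . "$tmp/active-users.txt" | grep -Fwf - "$tmp/all-users.txt" > "$tmp/active-users-info.txt"
cat "$tmp/active-users-info.txt"
```

With this sample data the second command keeps the lines for 1001 and 1003 and skips 11001, which a plain substring match on 1001 would have included. Note that -w still allows an ID to match as a whole word anywhere on the line, not only in the ID column.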
agc

This worked for me with test files

while read LINE; do
    grep "$LINE" all-users.txt >>active-users-info.txt
done <active-users.txt
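
A loop like the one above can be made a bit more defensive. The following is only a sketch, with made-up sample data (the short IDs, names, and temporary directory are assumptions): it reads each line verbatim with IFS= read -r, skips blank lines (an empty pattern would match everything), matches IDs as fixed whole words, and redirects the loop's output once so that re-running it does not keep appending to the result file:

```shell
# Hypothetical sample data standing in for the real files.
tmp=$(mktemp -d)
printf '%s\n' 1001 '' 1003 > "$tmp/active-users.txt"
printf '%s\n' '1001 alice' '1002 bob' '1003 carol' '11001 dave' > "$tmp/all-users.txt"

while IFS= read -r id; do
    # Skip blank lines: an empty pattern would match every line.
    [ -n "$id" ] || continue
    # -F: fixed string, -w: whole-word match, --: end of options.
    grep -Fw -- "$id" "$tmp/all-users.txt"
done < "$tmp/active-users.txt" > "$tmp/active-users-info.txt"

cat "$tmp/active-users-info.txt"
```

This runs grep once per ID, so for a few thousand IDs a single grep -f call is likely to be faster; the loop is mainly useful when each ID needs extra per-line handling.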
Dalvenjia