0

My /etc/passwd has a list of users in a format that looks like this:

username:password:uid:gid:firstname.lastname, somenumber:/...

Goal: I want to see only the first names, then sort them so that the most common name appears first, the second most common appears second, and so on.

I saw some solutions for the second part, but they apply to working with a text file, not to reading from a map.

As for the first part, I really don't know how to approach it. I know that there are some solutions, but I don't really know how to apply them.

asaf92

2 Answers

6

One way to do it:

cut -d: -f5 /etc/passwd | \
    sed 's/\..*//' | \
    sort -f | \
    uniq -ci | \
    sort -rn
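
To see what each stage contributes, here is a minimal sketch that runs the same pipeline on three made-up records instead of /etc/passwd. The usernames, names and numbers in `sample` are invented for illustration, and the sketch uses `sort -f` for the case-insensitive sort and assumes a `uniq` that supports `-i` (GNU and BSD do).

```shell
# Hypothetical records in passwd format; all values are invented.
sample='alice:x:1001:100:John.Doe, 12345:/home/alice:/bin/sh
bob:x:1002:100:john.smith, 23456:/home/bob:/bin/sh
carol:x:1003:100:Mary.Jones, 34567:/home/carol:/bin/sh'

# Stage 1: cut keeps field 5, sed drops the first "." and everything after it.
first_names=$(printf '%s\n' "$sample" | cut -d: -f5 | sed 's/\..*//')
printf '%s\n' "$first_names"

# Stage 2: group equal names ignoring case, count them, largest count first.
counts=$(printf '%s\n' "$first_names" | sort -f | uniq -ci | sort -rn)
printf '%s\n' "$counts"
```

The first printf shows one first name per line; the second shows the counts, with the two spellings of John merged into a single line that comes first. Which capitalisation is kept for that line can depend on the locale.
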
Satō Katsura
  • Great answer, but I think he'll be in need of using uniq without -i, since there should be difference between X and x in name, we only need --ignore-case option for sort as you've used. In addition, using the sed command you've added in your answer, seems irrelevant, if there is any reason, please explain. –  Aug 09 '16 at 08:07
  • @FarazX Re: -i: John.Doe should be the same as john.doe. Re: sed: from the OP: I want to see only the first names. – Satō Katsura Aug 09 '16 at 08:13
  • Oh you're right, sorry I didn't notice. So voila! Thanks for your explanation, and your great way of using cut ;) –  Aug 09 '16 at 08:14
  • cut + sed is too much sed '/\n/{P;d};s/:/\n/4;s/\./\n/;D' or sed 's/[^.]*:\(\w\+\).*/\1/' – Costas Aug 09 '16 at 08:23
  • @Costas Too much compared to what? For me, total time spent thinking about getting the 5th field portably with sed >> the time gained by not using cut. BTW, your second recipe assumes GNU sed (\w). – Satō Katsura Aug 09 '16 at 08:31
  • @SatoKatsura The above is an example. If you'd like, you can do the same as in your script: sed 's/[^.]*://;s/\..*//'. But my 1st example is a little bit quicker. And if you don't like \w, you're free to use [:alnum:] – Costas Aug 09 '16 at 08:38
  • @Costas sed 's/[^.]*://;s/\..*//' misses any names without a dot. The point of using cut is precisely to avoid going into this kind of detail, you know. – Satō Katsura Aug 09 '16 at 08:44
  • @SatoKatsura If you insist: s/\([^:]*:\)\{4\}//;s/[:.].*// Either way, if you involve sed you can easily avoid cut – Costas Aug 09 '16 at 09:46
  • Can you explain briefly how this command works? (specifically the sed and cut part) – asaf92 Aug 09 '16 at 13:29
  • And btw, in my system I don't have access to passwd. I have to type ypcat passwd to read it. – asaf92 Aug 09 '16 at 13:31
  • @PanthersFan92 cut extracts the 5th field, sed kills the .lastname, somenumber part out of it. You can, of course, do it like this: ypcat passwd | cut -d: -f5 | .... – Satō Katsura Aug 09 '16 at 13:48
2

Using awk and sorting to have the most common name first:

awk -F: '{sub(/[.].*/, "", $5); a[$5]++} END{for (n in a)print a[n],n}' /etc/passwd | sort -nr

For a case-insensitive version:

awk -F: '{sub(/[. ,].*/, "", $5); a[tolower($5)]++} END{for (n in a)print a[n],n}' /etc/passwd | sort -nr
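
Note that the pattern in this variant, /[. ,].*/, also cuts at the first space or comma, which matters for entries whose name part contains no period. A small sketch of the difference, using an invented field value (`Madonna, 555` is purely illustrative):

```shell
# Hypothetical field 5 with no period in the name part.
gecos='Madonna, 555'

# Period-only pattern: nothing matches, so the field is left unchanged.
dot_only=$(printf '%s\n' "$gecos" | awk '{sub(/[.].*/, ""); print}')

# Period, space or comma: the field is cut at the comma.
dot_space_comma=$(printf '%s\n' "$gecos" | awk '{sub(/[. ,].*/, ""); print}')

printf '[%s]\n' "$dot_only" "$dot_space_comma"
```

The brackets in the output make it easy to see that only the second pattern reduces the field to the bare name.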

For those who prefer their commands spread over multiple lines:

awk -F: '
  {
    sub(/[.].*/, "", $5)
    a[$5]++
  }

  END{
    for (n in a)
      print a[n],n
  }
  ' /etc/passwd | sort -nr

How it works

  • -F:

    This makes : the field separator.

  • sub(/[.].*/, "", $5)

    This removes the first period and everything after it from field 5.

  • a[$5]++

    Associative array a stores the number of times each name has appeared; this statement increments the counter for the current name. For the case-insensitive version, this is replaced with a[tolower($5)]++.

  • END{for (n in a)print a[n],n}

    This prints the count and name for each entry in array a. awk visits the entries in no particular order, which is why the output is piped to sort.

  • sort -nr

    This sorts the output numerically in descending order.
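
The steps above can be tried without touching /etc/passwd. The following is a minimal sketch that feeds four made-up records (the contents of `sample` are invented for illustration) through the case-insensitive counting logic with the period-only pattern:

```shell
# Hypothetical passwd-style records; usernames and names are invented.
sample='u1:x:1001:100:Anna.Lee, 111:/home/u1:/bin/sh
u2:x:1002:100:anna.kim, 222:/home/u2:/bin/sh
u3:x:1003:100:Ben.Ray, 333:/home/u3:/bin/sh
u4:x:1004:100:Anna.Poe, 444:/home/u4:/bin/sh'

ranked=$(printf '%s\n' "$sample" |
  awk -F: '
    {
      sub(/[.].*/, "", $5)   # drop the first period and what follows
      a[tolower($5)]++       # count each name, ignoring case
    }
    END {
      for (n in a)
        print a[n], n        # one "count name" line per distinct name
    }' |
  sort -nr)

printf '%s\n' "$ranked"
# prints:
# 3 anna
# 1 ben
```

To read from a NIS map instead, as mentioned in the comments on the other answer, the same awk program can take its input from `ypcat passwd |` in place of the printf.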

John1024