76

I want to determine which process has the other end of a UNIX socket.

Specifically, I'm asking about one that was created with socketpair(), though the problem is the same for any UNIX socket.

I have a program parent which creates a socketpair(AF_UNIX, SOCK_STREAM, 0, fds), and fork()s. The parent process closes fds[1] and keeps fds[0] to communicate. The child does the opposite, close(fds[0]); s=fds[1]. Then the child exec()s another program, child1. The two can communicate back and forth via this socketpair.
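For reference, the setup described above can be reproduced with a minimal Python sketch. It is only an illustration under stated assumptions: the child here replies directly instead of exec()ing a separate child1 program, and the function name demo_socketpair is made up for this example.

```python
import os
import socket


def demo_socketpair():
    """Create a socketpair, fork, and exchange one message each way."""
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Child: close the parent's end and keep its own, as in the question.
        # A real child would exec() another program here (assumption: we
        # simply answer in-process to keep the sketch self-contained).
        parent_end.close()
        request = child_end.recv(64)
        child_end.sendall(b"child got: " + request)
        child_end.close()
        os._exit(0)
    # Parent: close the child's end and talk over the remaining descriptor.
    child_end.close()
    parent_end.sendall(b"ping")
    reply = parent_end.recv(64)
    parent_end.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(demo_socketpair())
```

Both ends are anonymous, which is why neither shows up with a path in lsof or /proc/net/unix.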

Now, let's say I know who parent is, but I want to figure out who child1 is. How do I do this?

There are several tools at my disposal, but none can tell me which process is on the other end of the socket. I have tried:

  • lsof -c progname
  • lsof -c parent -c child1
  • ls -l /proc/$(pidof server)/fd
  • cat /proc/net/unix

Basically, I can see the two sockets, and everything about them, but cannot tell that they are connected. I am trying to determine which FD in the parent is communicating with which child process.

Totor

8 Answers

61

Note: I now maintain a lsof wrapper that combines both approaches described here and also adds information for peers of loopback TCP connections at https://github.com/stephane-chazelas/misc-scripts/blob/master/lsofc

Linux-3.3 and above.

On Linux, since kernel version 3.3 (and provided the UNIX_DIAG feature is built into the kernel), the peer of a given Unix domain socket (including socketpairs) can be obtained using a netlink-based API.

lsof since version 4.89 can make use of that API:

lsof +E -aUc Xorg

This lists all the Unix domain sockets that have a process whose name starts with Xorg at either end, in a format similar to:

Xorg       2777       root   56u  unix 0xffff8802419a7c00      0t0   34036 @/tmp/.X11-unix/X0 type=STREAM ->INO=33273 4120,xterm,3u

If your version of lsof is too old, there are a few more options.

The ss utility (from iproute2) makes use of that same API to retrieve and display information on the list of unix domain sockets on the system including peer information.

The sockets are identified by their inode number. Note that it's not related to the filesystem inode of the socket file.

For instance in:

$ ss -x
[...]
u_str  ESTAB    0    0   @/tmp/.X11-unix/X0 3435997     * 3435996

it says that socket 3435997 (that was bound to the ABSTRACT socket /tmp/.X11-unix/X0) is connected with socket 3435996. The -p option can tell you which process(es) have that socket open. It does that by doing some readlinks on /proc/$pid/fd/*, so it can only do that on processes you own (unless you're root). For instance here:

$ sudo ss -xp
[...]
u_str  ESTAB  0  0  @/tmp/.X11-unix/X0 3435997 * 3435996 users:(("Xorg",pid=3080,fd=83))
[...]
$ sudo ls -l /proc/3080/fd/83
lrwx------ 1 root root 64 Mar 12 16:34 /proc/3080/fd/83 -> socket:[3435997]
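The readlink step that ss -p performs can be sketched in Python as follows. This is only an illustration, not how ss is implemented internally: the helper names socket_inode and socket_fds are made up here, and it assumes a Linux /proc where socket descriptors show up as socket:[inode] links.

```python
import os
import re

# Link targets of socket descriptors in /proc/<pid>/fd look like "socket:[3435997]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode from a 'socket:[N]' link target, or None otherwise."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def socket_fds(pid="self"):
    """Map fd number -> socket inode for one process (Linux /proc only)."""
    result = {}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed in the meantime, or permission denied
        inode = socket_inode(target)
        if inode is not None:
            result[int(name)] = inode
    return result
```

As with ss -p, listing another user's /proc/$pid/fd requires root.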

To find out what process(es) has 3435996, you can look up its own entry in the output of ss -xp:

$ ss -xp | awk '$6 == 3435996'
u_str  ESTAB  0  0  * 3435996  * 3435997 users:(("xterm",pid=29215,fd=3))
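The same lookup can be scripted. Below is a rough Python sketch that pairs up sockets from the text of ss -xp by inode. It assumes the column layout shown above (local address, local inode, peer address, peer inode, then the optional users field) and, like the awk one-liner, it will be confused by socket paths containing blanks. The function names are made up for this example.

```python
def parse_ss_unix(text):
    """Return {local_inode: (peer_inode, users_field)} from `ss -xp` output."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: netid state recv-q send-q local-addr local-ino peer-addr peer-ino [users]
        if len(fields) < 8 or not fields[0].startswith("u_"):
            continue
        try:
            local, peer = int(fields[5]), int(fields[7])
        except ValueError:
            continue  # header line, or a path containing blanks shifted the columns
        users = fields[8] if len(fields) > 8 else ""
        table[local] = (peer, users)
    return table


def peer_users(table, inode):
    """Return the users field of the socket connected to `inode` ('' if unknown)."""
    peer, _ = table[inode]
    return table.get(peer, (None, ""))[1]


SAMPLE = """\
u_str  ESTAB  0  0  @/tmp/.X11-unix/X0 3435997 * 3435996 users:(("Xorg",pid=3080,fd=83))
u_str  ESTAB  0  0  * 3435996  * 3435997 users:(("xterm",pid=29215,fd=3))
"""

if __name__ == "__main__":
    print(peer_users(parse_ss_unix(SAMPLE), 3435997))
```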

You could also use this script as a wrapper around lsof to easily show the relevant information there:

#! /usr/bin/perl
# lsof wrapper to add peer information for unix domain socket.
# Needs Linux 3.3 or above and CONFIG_UNIX_DIAG enabled.

# retrieve peer and direction information from ss
my (%peer, %dir);
open SS, '-|', 'ss', '-nexa';
while (<SS>) {
  if (/\s(\d+)\s+\*\s+(\d+) ([<-]-[->])$/) {
    $peer{$1} = $2;
    $dir{$1} = $3;
  }
}
close SS;

# Now get info about processes tied to sockets using lsof
my (%fields, %proc);
open LSOF, '-|', 'lsof', '-nPUFpcfin';
while (<LSOF>) {
  if (/(.)(.*)/) {
    $fields{$1} = $2;
    if ($1 eq 'n') {
      $proc{$fields{i}}->{"$fields{c},$fields{p}" .
      ($fields{n} =~ m{^([@/].*?)( type=\w+)?$} ? ",$1" : "")} = "";
    }
  }
}
close LSOF;

# and finally process the lsof output
open LSOF, '-|', 'lsof', @ARGV;
while (<LSOF>) {
  chomp;
  if (/\sunix\s+\S+\s+\S+\s+(\d+)\s/) {
    my $peer = $peer{$1};
    if (defined($peer)) {
      $_ .= $peer ?
            " ${dir{$1}} $peer\[" . (join("|", keys%{$proc{$peer}})||"?") . "]" :
            "[LISTENING]";
    }
  }
  print "$_\n";
}
close LSOF or exit(1);

For example:

$ sudo that-lsof-wrapper -ad3 -p 29215
COMMAND   PID     USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
xterm   29215 stephane    3u  unix 0xffff8800a07da4c0      0t0 3435996 type=STREAM <-> 3435997[Xorg,3080,@/tmp/.X11-unix/X0]

Before Linux-3.3

The old Linux API to retrieve Unix socket information is the /proc/net/unix text file. It lists all the Unix domain sockets (including socketpairs). As already explained by @Totor, the first field in there (unless hidden from non-superusers by the kernel.kptr_restrict sysctl parameter) contains the kernel address of a unix_sock structure, which has a peer field pointing to the corresponding peer unix_sock. That address is also what lsof outputs in the DEVICE column for a Unix socket.
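To make the inode-to-address mapping concrete, here is a small Python sketch that parses /proc/net/unix text. It is an illustration only: the sample lines are made up in the usual column layout (the address and inode are borrowed from the examples above, the remaining columns are assumed values), and the function name is hypothetical.

```python
import re

# Columns: Num: RefCount Protocol Flags Type St Inode [Path]
LINE = re.compile(r"^([0-9a-fA-F]+):(?:\s+\S+){5}\s+(\d+)")


def inode_to_address(text):
    """Return {socket inode: kernel address of its unix_sock} from /proc/net/unix text."""
    table = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            table[int(match.group(2))] = int(match.group(1), 16)
    return table


# Illustrative sample (assumed values except the address/inode pair of the first socket).
SAMPLE = """\
Num       RefCount Protocol Flags    Type St Inode Path
ffff8800a07da4c0: 00000003 00000000 00000000 0001 03 3435996
ffff8800a07da840: 00000003 00000000 00000000 0001 03 3435997 @/tmp/.X11-unix/X0
"""

if __name__ == "__main__":
    for inode, address in inode_to_address(SAMPLE).items():
        print(inode, hex(address))
```

When kernel.kptr_restrict hides the pointers from the caller, every address reads as 0 and the mapping is useless.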

Now getting the value of that peer field means being able to read kernel memory and know the offset of that peer field with regards to the unix_sock address.

Several gdb-based and systemtap-based solutions have already been given, but they require gdb/systemtap and the Linux kernel debug symbols for the running kernel to be installed, which is generally not the case on production systems.

Hardcoding the offset is not really an option as that varies with kernel version.

Now we can use a heuristic approach to determine the offset: have our tool create a dummy socketpair (so that we know the addresses of both peers), and search the memory around the address of one end for the address of the other end to determine the offset.
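The search itself is simple. Here is a minimal Python sketch of it on an in-memory buffer. The assumptions: the buffer stands in for kernel memory read from /proc/kcore, pointers are stored in the host byte order, and the peer pointer sits on a pointer-sized boundary. The function name is made up for this example.

```python
import sys


def find_pointer_offset(buf, target, ptr_size=8):
    """Return the first pointer-aligned offset in `buf` holding `target`, or None."""
    for offset in range(0, len(buf) - ptr_size + 1, ptr_size):
        value = int.from_bytes(buf[offset:offset + ptr_size], sys.byteorder)
        if value == target:
            return offset
    return None


if __name__ == "__main__":
    # Fake "structure": zeros, with the peer address planted at byte 680.
    peer = 0xffff880171f078c0
    fake = bytes(680) + peer.to_bytes(8, sys.byteorder) + bytes(128)
    print(find_pointer_offset(fake, peer))
```

Once the offset is known from the dummy socketpair, the same offset is applied to every other socket address.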

Here is a proof-of-concept script that does just that using perl (successfully tested with kernel 2.4.27 and 2.6.32 on i386 and 3.13 and 3.16 on amd64). Like above, it works as a wrapper around lsof:

For example:

$ that-lsof-wrapper -aUc nm-applet
COMMAND    PID     USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
nm-applet 4183 stephane    4u  unix 0xffff8800a055eb40      0t0 36888 type=STREAM -> 0xffff8800a055e7c0[dbus-daemon,4190,@/tmp/dbus-AiBCXOnuP6]
nm-applet 4183 stephane    7u  unix 0xffff8800a055e440      0t0 36890 type=STREAM -> 0xffff8800a055e0c0[Xorg,3080,@/tmp/.X11-unix/X0]
nm-applet 4183 stephane    8u  unix 0xffff8800a05c1040      0t0 36201 type=STREAM -> 0xffff8800a05c13c0[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   11u  unix 0xffff8800a055d080      0t0 36219 type=STREAM -> 0xffff8800a055d400[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   12u  unix 0xffff88022e0dfb80      0t0 36221 type=STREAM -> 0xffff88022e0df800[dbus-daemon,2268,/var/run/dbus/system_bus_socket]
nm-applet 4183 stephane   13u  unix 0xffff88022e0f80c0      0t0 37025 type=STREAM -> 0xffff88022e29ec00[dbus-daemon,2268,/var/run/dbus/system_bus_socket]

Here's the script:

#! /usr/bin/perl
# wrapper around lsof to add peer information for Unix
# domain sockets. needs lsof, and superuser privileges.
# Copyright Stephane Chazelas 2015, public domain.
# example: sudo this-lsof-wrapper -aUc Xorg
use Socket;

open K, "<", "/proc/kcore" or die "open kcore: $!";
read K, $h, 8192 # should be more than enough
 or die "read kcore: $!";

# parse ELF header
my ($t,$o,$n) = unpack("x4Cx[C19L!]L!x[L!C8]S", $h);
$t = $t == 1 ? "L3x4Lx12" : "Lx4QQx8Qx16"; # program header ELF32 or ELF64
my @headers = unpack("x$o($t)$n",$h);

# read data from kcore at given address (obtaining file offset from ELF
# @headers)
sub readaddr {
  my @h = @headers;
  my ($addr, $length) = @_;
  my $offset;
  while (my ($t, $o, $v, $s) = splice @h, 0, 4) {
    if ($addr >= $v && $addr < $v + $s) {
      $offset = $o + $addr - $v;
      if ($addr + $length - $v > $s) {
        $length = $s - ($addr - $v);
      }
      last;
    }
  }
  return undef unless defined($offset);
  seek K, $offset, 0 or die "seek kcore: $!";
  my $ret;
  read K, $ret, $length or die "read($length) kcore \@$offset: $!";
  return $ret;
}

# create a dummy socketpair to try find the offset in the
# kernel structure
socketpair(Rdr, Wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
 or die "socketpair: $!";
$r = readlink("/proc/self/fd/" . fileno(Rdr)) or die "readlink Rdr: $!";
$r =~ /\[(\d+)/; $r = $1;
$w = readlink("/proc/self/fd/" . fileno(Wtr)) or die "readlink Wtr: $!";
$w =~ /\[(\d+)/; $w = $1;
# now $r and $w contain the socket inodes of both ends of the socketpair
die "Can't determine peer offset" unless $r && $w;

# get the inode->address mapping
open U, "<", "/proc/net/unix" or die "open unix: $!";
while (<U>) {
  if (/^([0-9a-f]+):(?:\s+\S+){5}\s+(\d+)/) {
    $addr{$2} = hex $1;
  }
}
close U;

die "Can't determine peer offset" unless $addr{$r} && $addr{$w};

# read 2048 bytes starting at the address of Rdr and hope to find
# the address of Wtr referenced somewhere in there.
$around = readaddr $addr{$r}, 2048;
my $offset = 0;
my $ptr_size = length(pack("L!",0));
my $found;
for (unpack("L!*", $around)) {
  if ($_ == $addr{$w}) {
    $found = 1;
    last;
  }
  $offset += $ptr_size;
}
die "Can't determine peer offset" unless $found;

my %peer;
# now retrieve peer for each socket
for my $inode (keys %addr) {
  $peer{$addr{$inode}} = unpack("L!", readaddr($addr{$inode}+$offset,$ptr_size));
}
close K;

# Now get info about processes tied to sockets using lsof
my (%fields, %proc);
open LSOF, '-|', 'lsof', '-nPUFpcfdn';
while (<LSOF>) {
  if (/(.)(.*)/) {
    $fields{$1} = $2;
    if ($1 eq 'n') {
      $proc{hex($fields{d})}->{"$fields{c},$fields{p}" .
      ($fields{n} =~ m{^([@/].*?)( type=\w+)?$} ? ",$1" : "")} = "";
    }
  }
}
close LSOF;

# and finally process the lsof output
open LSOF, '-|', 'lsof', @ARGV;
while (<LSOF>) {
  chomp;
  for my $addr (/0x[0-9a-f]+/g) {
    $addr = hex $addr;
    my $peer = $peer{$addr};
    if (defined($peer)) {
      $_ .= $peer ?
            sprintf(" -> 0x%x[", $peer) . join("|", keys%{$proc{$peer}}) . "]" :
            "[LISTENING]";
      last;
    }
  }
  print "$_\n";
}
close LSOF or exit(1);
  • Is this part of anything bigger? You seem to have put a lot of effort into this lately. – mikeserv Mar 17 '15 at 08:12
  • 1
    @mikeserv, that's a follow-up on that comment. Not being able to find the other end of unix sockets is something that has always annoyed me (often when trying to find X clients and there was a recent question about that). I'll try and see if a similar approach can be used for pseudo-terminals and suggest those to the lsof author. – Stéphane Chazelas Mar 17 '15 at 09:26
  • 1
    I still can't believe this isn't provided by the kernel itself! I should really submit a patch, if for nothing else but to discover why it doesn't already exist. – Jonathon Reinhart Mar 17 '15 at 10:10
  • 1
    does ss not do this? It's kind of over my head, but ss -px lists a lot of unix sockets with peer information like: users: ("nacl_helper",pid=18992,fd=6),("chrome",pid=18987,fd=6),("chrome",pid=18975,fd=5)) u_str ESTAB 0 0 /run/dbus/system_bus_socket 8760 * 15068 and the column headings are... State Recv-Q Send-Q Local Address:Port Peer Address:Port – mikeserv Mar 17 '15 at 12:31
  • 1
    Also, if I do lsof -c terminology I can see terminolo 12731 mikeserv 12u unix 0xffff880600e82680 0t0 1312426 type=STREAM but if I do ss -px | grep terminology I get: u_str ESTAB 0 0 * 1312426 *1315046 users:(("terminology",pid=12731,fd=12)) – mikeserv Mar 17 '15 at 12:37
  • 1
    @mikeserv, it looks like it does indeed! It seems I've been wasting a lot of time lately... – Stéphane Chazelas Mar 17 '15 at 13:00
  • 1
    @mikeserv, the ss man page even has an example to list the X server clients! ss -x src /tmp/.X11-unix/* (doesn't work for me though) – Stéphane Chazelas Mar 17 '15 at 13:03
  • Hmm... So it looks like ss is printing only the Node where lsof does the device name as well.. But if I do: ss -px | grep terminology to get that 1312426 Node again, and then after - ss -px | grep 1315046 to search for the peer end it prints: u_str ESTAB 0 0 @/tmp/.X11-unix/X0 1315046 * 1312426. It seems to be in line with the man page, which gives this example usage: ss -x src /tmp/.X11-unix/* Find all local processes connected to X server. – mikeserv Mar 17 '15 at 13:08
  • You pointed me back to that pty thing - and I remember looking at ss then (and understanding even less then) but it just kind of rung a bell. – mikeserv Mar 17 '15 at 13:09
  • The ss -x src /tmp/... does work here... I wonder what the difference would be? There's also an option for listing -memory usage, but that doesn't reveal much for me, though. Still, the reverse grep thing pulled it in too. Hmmm... maybe try the man example w/ -p as well? That seems to be a little more informative. Oh. You know - I'm probably running usermode X as well - since 1.16 (or whatever that version number was) which might make it easier for me to do introspection. Most X servers are probably still superuser. I wonder if ss doesn't want to show you root info? – mikeserv Mar 17 '15 at 13:13
  • 1
    @mikeserv, On the systems I've tried, either ss -x src /tmp/... gives me all non-listening sockets (the filter doesn't filter) or it doesn't give me the peer. I can parse the output of ss -x;ss -lx though (not reliably for sockets with spaces or newlines) – Stéphane Chazelas Mar 17 '15 at 13:33
  • If I add -e I also get these little arrows like <-> or <-- or -->. The <-- and --> (at a glance) seem to occur only alongside entries which also list filenames. Do you think that is relevant at all? Strike that - That was a pretty weak glance. But it does seem that the @ begins a pathname to a socket proper. And at least all of the pathnames appear to be fully qualified - the / slashes might be useful. – mikeserv Mar 17 '15 at 13:43
  • @mikeserv, not the best documented piece of software, is it? From the source, looks like those arrows are to indicate which direction of the socket is shutdown. – Stéphane Chazelas Mar 17 '15 at 13:47
  • Stephane, actually, it's got this huge trove of docs. It's a member of the... iproute2 suite. Ok, that was from memory and wrong. But there is this: cat /usr/share/doc/iproute2/ss.html – mikeserv Mar 17 '15 at 13:56
  • Well, I give up on it - I still don't understand what it's all about really. Anyway, this does look interesting though - its seems to list only those users with something in (Receive|Send)Q: ss -px src '/tmp/.X11*/*' | grep -v '^[^0-9]*0 *0 ' – mikeserv Mar 17 '15 at 14:18
  • I don't get it guys, if I do ss -px | grep mysql I still cannot link mysql with mysqld even if they both appear in the list (kernel 3.2). – Totor Mar 17 '15 at 14:53
  • @Totor, 3.2 would be too old. The feature was added in this commit, first released in 3.3 AFAICT. – Stéphane Chazelas Mar 17 '15 at 15:02
  • It looks like you're jumping from long to long to find the peer offset but how can you be sure that peer is on a long boundary? – Totor Mar 17 '15 at 16:19
  • @Totor, I did consider that point. My understanding would be that, since those point to structures (that contain pointers), they would be pointersize-aligned. See there and there for more details. – Stéphane Chazelas Mar 17 '15 at 16:34
  • @StéphaneChazelas thanks for the link. More accurately, it says that if you're looking for a N bytes value, you will find it on a N bytes address boundary (the compiler takes care of alignment). – Totor Mar 17 '15 at 18:14
  • @Totor that doesn't mean an 8 byte struct will be 8 byte aligned. An 8 byte long will be 8 byte aligned within the struct and such a struct at least 8 byte aligned. on the systems I've looked at, unix_sock were 64 byte aligned on 64 bit systems and 256 byte aligned on 32 bit ones – Stéphane Chazelas Mar 17 '15 at 19:34
  • 1
    @mikeserv, the ss filter issue seems to have been fixed very recently in http://thread.gmane.org/gmane.linux.network/351180 – Stéphane Chazelas Mar 17 '15 at 21:38
  • @mikeserv I tried ss -px | grep mysql with iproute 3.16 on a 3.13.1 kernel. Still can't link the client and the server sockets. Is it only working with lsof or doesn't it work at all? – Totor Mar 17 '15 at 23:54
  • @Totor - not as far as I know, but I'll tell you, you're probably not asking the right guy. I have only the faintest idea about what a socket even does. Still, though, is the mysql socket owned by you? I want to believe that ss wouldn't reveal privileged information to you. I wonder what I have that could be comparable... – mikeserv Mar 17 '15 at 23:58
  • @Totor - actually the chrome thing from earlier is probably closest - chrome's processes are run in namespaced containers - and it maintains its own database. You can get information about a running process by referencing a src socket file (maybe) and you can grep for pid. I just cross-referenced lsof because that seemed to be the thread of things here. Anyway, it might help. Also there's this crappy doc: cat /usr/share/doc/iproute2/ss.html. (the ip docs are far and away* better done)*. – mikeserv Mar 18 '15 at 00:02
  • 1
    @mikeserv sorry, I didn't have UNIX_DIAG enabled in my kernel. Now I have, and indeed, when I use ss -px, the Peer column gives me an ID (inode number?) that is the same ID as the Local column on the other side of the socket. It works. – Totor Mar 18 '15 at 01:38
  • @Totor - don't apologize - I wasn't aware until now that it was a requirement. Thank you. Still though, I'm not entirely certain it does work, or, really, what work it should do. But Stéphane seems to think it works, and, based on my past year hanging around this website, I'd say that it's probably a pretty good indicator. It's why there's no answer here from me - and there isn't likely to be. I don't really understand what is wanted - I don't know the difference really between a socket or a pty (line discipline?). I know I like to use ptys, and never had occasion to use a socket pair. – mikeserv Mar 18 '15 at 01:43
  • @mikeserv I updated my answer accordingly. – Totor Mar 18 '15 at 02:06
  • @Totor - I know. I upvoted accordingly. Thanks again, by the way. – mikeserv Mar 18 '15 at 02:09
  • That's pretty cool - looks like you chunked out the last block of code to make up the majority of the top script. But didn't you have some q/a thing on Perl's <>? Is it better now? I seem to remember it was a security thing... – mikeserv Mar 18 '15 at 12:23
  • 1
    @mikeserv, I made it a wrapper around lsof directly as it's mostly only going to be useful to process lsof output anyway. <> is a dangerous feature. It's useful in that you can do perl -ne 'some processing' 'cmd|' to do some processing on the output of cmd, but that becomes a problem when you do perl -ne 'some processing' -- * and can't guarantee file names won't end in |. Details there. Not really a concern here. – Stéphane Chazelas Mar 18 '15 at 12:30
  • @StéphaneChazelas sorry, continuing a very old chat here. :) – Totor Oct 27 '16 at 17:23
35

Since kernel 3.3, it is possible using ss or lsof-4.89 or above — see Stéphane Chazelas's answer.

In older versions, according to the author of lsof, it was impossible to find this out: the Linux kernel does not expose this information. Source: 2003 thread on comp.unix.admin.

The number shown in /proc/$pid/fd/$fd is the socket's inode number in the virtual socket filesystem. When you create a pipe or socket pair, each end successively receives an inode number. The numbers are attributed sequentially, so there is a high probability that the numbers differ by 1, but this is not guaranteed (either because the first socket was N and N+1 was already in use due to wrapping, or because some other thread was scheduled between the two inode allocations and that thread created some inodes too).
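The inode observation is easy to check. Here is a small Python sketch: it creates a socketpair and reads each end's inode with fstat (on Linux this is the same number that appears in /proc/$pid/fd/$fd). The function name is made up, and the printed difference is whatever your system happens to allocate: often 1, but as explained above this must not be relied on.

```python
import os
import socket


def socketpair_inodes():
    """Return the inode numbers of both ends of a fresh socketpair."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return os.fstat(a.fileno()).st_ino, os.fstat(b.fileno()).st_ino
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    first, second = socketpair_inodes()
    # Frequently 1 on an idle system, but not guaranteed.
    print(first, second, second - first)
```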

I checked the definition of socketpair in kernel 2.6.39, and the two ends of the socket are not correlated except by the type-specific socketpair method. For unix sockets, that's unix_socketpair in net/unix/af_unix.c.

  • 2
    Thanks @Gilles. I do recall reading something about that a while back, but was unable to find it again. I may just have to go writing a patch for /proc/net/unix. – Jonathon Reinhart Jul 09 '11 at 22:09
  • And yes, I'd made that observation with the increasing inode numbers, and currently that's what I'm working with. However, as you noted, it is not guaranteed. The process I'm looking at has at least 40 open unix sockets, and I saw one instance where the N+1 did not hold true. Bummer. – Jonathon Reinhart Jul 09 '11 at 22:11
  • 1
    @JonathonReinhart I checked the definition of socketpair, and the two ends of the socket are not correlated except by the type-specific socketpair method. For unix sockets, that's unix_socketpair in net/unix/af_unix.c (http://lxr.linux.no/#linux+v2.6.39/net/unix/af_unix.c#L1223). It would be nice to have this information for pipes, too. – Gilles 'SO- stop being evil' Jul 09 '11 at 22:35
11

Since kernel 3.3

You can now get this information with ss:

# ss -xp

Now you can see in the Peer column an ID (inode number) which corresponds to another ID in the Local column. Matching IDs are the two ends of a socket.

Note: The UNIX_DIAG option must be enabled in your kernel.

Before kernel 3.3

Linux didn't expose this information to userland.

However, by looking into kernel memory, we can access this information.

Note: This answer does so by using gdb. However, please see @StéphaneChazelas' answer, which is more elaborate in this regard.

# lsof | grep whatever
mysqld 14450 (...) unix 0xffff8801011e8280 (...) /var/run/mysqld/mysqld.sock
mysqld 14450 (...) unix 0xffff8801011e9600 (...) /var/run/mysqld/mysqld.sock

There are two different sockets, one listening and one established. The hexadecimal number is the address of the corresponding kernel unix_sock structure, which has a peer attribute holding the address of the other end of the socket (also a unix_sock structure instance).

Now we can use gdb to find the peer within kernel memory:

# gdb /usr/lib/debug/boot/vmlinux-3.2.0-4-amd64 /proc/kcore
(gdb) print ((struct unix_sock*)0xffff8801011e9600)->peer
$1 = (struct sock *) 0xffff880171f078c0

# lsof | grep 0xffff880171f078c0
mysql 14815 (...) unix 0xffff880171f078c0 (...) socket

Here you go, the other end of the socket is held by mysql, PID 14815.

Your kernel must be compiled with KCORE_ELF to use /proc/kcore. Also, you need a version of your kernel image with debugging symbols. On Debian 7, apt-get install linux-image-3.2.0-4-amd64-dbg will provide this file.

No need for the debuggable kernel image...

If you don't have (or don't want to keep) the debugging kernel image on the system, you can give gdb the memory offset to "manually" access the peer value. This offset value usually differs with kernel version or architecture.

On my kernel, I know the offset is 680 bytes, that is 85 times 64 bits. So I can do:

# gdb /boot/vmlinux-3.2.0-4-amd64 /proc/kcore
(gdb) print ((void**)0xffff8801011e9600)[85]
$1 = (void *) 0xffff880171f078c0

Voilà, same result as above.

If you have the same kernel running on several machines, it is easier to use this variant because you don't need the debug image, only the offset value.

To (easily) discover this offset value at first, you do need the debug image:

$ pahole -C unix_sock /usr/lib/debug/boot/vmlinux-3.2.0-4-amd64
struct unix_sock {
  (...)
  struct sock *              peer;                 /*   680     8 */
  (...)
}

Here you go, 680 bytes, this is 85 x 64 bits, or 170 x 32 bits.
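The conversion from the byte offset reported by pahole to the array index used in the gdb expression is just a division by the pointer size. A tiny Python sketch (the function name is made up for this example):

```python
def pointer_index(byte_offset, pointer_size):
    """Convert a byte offset into an index for a (void **) array access."""
    index, remainder = divmod(byte_offset, pointer_size)
    if remainder:
        raise ValueError("offset is not a multiple of the pointer size")
    return index


if __name__ == "__main__":
    print(pointer_index(680, 8))  # index to use with 64-bit pointers
    print(pointer_index(680, 4))  # same offset counted in 32-bit words
```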

Most of the credit for this answer goes to MvG.

MvG
Totor
  • 2
    Another approach to retrieve the offset could be to create a socketpair, identify the corresponding entries in /proc/net/unix based on inode numbers from readlinks on /proc/pid/fd/*, and scan memory around the address of one socket for the address of the other. That could make for a reasonably portable approach (across Linux versions and architectures) that could be implemented by lsof itself. I'll try to come up with a PoC. – Stéphane Chazelas Mar 13 '15 at 19:37
  • 2
    I've now added such a PoC which seems to work well on the systems I've tested. – Stéphane Chazelas Mar 17 '15 at 00:19
9

Erkki Seppala actually has a tool that retrieves this information from the Linux kernel with gdb. It's available here.

Caleb
  • 2
    Very useful information! Even if the tool didn't work out of the box for me (it caused a kernel Oops), the idea helped me to identify the other end. I described my solution on Stack Overflow. – MvG Aug 15 '12 at 20:33
5

This solution, though working, is of limited interest: if you have a recent-enough systemtap, chances are you also have a recent-enough kernel where you can use the ss-based approaches, and if you're on an older kernel, that other solution, though more hacky, is more likely to work and doesn't require additional software.

Still useful as a demonstration of how to use systemtap for this kind of task.

If on a recent Linux system with a working systemtap (1.8 or newer), you could use the script below to post-process the output of lsof:

For example:

$ lsof -aUc nm-applet | sudo that-script
COMMAND    PID     USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
nm-applet 4183 stephane    4u  unix 0xffff8800a055eb40      0t0 36888 type=STREAM -> 0xffff8800a055e7c0[dbus-daemon,4190,@/tmp/dbus-AiBCXOnuP6]
nm-applet 4183 stephane    7u  unix 0xffff8800a055e440      0t0 36890 type=STREAM -> 0xffff8800a055e0c0[Xorg,3080,@/tmp/.X11-unix/X0]
nm-applet 4183 stephane    8u  unix 0xffff8800a05c1040      0t0 36201 type=STREAM -> 0xffff8800a05c13c0[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   11u  unix 0xffff8800a055d080      0t0 36219 type=STREAM -> 0xffff8800a055d400[dbus-daemon,4118,@/tmp/dbus-yxxNr1NkYC]
nm-applet 4183 stephane   12u  unix 0xffff88022e0dfb80      0t0 36221 type=STREAM -> 0xffff88022e0df800[dbus-daemon,2268,/var/run/dbus/system_bus_socket]
nm-applet 4183 stephane   13u  unix 0xffff88022e0f80c0      0t0 37025 type=STREAM -> 0xffff88022e29ec00[dbus-daemon,2268,/var/run/dbus/system_bus_socket]

(if you see 0x0000000000000000 above instead of 0xffff..., it's because the kernel.kptr_restrict sysctl parameter is set on your system which causes kernel pointers to be hidden from non-privileged processes, in which case you'll need to run lsof as root to get a meaningful result).

This script doesn't make any attempt to cope with socket file names with newline characters, but nor does lsof (nor does lsof cope with blanks or colons).

systemtap here is used to dump the address and peer address of all the unix_sock structures in the unix_socket_table hash in the kernel.

Only tested on Linux 3.16 amd64 with systemtap 2.6, and 3.13 with 2.3.

#! /usr/bin/perl
# meant to process lsof output to try and find the peer of a given
# unix domain socket. Needs a working systemtap, lsof, and superuser
# privileges. Copyright Stephane Chazelas 2015, public domain.
# Example: lsof -aUc X | sudo this-script
open STAP, '-|', 'stap', '-e', q{
  probe begin {
    offset = &@cast(0, "struct sock")->__sk_common->skc_node;
    for (i = 0; i < 512; i++) 
      for (p = @var("unix_socket_table@net/unix/af_unix.c")[i]->first;
           p;
           p=@cast(p, "struct hlist_node")->next
          ) {
        sock = p - offset;
        printf("%p %p\n", sock, @cast(sock, "struct unix_sock")->peer);
    }
    exit()
  }
};  
my %peer;
while (<STAP>) {
  chomp;
  my ($a, $b) = split;
  $peer{$a} = $b;
}
close STAP;

# Now get info about processes tied to sockets using lsof
my (%f, %addr);
open LSOF, '-|', 'lsof', '-nPUFpcfdn';
while (<LSOF>) {
  if (/(.)(.*)/) {
    $f{$1} = $2;
    if ($1 eq 'n') {
      $addr{$f{d}}->{"$f{c},$f{p}" .
      ($f{n} =~ m{^([@/].*?)( type=\w+)?$} ? ",$1" : "")} = "";
    }
  }
}
close LSOF;

# and finally process the lsof output read from stdin
while (<>) {
  chomp;
  for my $addr (/0x[0-9a-f]+/g) {
    my $peer = $peer{$addr};
    if (defined($peer)) {
      $_ .= $peer eq '0x0' ?
            "[LISTENING]" :
            " -> $peer\[" . join("|", keys%{$addr{$peer}}) . "]";
      last;
    }
  }
  print "$_\n";
}

3

lsof 4.89 supports displaying endpoint information.

Quoted from lsof.8:

+|-E +E specifies that process intercommunication channels should be
     displayed with endpoint information and the channels
     of the endpoints should also be displayed.  Currently
     only pipe on Linux is implemented.

     Endpoint information is displayed in the NAME column
     in the form "PID,cmd,FDmode".  PID is the endpoint
     process ID; cmd is the endpoint process command; FD is
     the endpoint file's descriptor; and mode is the
     endpoint file's access mode.  Multiple occurrences of
     this information can appear in a file's NAME column.

     -E specfies that Linux pipe files should only be
     displayed with endpoint information.

Example of output:

mozStorag 21535 22254  yamato    6u     unix 0xf...       0t0     348924 type=STREAM pino=351122 4249,dbus-daem,55u
mozStorag 21535 22254  yamato   10u     unix 0xf...       0t0     356193 type=STREAM pino=356194 21535,gdbus,11u
mozStorag 21535 22254  yamato   11u     unix 0xf...       0t0     356194 type=STREAM pino=356193 21535,gdbus,10u
mozStorag 21535 22254  yamato   21u     unix 0xf...       0t0     355141 type=STREAM pino=357544 4249,dbus-daem,60u
mozStorag 21535 22254  yamato   26u     unix 0xf...       0t0     351134 type=STREAM pino=355142 5015,gdbus,17u
mozStorag 21535 22254  yamato   69u     unix 0xf...       0t0     469354 type=STREAM pino=468160 4545,alsa-sink,21u
mozStorag 21535 22254  yamato   82u     unix 0xf...       0t0     449383 type=STREAM pino=449384 12257,Chrome_Ch,3u
mozStorag 21535 22254  yamato   86u     unix 0xf...       0t0     355174 type=SEQPACKET pino=355175 21535,gdbus,95u
mozStorag 21535 22254  yamato   95u     unix 0xf...       0t0     355175 type=SEQPACKET pino=355174 21535,gdbus,86u 12257,Chrome_Ch,4u
mozStorag 21535 22254  yamato  100u     unix 0xf...       0t0     449389 type=STREAM pino=456453 3614,Xorg,38u
mozStorag 21535 22254  yamato  105u     unix 0xf...       0t0     582613 type=STREAM pino=586261
obexd     22163        yamato    1u     unix 0xf...       0t0     361859 type=STREAM pino=365931
obexd     22163        yamato    2u     unix 0xf...       0t0     361860 type=STREAM pino=365934
obexd     22163        yamato    3u     unix 0xf...       0t0     361241 type=DGRAM pino=10028
obexd     22163        yamato    6u     unix 0xf...       0t0     361242 type=STREAM pino=361864 4249,dbus-daem,70u
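If you need to post-process such output, the endpoint part of the NAME column can be picked apart with a regular expression. The Python sketch below is only an illustration: it assumes the pino=... layout of the example output shown in this answer (other lsof versions print the peer inode differently) and command names without blanks or commas. The function name is made up.

```python
import re

# "pino=<peer inode>" followed by zero or more "PID,cmd,FDmode" endpoint entries.
PINO = re.compile(r"\bpino=(\d+)(.*)$")
ENDPOINT = re.compile(r"(\d+),([^,\s]+),(\d+)([a-zA-Z]*)")


def parse_endpoints(line):
    """Return (peer_inode, [(pid, cmd, fd, mode), ...]) or None if no pino= field."""
    match = PINO.search(line)
    if not match:
        return None
    endpoints = [
        (int(pid), cmd, int(fd), mode)
        for pid, cmd, fd, mode in ENDPOINT.findall(match.group(2))
    ]
    return int(match.group(1)), endpoints


SAMPLE = ("mozStorag 21535 22254  yamato   95u     unix 0xf...       0t0     "
          "355175 type=SEQPACKET pino=355174 21535,gdbus,86u 12257,Chrome_Ch,4u")

if __name__ == "__main__":
    print(parse_endpoints(SAMPLE))
```

A line with a pino= field but no endpoint entries means lsof knows the peer inode but could not find a process holding it.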
3

Since Linux kernel 3.3 there exists CONFIG_UNIX_DIAG, which provides extra information about UNIX domain sockets, namely the Virtual File System (VFS) information, which contains the so far missing information to link the inode of the path to the process. It can already be queried using the ss tool from iproute2 starting with version v4.19.0~55:

$ ss --processes --unix --all --extended
...
Netid  State   Recv-Q  Send-Q  Local Address:Port      Peer Address:Port
u_str  LISTEN  0       5         /tmp/socket 13381347             * 0     users:(("nc",pid=12550,fd=3)) <-> ino:1569897 dev:0/65025 peers:

The device number and path Inode you can get from

$ stat -c 'ino:%i dev:0/%d' /tmp/socket
ino:1569946 dev:0/65025

ss also supports filtering:

 ss --processes --unix --all --extended 'sport = /tmp/socket'

but please be aware that this might not list the right socket for you, as an evil process might rename your original socket and replace it with its own evil one:

mv /tmp/socket /tmp/socket.orig
nc -U -l /tmp/socket.evil &
mv /tmp/socket.evil /tmp/socket

lsof /tmp/socket, fuser /tmp/socket and ss --processes --unix --all --extended 'sport = /tmp/socket' will list only the original process, not the evil replacement. Instead use something like this:

id=$(stat -c 'ino:%i dev:0/%d' /tmp/socket)
ss --processes --unix --all --extended | grep -F "$id"

Or write your own little program based on the template contained in man 7 sock_diag.
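As a starting point short of a full sock_diag client, the identifier built with stat above can also be computed in Python and then searched for in the ss output. This is a sketch under the assumption that ss prints the VFS information as ino:N dev:0/M, as in the example above. The function name is made up.

```python
import os
import stat


def ss_vfs_id(path):
    """Build the 'ino:N dev:0/M' string for a socket file, like the stat command above."""
    st = os.lstat(path)  # lstat: do not follow symlinks, matching plain `stat`
    if not stat.S_ISSOCK(st.st_mode):
        raise ValueError(f"{path} is not a socket")
    return f"ino:{st.st_ino} dev:0/{st.st_dev}"


if __name__ == "__main__":
    import sys
    print(ss_vfs_id(sys.argv[1]))
```

Because the string is derived from the file that is currently at that path, a renamed-and-replaced socket yields a different identifier than the original one.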

Phil Hord
pmhahn
1

I'd use px.

Disclaimer: I wrote it, so of course I'm recommending it.

px will tell you which other processes yours is talking to.

Example output, scroll to the bottom for sockets tracing:

~ $ sudo px 49903
/Applications/Google Chrome.app/Contents/MacOS/Google Chrome
  --enable-audio-service-sandbox
  --origin-trial-disabled-features=MeasureMemory

kernel(0) root
  launchd(1) root
--> Google Chrome(49903) johan
      Google Chrome Helper(49922) johan
      Google Chrome Helper(49958) johan
      Google Chrome Helper (GPU)(49920) johan
      Google Chrome Helper (Renderer)(49935) johan
      Google Chrome Helper (Renderer)(49936) johan
      Google Chrome Helper (Renderer)(49943) johan
      Google Chrome Helper (Renderer)(49950) johan
      Google Chrome Helper (Renderer)(49951) johan
      Google Chrome Helper (Renderer)(49957) johan
      Google Chrome Helper (Renderer)(64466) johan
      Google Chrome Helper (Renderer)(75275) johan
      Google Chrome Helper (Renderer)(76225) johan
      Google Chrome Helper (Renderer)(76650) johan
      Google Chrome Helper (Renderer)(76668) johan
      Google Chrome Helper (Renderer)(76712) johan

7d21h ago Google Chrome was started, at 2020-09-04T19:15:03+02:00.
0.3% has been its average CPU usage since then, or 32m25s/7d21h

Other processes started close to Google Chrome(49903):
  Google Chrome/chrome_crashpad_handler(49912) was started just after Google Chrome(49903)
  AlertNotificationService(49924) was started 1.0s after Google Chrome(49903)
  Google Chrome Helper(49922) was started 1.0s after Google Chrome(49903)
  Google Chrome Helper (GPU)(49920) was started 1.0s after Google Chrome(49903)
  Google Chrome Helper (Renderer)(49935) was started 1.0s after Google Chrome(49903)
  Google Chrome Helper (Renderer)(49936) was started 1.0s after Google Chrome(49903)
  VTDecoderXPCService(49934) was started 1.0s after Google Chrome(49903)

Users logged in when Google Chrome(49903) started:
  johan

2020-09-12T16:28:30.587930: Now invoking lsof, this can take over a minute on a big system...
2020-09-12T16:28:30.901834: lsof done, proceeding.

Others sharing this process' working directory (/)
  Working directory too common, never mind.

File descriptors:
  stdin : [CHR] /dev/null
  stdout: [CHR] /dev/null
  stderr: [CHR] /dev/null

Network connections:
  [IPv4] : (LISTEN)

Inter Process Communication:
  mDNSResponder(291): [unix] ->0x2b8028c5de1ab5b
  mDNSResponder(291): [unix] ->0x2b8028c5de1c5eb

For a list of all open files, do "sudo lsof -p 49903", or "sudo watch lsof -p 49903" for a live view.
~ $