11

From time to time a specific process (in this case Thunderbird) stops reacting to user input for a minute or so. Using iotop, I found that it writes quite a lot to the disk during this time. Now I want to find out which file it writes to, but unfortunately iotop only gives stats per process, not per open file (descriptor).

I know that I can use lsof to find out which files the process has currently open, but of course Thunderbird has a lot of them open, so this is not that helpful. iostat only shows statistics per device.

The problem occurs only randomly and it might take quite some time for it to appear, so I hope I don't have to strace Thunderbird and wade through long logs to find out which file has the most writes.

2 Answers

6

If you attach strace to the process just when it's hung (you can get the pid and queue the command up in advance, in a spare terminal), it'll show the file descriptor of the blocking write.

Trivial example:

$ mkfifo tmp
$ cat /dev/urandom > tmp &
[1] 636226
  # this will block on open until someone opens for reading

$ exec 4<tmp
  # now it should be blocked trying to write

$ strace -p 636226
Process 636226 attached - interrupt to quit
write(1, "L!\f\335\330\27\374\360\212\244c\326\0\356j\374`\310C\30Z\362W\307\365Rv\244?o\225N"..., 4096 <unfinished ...>
^C
Process 636226 detached
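
The strace output only names a descriptor number (1 in the example above), not a file. On Linux that number can be mapped back to a path through /proc. A minimal sketch, assuming /proc is mounted and you own the process (or are root); fd_target is a hypothetical helper name, not a standard tool:

```shell
# Print what descriptor $2 of process $1 refers to (Linux only).
# fd_target is a hypothetical helper name, not a standard tool.
fd_target() {
    # /proc/PID/fd/N is a symlink to the open file, pipe, or socket.
    readlink "/proc/$1/fd/$2"
}
```

For the example above, `fd_target 636226 1` should print the path of the FIFO. `lsof -a -p 636226 -d 1` gives similar information if you prefer lsof.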
Useless
  • 4,800
3

If you have root access, I think the best tool would be the audit subsystem. There isn't much literature about it (but more than about loggedfs); you can start with this tutorial or a few examples or just with the auditctl man page. Here, it should be enough to make sure the daemon is started, then run auditctl as root:

auditctl -a exit,always -F pid=1234 -F dir=/home/philipp

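This writes a record to /var/log/audit/audit.log every time the process with pid 1234 makes a system call that touches a file under /home/philipp, which includes opening a file there for writing. If that turns out to be too noisy, adding -F perm=w to the rule should narrow it down to write access. The overhead is fairly small, much smaller than with strace.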
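
Since the problem shows up only occasionally, it may help to tag the rule with a key so its records are easy to pull out of the log later, and to remove the rule once you are done. A rough sketch, assuming auditd is running and the functions are called as root; the function names and the key are made up for illustration, while -k, -d and ausearch -k/-i are the usual auditctl and ausearch options:

```shell
# Hypothetical helper names; all three need root and a running auditd.

# Add the rule from above, tagged with a key so its records are easy to find.
audit_watch_start() {   # usage: audit_watch_start PID DIR KEY
    auditctl -a exit,always -F pid="$1" -F dir="$2" -k "$3"
}

# Show the records logged under that key, with numeric fields interpreted.
audit_watch_report() {  # usage: audit_watch_report KEY
    ausearch -k "$1" -i
}

# Remove the rule again; the fields must match the ones used when adding it.
audit_watch_stop() {    # usage: audit_watch_stop PID DIR KEY
    auditctl -d exit,always -F pid="$1" -F dir="$2" -k "$3"
}
```

For example, run `audit_watch_start 1234 /home/philipp tbwrites`, wait for the next stall, then look at `audit_watch_report tbwrites` for the file names, and finish with `audit_watch_stop 1234 /home/philipp tbwrites`.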