
If the run queue is the number of processes waiting for their turn on the CPU plus the processes currently running, and the wait queue is the number of processes waiting for I/O, then wouldn’t b being greater than r in the vmstat output mean that the system is I/O bound, not CPU bound? I am confused because the link below says the opposite. From http://nonfunctionaltestingtools.blogspot.com/2013/03/vmstat-output-explained.html?m=1

“If runnable threads (r) divided by the number of CPU is greater than one -> possible CPU bottleneck (The (r) column should be compared with number of CPUs (logical CPUs as in uptime) if we have enough CPUs or we have more threads.) High numbers in the blocked processes column (b) indicates slow disks. (r) should always be higher than (b); if it is not, it usually means you have a CPU bottleneck”

sourcejedi
John Alvarez

2 Answers


A higher number in b than in r means the CPUs are often idle while threads wait for I/O, so you are right to be confused. The document should have read 'means you have an I/O bottleneck'.

Beware that the page also says r should never be higher than the number of CPUs, and calls r=16 on a 12 CPU system a "serious" problem. This is quite exaggerated. It just means the CPUs are fully used and some threads are waiting for their turn. Usually no big deal.
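To make the comparison concrete, here is a minimal Python sketch that reads one vmstat sample and compares r against the CPU count and against b. The function names, the sample line, and the wording of the messages are my own illustration, not anything vmstat provides, and a single sample is only a snapshot.

```python
def parse_vmstat_line(header, line):
    """Map vmstat column names to integer values for one sample line."""
    return {name: int(value) for name, value in zip(header.split(), line.split())}


def describe_load(sample, ncpus):
    """Rough reading of the r and b columns for a single sample (illustrative only)."""
    r, b = sample["r"], sample["b"]
    if r > ncpus:
        # More runnable threads than CPUs: CPUs are fully used, the rest wait.
        return f"CPUs saturated: about {r - ncpus} runnable thread(s) waiting for a CPU"
    if b > r:
        # More threads blocked than runnable: CPUs are likely idle waiting on I/O.
        return "more threads blocked than runnable: look at I/O"
    return "no obvious CPU or I/O pressure in this sample"


# Hypothetical sample modelled on the r=16, 12 CPU case from the linked page.
header = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
line = "16 0 0 812340 10240 402112 0 0 5 12 310 620 95 4 1 0 0"
sample = parse_vmstat_line(header, line)
print(describe_load(sample, ncpus=12))
```

With r=16 on 12 CPUs this reports four threads waiting, which is a busy machine rather than an emergency.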

Finally, don't confuse threads and processes, as the linked document sometimes does. The r and b columns show the number of threads, not processes, and not all processes are single-threaded.

jlliagre

I think that sentence is entirely confused.

When r > num_cpus, it makes sense to think of the system as a whole as CPU-bound (at that exact instant).

However, I don't think r > b has any special significance.

Another source suggests "if there is a non-zero number in [the b] column constantly, you can investigate further with iostat." It probably makes more sense to consider this condition as suggesting an IO-bound system, unless you know you have multiple IO queues in use.

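The extended output of iostat (iostat -x) includes a %util column for each disk device. If %util is close to 100, the device had at least one request outstanding almost all of the time, which suggests there is always at least one process waiting on that device. avgqu-sz shows the average number of requests queued at once.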
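As a rough sketch of that check, the Python below parses a captured iostat -x device table and lists devices whose %util is at or above a threshold. The sample text, the function names, and the 90% threshold are assumptions for illustration. Columns are looked up by header name because the layout differs between sysstat versions (newer ones, as far as I know, label the queue column aqu-sz instead of avgqu-sz).

```python
def parse_iostat_devices(text):
    """Parse device tables from captured `iostat -x` text into a list of dicts."""
    rows = []
    columns = None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            columns = fields[1:]  # header line: remember the column names
            continue
        if columns is not None and len(fields) == len(columns) + 1:
            row = {"device": fields[0]}
            row.update({name: float(v) for name, v in zip(columns, fields[1:])})
            rows.append(row)
    return rows


def saturated_devices(rows, threshold=90.0):
    """Names of devices whose %util is at or above the (arbitrary) threshold."""
    return [row["device"] for row in rows if row.get("%util", 0.0) >= threshold]


# Hypothetical, trimmed-down sample; real output has more columns.
sample_output = """\
Device r/s w/s avgqu-sz await %util
sda 120.0 35.0 4.20 27.5 99.8
sdb 2.0 1.0 0.01 0.9 1.3
"""

rows = parse_iostat_devices(sample_output)
print(saturated_devices(rows))
```

In this made-up sample only sda is flagged, and its queue size of 4.20 means about four requests were outstanding on average.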

Programs which use AIO may submit more than one request at a time. This is mostly used by databases, for example MySQL InnoDB. Most programs do not use Linux AIO, because it is only really asynchronous with direct I/O, which bypasses the page cache.

sourcejedi