
I'm trying to understand why I have a huge number of EBS IO requests on my Amazon EC2 micro instance. I launched the instance roughly 6 days ago and it has racked up over 4 million IO requests so far. The instance came pre-loaded with a LAMP stack (Virtualmin, PHP, Apache, MySQL). I installed one WordPress site and only loaded it in a browser a few times to do some testing.

How can I determine what is generating all these IO requests, possibly using iotop or some other Linux utility?

Rui F Ribeiro
Joe M

3 Answers


I'd add the following 3 tools into the mix as well. If you don't already have them installed, you should be able to install them from whatever repository is available to your EC2 instance.

The high load is likely being caused by either disk or network I/O so I'd focus on those 2 areas to start.

nethogs

Networking would be my first suspicion. To diagnose that further, I'd use nethogs to see which processes are causing it.

Example

Determine your network interface so you can tell nethogs which one to watch.

$ ip link show up | awk '/UP/ {print $2}'
lo:
em1:
wlp3s0:
virbr0:

In my case I'm going to watch my wireless device, wlp3s0.

$ sudo nethogs wlp3s0
NetHogs version 0.8.0

  PID USER     PROGRAM                      DEV        SENT      RECEIVED       
2151  saml     /opt/google/chrome/chrome    wlp3s0     2.117       2.715 KB/sec
3569  saml     ..4/thunderbird/thunderbird  wlp3s0     0.441       1.496 KB/sec
3144  saml     ..aml/.dropbox-dist/dropbox  wlp3s0     0.081       0.061 KB/sec
3383  saml     pidgin                       wlp3s0     0.026       0.056 KB/sec
4025  saml     ssh                          wlp3s0     0.000       0.000 KB/sec
?     root     unknown TCP                             0.000       0.000 KB/sec

  TOTAL                                                2.665       4.327 KB/sec 

Looking at the output we can see that chrome is using the bulk of my bandwidth.

iftop

You can see if the traffic is coming from a specific set of sites using iftop.

                195kb           391kb           586kb           781kb      977kb
└───────────────┴───────────────┴───────────────┴───────────────┴───────────────
greeneggs.bubba.net        => stackoverflow.com          4.68kb  10.2kb  8.24kb
                           <=                            33.5kb  14.7kb  21.4kb
greeneggs.bubba.net        => ord08s12-in-f8.1e100.net      0b   3.90kb  3.99kb
                           <=                               0b   3.61kb  3.72kb
greeneggs.bubba.net        => ord08s10-in-f16.1e100.net  5.05kb  4.10kb  5.83kb
                           <=                            2.43kb  2.39kb  2.79kb
greeneggs.bubba.net        => stackoverflow.com          1.32kb  3.34kb  4.73kb
                           <=                            1.30kb  1.60kb  2.30kb
greeneggs.bubba.net        => cpe-67-253-170-83.rochest     0b   2.19kb   760b
                           <=                               0b   2.60kb   862b
greeneggs.bubba.net        => pop1.biz.mail.vip.ne1.yah  5.87kb  1.17kb   301b
                           <=                            17.4kb  3.47kb   889b
greeneggs.bubba.net        => 190.93.247.58               480b   2.04kb  2.66kb
                           <=                               0b   1.34kb  1.80kb
greeneggs.bubba.net        => ig-in-f95.1e100.net         448b   1.02kb  1.27kb
                           <=                             240b    437b    534b
greeneggs.bubba.net        => ord08s12-in-f2.1e100.net    896b    346b    218b
                           <=                             480b    221b    124b
────────────────────────────────────────────────────────────────────────────────
TX:             cum:    652kB   peak:   85.2kb  rates:   20.6kb  29.3kb  30.1kb
RX:                     883kB            161kb           57.9kb  31.4kb  40.6kb
TOTAL:                 1.50MB            241kb           78.5kb  60.7kb  70.7kb
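
If neither nethogs nor iftop is available, the kernel's cumulative per-interface byte counters give a rough first look. This is only a sketch: `iface_bytes` is a hypothetical helper name, and it assumes the standard Linux `/proc/net/dev` layout (two header lines, then 8 receive fields followed by the transmit fields).

```shell
# Print cumulative RX/TX bytes per interface from a /proc/net/dev-style file.
# iface_bytes is a hypothetical helper; the file argument defaults to /proc/net/dev.
iface_bytes() {
    file="${1:-/proc/net/dev}"
    # Skip the two header lines. Replace the first ":" with a space so that
    # both "eth0: 123" and "eth0:123" split into clean fields.
    awk 'NR > 2 { sub(/:/, " "); printf "%s rx=%s tx=%s\n", $1, $2, $10 }' "$file"
}
```

Running `iface_bytes` twice a few minutes apart and comparing the numbers shows how much traffic each interface carried in between. The counters are totals since boot, not rates.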

fatrace

You can use the tool fatrace to see what processes are causing accesses to the HDD.

$ sudo fatrace
pickup(4910): O /var/spool/postfix/maildrop
pickup(4910): C /var/spool/postfix/maildrop
sshd(4927): CO /etc/group
sshd(4927): CO /etc/passwd
sshd(4927): RCO /var/log/lastlog
sshd(4927): CWO /var/log/wtmp
sshd(4927): CWO /var/log/lastlog
sshd(6808): RO /bin/dash
sshd(6808): RO /lib/x86_64-linux-gnu/ld-2.15.so
sh(6808): R /lib/x86_64-linux-gnu/ld-2.15.so
sh(6808): O /etc/ld.so.cache
sh(6808): O /lib/x86_64-linux-gnu/libc-2.15.so
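
The raw fatrace stream scrolls by quickly, so it can help to tally events per process. The following is a minimal sketch, assuming lines in the `name(pid): TYPES /path` format shown above; `fatrace_top` is a hypothetical helper name.

```shell
# Count fatrace events per process name, busiest first.
# Reads fatrace-style lines ("name(pid): TYPES /path") on stdin.
fatrace_top() {
    # Split on "(" so that field 1 is the process name.
    awk -F'[(]' 'NF > 1 { count[$1]++ } END { for (p in count) print count[p], p }' |
        sort -k1,1nr -k2,2
}
```

One possible way to use it is `sudo timeout 60 fatrace | fatrace_top`, which samples for a minute and then prints the tally. Note that this counts file access events, not block-level IO requests, so treat it as a hint about who is busy rather than an exact measure.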

What else?

I'd take a look at this Unix & Linux Q&A that I answered a while ago for more tools to try. It's titled: Determining Specific File Responsible for High I/O.

Follow up questions from comments

Q1: Does bandwidth shown by nethogs count against IO requests in AWS? I thought that would fall under 'data transfer' which is a separate category. In iotop the biggest percentage usage was root and a command called 'kswapd0'. mysqld had the biggest disk write usage and httpd had the most disk read

I have no idea how this is actually tracked by Amazon. These values are measured from inside your VM, so they may not correlate even remotely with how Amazon tracks your VM's usage from its side.

By the way, kswapd0 is likely the source of your high IO requests. It's thrashing, most likely because your VM doesn't have enough RAM for the applications you're running in it, so the system is falling back on swap to meet the need.

You can confirm this a bit more via the free command.

Example

$ free -ht
             total       used       free     shared    buffers     cached
Mem:          7.6G       5.5G       2.1G         0B       446M       2.5G
-/+ buffers/cache:       2.6G       5.0G
Swap:         7.6G        40K       7.6G
Total:         15G       5.5G       9.7G

This shows you how much RAM & swap are in use by your system.
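
Since free only shows a snapshot, the kernel's cumulative swap counters are a useful complement. A minimal sketch, assuming the `pswpin` and `pswpout` fields of `/proc/vmstat` (pages swapped in and out since boot); `swap_counters` is a hypothetical helper name.

```shell
# Print cumulative pages swapped in and out since boot.
# The file argument defaults to /proc/vmstat; swap_counters is a hypothetical helper.
swap_counters() {
    file="${1:-/proc/vmstat}"
    awk '$1 == "pswpin" || $1 == "pswpout" { print $1 "=" $2 }' "$file"
}
```

Run `swap_counters` twice, a minute or so apart. If the numbers keep climbing, the system is actively swapping even when free reports little swap in use at that instant. The `si` and `so` columns of `vmstat 5` show the same activity as a rate.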

Q2: Oh and one follow up question. How does MB or KB of disk read/write in iotop relate to number of IO requests? For example if mysqld wrote 20 M to disk, is there any easy way to know how many IO requests that generated?

There isn't really any correlation that I'm aware of with respect to the number of IO read/writes and the aggregate amount of data read/written to disk.

Given that you're using AWS, your actual disk reads/writes may very well not go to a local disk at all; they could go to storage over the network (SoE, aka SCSI over Ethernet, for example).

Your VM would be completely oblivious to this, since the SoE setup would likely be done at the host level and then exposed as disks to any VMs running on the host.
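
For a rough estimate from inside the guest, the kernel does keep per-device counts of completed requests and sectors transferred in `/proc/diskstats`. The sketch below derives an average request size from them. It assumes the standard field layout (reads completed, sectors read, writes completed, and sectors written in fields 4, 6, 8, and 10, with 512-byte sectors); `disk_requests` is a hypothetical helper name and `xvda` is only an example device. Whether Amazon counts requests the same way is not something I can confirm.

```shell
# disk_requests DEVICE [FILE]
# Summarise completed requests and data transferred for one block device.
# FILE defaults to /proc/diskstats; disk_requests is a hypothetical helper.
disk_requests() {
    dev="$1"
    file="${2:-/proc/diskstats}"
    awk -v dev="$dev" '$3 == dev {
        reqs = $4 + $8                    # reads completed + writes completed
        kib  = ($6 + $10) * 512 / 1024    # sectors are always 512 bytes here
        avg  = (reqs > 0) ? kib / reqs : 0
        printf "requests=%d kib=%d avg_kib_per_request=%.1f\n", reqs, kib, avg
    }' "$file"
}
```

For example, `disk_requests xvda` prints totals since boot. If the average came out at 8 KiB per request, a 20 MiB write would correspond to roughly 20 * 1024 / 8 = 2560 requests as seen by the guest.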


slm
  • First off thanks for the very detailed info @slm. I ran nethogs and iotop for a while and did some normal activity on the site like loading pages and some editing in the WP backend. From nethogs the biggest user of bandwidth was sshd with only a few KB coming from apache httpd.

    Question: Does bandwidth shown by nethogs count against IO requests in AWS? I thought that would fall under 'data transfer' which is a separate category.

    In iotop the biggest percentage usage was root and a command called 'kswapd0'. mysqld had the biggest disk write usage and httpd had the most disk read

    – Joe M Dec 10 '13 at 18:17
  • Oh and one follow up question. How does MB or KB of disk read/write in iotop relate to number of IO requests? For example if mysqld wrote 20 M to disk, is there any easy way to know how many IO requests that generated? – Joe M Dec 10 '13 at 18:30
  • @JoeM - see updates, tried to address Q's from the comments in the A. – slm Dec 10 '13 at 19:03
  • When I run free -t (no h switch in my version), swap always shows 0. Is this because it's just showing an instantaneous snapshot and not a running total? This instance only has 616 MB of memory so I guess it would make sense that swapping is occurring. Is there a way to limit/optimize how much swapping happens or do I just need more memory? – Joe M Dec 10 '13 at 20:48
  • @JoeM - I would ask this all as a new question, rather than gunk up this one with too much info. – slm Dec 10 '13 at 20:55
  • New question = http://unix.stackexchange.com/questions/104739/is-swapping-the-cause-of-high-io-on-my-box – Joe M Dec 11 '13 at 18:30

As you said, iotop is a good utility for the task. You may also want to take a look at these tools:

  • lsof to see the files open per process.
  • dstat gives a live report on all system activity.
  • sar can give you a good history of your system activity.
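
If none of these tools are installed, the per-process counters in `/proc/PID/io` offer a similar view. This is a sketch only, assuming the usual `read_bytes` and `write_bytes` fields (bytes actually fetched from or sent to the storage layer); `proc_io_bytes` is a hypothetical helper name.

```shell
# proc_io_bytes FILE
# Print the storage-level read and write byte counters from a /proc/PID/io-style file.
proc_io_bytes() {
    file="$1"
    # Exact match on the field name, so cancelled_write_bytes is not picked up.
    awk -F': ' '$1 == "read_bytes" || $1 == "write_bytes" { print $1 "=" $2 }' "$file"
}
```

For example, `sudo sh` followed by `proc_io_bytes "/proc/$(pgrep -o mysqld)/io"` would show the totals for the oldest mysqld process. Reading another user's `/proc/PID/io` normally requires root, and the shell running it needs the function defined.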
slm
slariviere

(My anecdotal take)

I suspect that spam traffic to the WordPress site is responsible. WordPress sites are known to attract spam if you do not configure them.

Furthermore, it is possible that spambots are configured to launch other sorts of attacks on the instance.

Have you configured the instance to keep all ports closed?
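
One way to check this suspicion is to count requests per client address in the web server's access log. A minimal sketch, assuming a log in the common Apache format where the client address is the first field; `top_clients` is a hypothetical helper name, and the log path in the usage note is an assumption that varies by distribution.

```shell
# top_clients LOGFILE [N]
# List the N client addresses (default 10) with the most requests in an access log.
top_clients() {
    log="$1"
    awk '{ print $1 }' "$log" |
        sort | uniq -c |
        sort -k1,1nr -k2,2 |
        awk '{ print $1, $2 }' |
        head -n "${2:-10}"
}
```

For example, `top_clients /var/log/httpd/access_log 5` might be the call on a Red Hat style system. A handful of unfamiliar addresses with thousands of hits would support the spam theory.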

Ketan