
Is CentOS using synchronous or asynchronous writes? Is there any way to check or to change this?

vpJohan

3 Answers


By default, all writes are asynchronous.

You can configure them to be synchronous at the application level with the O_DIRECT|O_SYNC open(2) flags, or at the file system level (the -o sync option of the mount command).
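
A quick way to feel the difference from the shell (a sketch; on GNU coreutils, dd's oflag values map onto the corresponding open(2) flags, and /tmp/testfile is just a placeholder path):

    # Asynchronous write through the page cache (the default)
    dd if=/dev/zero of=/tmp/testfile bs=1M count=100

    # dd opens the output with O_SYNC: each write(2) returns
    # only once the data has reached stable storage
    dd if=/dev/zero of=/tmp/testfile bs=1M count=100 oflag=sync

    # O_DIRECT|O_SYNC: additionally bypass the page cache
    dd if=/dev/zero of=/tmp/testfile bs=1M count=100 oflag=direct,sync

Comparing the timings of the first two commands gives a rough idea of what synchronous writes cost.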

jlliagre

According to Red Hat's (rather old) page 12.5. Verifying Asynchronous I/O Usage, asynchronous I/O is supported via libaio. Applications either are, or are not, linked with that library; nothing is mentioned about enabling or disabling it. The page says you can verify usage by inspecting /proc/slabinfo.
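
Following that page, the check looks like this (a sketch; reading /proc/slabinfo may require root, and kioctx/kiocb are the slab names that page points at):

    # Non-zero object counts for kioctx and kiocb indicate
    # kernel asynchronous I/O contexts in use
    grep kio /proc/slabinfo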

On my CentOS 6 machine, with 2466 files in /usr/bin, only 3 are linked with libaio (found with the ldd loop sketched after the list):

  • btreplay
  • qemu-img
  • qemu-io
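
A sketch of that loop (it assumes the candidates are dynamically linked ELF binaries; ldd errors on anything else are discarded):

    # List binaries in /usr/bin that are dynamically linked against libaio
    for f in /usr/bin/*; do
        ldd "$f" 2>/dev/null | grep -q libaio && echo "$f"
    done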

There are programs which use this feature, but not many. Some people confuse this with the buffer cache.

Further reading:

I/O operations in UNIX and Linux systems typically go through the file system cache. Although this doesn't represent a problem in itself, this extra processing does require resources. Bypassing the file system cache reduces CPU requirements, and frees up the file system cache for other non-database file operations. Operations against raw devices automatically bypass the file system cache.

Thomas Dickey
  • I think the OP is referring to the fact that in Linux, writes are not all immediately flushed to the underlying device; they are marked as dirty pages which are later flushed to disk (by pdflush) via the IO scheduler, which itself may introduce another delay (unless the noop scheduler is used). I am not sure what the default page-flushing values are on CentOS 6, but there could be a 30 second delay. See http://www.makelinux.net/books/lkd2/ch15lev1sec4 – Otheus Apr 22 '16 at 12:14

As jlliagre mentioned, you can force O_SYNC at file-open time, or at mount time via the -o sync flag, per device (you can remount most mounted filesystems with mount -o remount,sync <mtpt>). Alternatively, you can tell the system to schedule a flush immediately every time it does a write (that is, every time it "dirties a page").
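
For example (a sketch; /data is a hypothetical mountpoint):

    # Switch an already-mounted filesystem to synchronous writes
    mount -o remount,sync /data

    # Confirm the sync option took effect
    grep ' /data ' /proc/mounts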

When sync mode is not enabled, the complicated writeback algorithms come into play. The writeback algorithm is designed to limit IO operations: it assumes the user prefers the system to flush to disk only occasionally, either after a deadline or once a threshold of "dirty" pages has accumulated. To do the flushing, it simply assigns some work to the pdflush kernel thread(s) and wakes them up.
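
You can watch this pool of dirty pages grow and shrink as writeback runs (a quick observational sketch):

    # The Dirty and Writeback lines show pages waiting to be
    # flushed and pages currently being written out
    watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'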

There are a few system variables you can manipulate via sysctl (or /proc/sys/) to control writeback's behavior. You are primarily interested in:

vm.dirty_background_ratio     = 0

which essentially forces background writeback to start as soon as any page is dirtied; and/or

vm.dirty_writeback_centisecs  = 1

which ensures the writeback timer fires every 1/100th of a second. You might be tempted to set the latter to 0, but that would actually disable the timer. You might also be tempted to set vm.dirty_ratio to 0, but at least in 2.6.25 this is lower-capped at 5% and will not help you here.
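
A sketch of applying these at runtime (the changes take effect immediately but do not survive a reboot; to persist them, add the same keys to /etc/sysctl.conf):

    # Start background writeback as soon as any page is dirty
    sysctl -w vm.dirty_background_ratio=0

    # Wake the writeback timer every centisecond (0 would disable it)
    sysctl -w vm.dirty_writeback_centisecs=1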

Caveat #1: The writeback algorithm employs rate-limiting behavior. I'm not completely sure of the algorithm, as it's complicated, but if many pages are to be flushed to disk, pdflush will be invoked on a fixed-size batch and then scheduled again for the next second. I think that if another page is dirtied after the first batch has been handed to pdflush, another batch will be scheduled immediately, without waiting for the 1-second timer.

Caveat #2: It still won't be completely synchronous, because pdflush hands its data to the block-level IO scheduler, which pushes the write operation into a queue. But I'm not sure this is any different from what you get with sync mode.

PS: Don't forget to look at and upvote this beautiful answer.

Otheus