
I know that a cache is used to speed up reading data from the hard drive, because reading from the hard drive is much slower than reading from RAM. So the cache miss rate is very important.

Also, I know that we have buffers (for example, there are BufferedWriter and BufferedReader in Java) for reading from and writing to the hard drive or other devices such as the keyboard. But I don't know why we need a buffer. What happens if we have no buffer? Is the buffer also there to improve reading speed? If so, what is the difference between a buffer and a cache?

Besides, I know that a cache improves reading speed, but can a cache also improve the speed of writing data from RAM to the hard drive?

Olorin
Yves

1 Answer


A buffer, from what I understand, is most useful when the rates at which a producer and a consumer optimally produce or consume data are different.

For example, a program may write 8 bytes of data to a file at a time, while for the disk being written to it may be optimal to write 4 KiB of data at a time. For significantly smaller chunks, the fixed overhead of each write may become larger than the time the disk takes to actually write the data (so that 512*T_8b >> T_4KiB, since 512 writes of 8 bytes add up to one 4 KiB chunk). So, having a buffer in between that gathers up 4 KiB chunks of data and writes each of them in one go would greatly increase performance. See, for example: Why dd takes too long?, where the simplest solution is to use a larger buffer size. (Of course, my numbers here are purely illustrative. Actual numbers suited to modern disks may be very different.)
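The 8-byte versus 4 KiB example can be tried out with a minimal Java sketch. It assumes a hypothetical `CountingSink` that stands in for the slow device and only counts how often it is called; `BufferDemo` and `writeRecords` are likewise names made up for this illustration. Only `BufferedOutputStream` is the real class from `java.io`. The sizes mirror the illustrative numbers above, not measured ones.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class BufferDemo {
    /** Hypothetical stand-in for a slow device: it only counts how often it is called. */
    static class CountingSink extends OutputStream {
        int calls = 0;   // number of write requests that reached the "device"
        long bytes = 0;  // total bytes received

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /** Writes `records` records of 8 bytes each to `out`, then flushes. */
    static void writeRecords(OutputStream out, int records) {
        byte[] record = new byte[8];
        try {
            for (int i = 0; i < records; i++) {
                out.write(record);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // No buffer: every 8-byte write goes straight to the device.
        CountingSink direct = new CountingSink();
        writeRecords(direct, 512);

        // 4 KiB buffer in between: small writes are gathered and passed on in bulk.
        CountingSink behindBuffer = new CountingSink();
        writeRecords(new BufferedOutputStream(behindBuffer, 4096), 512);

        System.out.println("device writes without buffer: " + direct.calls);
        System.out.println("device writes with buffer:    " + behindBuffer.calls);
    }
}
```

Both runs deliver the same 4096 bytes, but the unbuffered sink is called 512 times while the sink behind the buffer is called once. If each call carries a fixed overhead, that difference is where the speed-up would come from.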

A cache is orthogonal to buffering. Both caching and buffering are done to prevent unnecessary extra accesses to a slow source (disk, network, etc.). But a cache works by eliminating duplicate reads and writes of the same data by saving the results for reuse, whereas a buffer works by eliminating many small, sequential reads and writes by bunching them up. A simplistic view of what a cache does:

  • when you read the same location n times, only the first actually hits the disk, the rest come from the cache
  • when you write to the same location n times, only the last is actually written
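The two bullets above can be sketched in Java as a tiny write-back cache in front of a slow store. Everything here (`CacheDemo`, `SlowStore`, `WriteBackCache`, the block number 7) is hypothetical and made up for illustration. The sketch also assumes a write-back policy with an explicit `flush()`, no eviction and no size limit, which real caches would need.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CacheDemo {
    /** Hypothetical slow store (think: disk blocks) that counts real accesses. */
    static class SlowStore {
        final Map<Integer, String> blocks = new HashMap<>();
        int reads = 0;
        int writes = 0;

        String read(int location) {
            reads++;
            return blocks.get(location);
        }

        void write(int location, String value) {
            writes++;
            blocks.put(location, value);
        }
    }

    /** Minimal write-back cache: no eviction, no size limit (assumptions of this sketch). */
    static class WriteBackCache {
        private final SlowStore store;
        private final Map<Integer, String> cached = new HashMap<>();
        private final Set<Integer> dirty = new HashSet<>();

        WriteBackCache(SlowStore store) {
            this.store = store;
        }

        String read(int location) {
            if (!cached.containsKey(location)) {
                // Cache miss: only now is the slow store touched.
                cached.put(location, store.read(location));
            }
            return cached.get(location);
        }

        void write(int location, String value) {
            // Remember the newest value; the slow store is not touched yet.
            cached.put(location, value);
            dirty.add(location);
        }

        void flush() {
            // Write each modified location once, with its latest value.
            for (int location : dirty) {
                store.write(location, cached.get(location));
            }
            dirty.clear();
        }
    }

    public static void main(String[] args) {
        SlowStore store = new SlowStore();
        store.blocks.put(7, "old");
        WriteBackCache cache = new WriteBackCache(store);

        for (int i = 0; i < 5; i++) cache.read(7);            // 5 reads of one location
        for (int i = 0; i < 5; i++) cache.write(7, "v" + i);  // 5 writes to one location
        cache.flush();

        System.out.println("store reads:  " + store.reads);
        System.out.println("store writes: " + store.writes);
        System.out.println("final value:  " + store.blocks.get(7));
    }
}
```

With these inputs the store sees one read and one write, and ends up holding the last value written (`v4`). The other four reads and four writes never leave the cache.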
Olorin
  • Great, one more question about their difference: where is the buffer? As far as I know, a cache physically exists. I mean, a cache is a kind of hardware that we can actually see sitting between the hard drive and RAM; there are L1, L2 and L3 caches. But where do we put the buffer? Is it all in RAM? Or is it set up as virtual memory? Or is it also in the L1, L2 and L3 caches? – Yves Mar 13 '18 at 06:12
  • @Yves Those are not the only caches. Caches can also be in memory. Correspondingly, buffers can also be in hardware (the disk controller likely has a small buffer, as do, for example, many RAID controllers: https://serverfault.com/a/325250/) – Olorin Mar 13 '18 at 06:19