I am running some very basic time measurements on S3 reads and writes. The problem is that I don't want the results to be affected by system I/O; I want to benchmark from memory. A friend suggested using /dev/null as a pipe, but I have a folder of 1000 files, about 1 GB in total.

My bash command currently looks like this:

    time aws s3 cp folder s3://mybucket/folder

What would you suggest for timing only the write from memory?

Many thanks

Stat.Enthus

1 Answer

Copy the directory to a ramdisk, upload (and test) from there.

What ramdisk? There are a few options:

  1. Maybe there is one in your OS already and its size is big enough. Check the output of df (more conveniently: df -h, if supported) for filesystems of the type tmpfs. It may be /dev/shm. I expect /dev/shm to be world-writable, so if it's a ramdisk and it's not too small then you can use it.

  2. There may be a separate ramdisk strictly for your user. See "What is this folder /run/user/1000?" On my Debian or Kubuntu, df -h "$XDG_RUNTIME_DIR" lets me confirm it's tmpfs and learn its size and mountpoint.

  3. This answer gives an example of how to create a tmpfs on demand (16 GiB here; adjust the size to your needs and resources):

    mount -o size=16G -t tmpfs none /mnt/tmpfs
    

    You need root access to do this.
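The first two options can be sketched as a small bash script. This is only an illustration under some assumptions: GNU stat is available, the helper names (fs_type, is_tmpfs, pick_ramdisk, bench_upload) are made up here, and --recursive is added because the source is a directory. It does not check whether the ramdisk is large enough.

```shell
#!/usr/bin/env bash
# Sketch: stage a directory on an existing tmpfs, then time the S3 upload from there.
# All function names below are hypothetical helpers, not part of any tool.

# Print the filesystem type of the mount holding path $1 (GNU stat).
fs_type() {
    stat -f -c %T -- "$1" 2>/dev/null
}

# Succeed only if $1 exists and lives on tmpfs.
is_tmpfs() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}

# Print the first writable tmpfs directory among the usual candidates.
pick_ramdisk() {
    local dir
    for dir in /dev/shm "${XDG_RUNTIME_DIR:-}"; do
        if [ -n "$dir" ] && [ -w "$dir" ] && is_tmpfs "$dir"; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Copy SRC to the ramdisk, time the upload to DEST, then remove the copy.
bench_upload() {
    local src=$1 dest=$2 ramdisk staging status
    ramdisk=$(pick_ramdisk) || { echo "no writable tmpfs found" >&2; return 1; }
    staging=$(mktemp -d "$ramdisk/s3bench.XXXXXX") || return 1
    cp -r -- "$src" "$staging/" || { rm -rf -- "$staging"; return 1; }
    # Only this command is timed; the copy into RAM above is not.
    time aws s3 cp --recursive "$staging/$(basename -- "$src")" "$dest"
    status=$?
    rm -rf -- "$staging"
    return "$status"
}
```

With the functions loaded, bench_upload folder s3://mybucket/folder would run the timed upload from the ramdisk copy.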

There is this question: What sets the size of tmpfs? What happens when its full? You may find it interesting.
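Because writes to a full tmpfs fail with "No space left on device", it may be worth checking that the folder fits before copying it. A minimal sketch, assuming POSIX du and df; the function name fits_on is made up for illustration:

```shell
# Succeed if directory $1 needs less space than is free on the filesystem holding $2.
fits_on() {
    local src=$1 target=$2 need avail
    # Size of the source tree in KiB.
    need=$(du -sk -- "$src" 2>/dev/null | cut -f1)
    # Free space on the target filesystem in KiB (4th column of POSIX df output).
    avail=$(df -Pk -- "$target" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$need" ] && [ -n "$avail" ] && [ "$need" -lt "$avail" ]
}
```

For example, fits_on folder /dev/shm && cp -r folder /dev/shm/ copies the folder only when the check passes.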


As an alternative, you can use vmtouch to read all the files in your directory and map them into virtual memory. If I were you, I would prefer a ramdisk. Still, if for any reason you cannot use a ramdisk, it's good to know what vmtouch can do.

As a regular user you can use vmtouch -t:

-t
Touch virtual memory pages. Reads a byte from each page of the file. If the page is not resident in memory, a page fault will be generated and the page will be read from disk into the file system's memory cache.

-t may not be enough:

Note: Although each page is guaranteed to have been brought into memory, the page might be evicted from memory by the time the vmtouch command completes.

As root you can use vmtouch -l:

-l
Lock pages into physical memory. This option works the same as -t except it calls mlock(2) on all the memory mappings and doesn't close the descriptors when finished. At the end of the crawl, if successful, vmtouch will block indefinitely. The files will be locked in physical memory until the vmtouch process is killed.
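For the regular-user case, the vmtouch route might look like the sketch below. It assumes vmtouch is installed; warm_and_time is a hypothetical helper name, and --recursive is added because the source is a directory. Running vmtouch without options prints a residency report, which helps judge whether pages were evicted between the warm-up and the upload.

```shell
# Sketch: warm the page cache for SRC with vmtouch -t, then time the upload to DEST.
warm_and_time() {
    local src=$1 dest=$2
    command -v vmtouch >/dev/null || { echo "vmtouch not installed" >&2; return 1; }
    # Read every page of every file under SRC into the page cache.
    vmtouch -t "$src" || return 1
    # Report how many pages are resident right now (no guarantee they stay resident).
    vmtouch "$src"
    # Only the upload itself is timed.
    time aws s3 cp --recursive "$src" "$dest"
}
```

A call such as warm_and_time folder s3://mybucket/folder would then time the upload with the files (hopefully) still cached.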

  • Thanks! I am running on a VM. Is it possible to create this ramdisk as root, or do I need the VM admin? – Stat.Enthus Oct 26 '21 at 22:02
  • @Stat Treat the VM as if it (you) didn't know it's virtual and just try. As root you should be able to create a ramdisk. The problem is different. Obviously you need some amount of free RAM to use as a ramdisk. If the machine is virtual, its RAM is also virtual. It may happen that the VM "thinks" it has plenty of memory, but at the hypervisor level there is a shortage, and data that the VM puts in its RAM physically goes to a swapfile on the host machine. This defeats your purpose. Being the admin of the host and in charge of the hypervisor helps, as you can increase the chances that the VM really uses RAM. – Kamil Maciorowski Oct 27 '21 at 04:46