Tuesday, July 31, 2012

Cache Issues in the I/O Performance Test

Sometimes you have to benchmark I/O performance, such as file reads and writes. If this is your first time doing this kind of task, you will almost certainly notice that the read speed is very high, hundreds of MB per second, even on a slow hard disk such as a 5400 rpm mechanical one. That is the effect of file caching.
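One way to see the effect for yourself is to time the same read twice. The following is only a minimal sketch: the function name read_throughput and the 1 MiB chunk size are my own choices for illustration, not part of any benchmark tool.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024)  /* read in 1 MiB chunks (arbitrary choice) */

/* Read the whole file once and return the throughput in MB/s,
   or -1.0 if the file cannot be opened or read. */
double
read_throughput(const char *path)
{
  struct timespec t0, t1;
  long long total = 0;
  ssize_t n;
  double secs;
  char *buf;
  int fd;

  assert(path != NULL);
  if ((fd = open(path, O_RDONLY)) == -1)
    return -1.0;
  if ((buf = malloc(CHUNK_SIZE)) == NULL) {
    close(fd);
    return -1.0;
  }

  clock_gettime(CLOCK_MONOTONIC, &t0);
  while ((n = read(fd, buf, CHUNK_SIZE)) > 0)
    total += n;
  clock_gettime(CLOCK_MONOTONIC, &t1);

  free(buf);
  close(fd);
  if (n == -1)
    return -1.0;

  secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
  if (secs <= 0.0)
    secs = 1e-9;  /* guard against a zero interval on tiny files */
  return (total / 1e6) / secs;
}
```

If you call read_throughput() twice in a row on the same large file, the second call will usually report a much higher number, because the data is then served from memory rather than from the disk.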

Here I will show some techniques to overcome or minimize the cache effects.

1. Re-mount


The easiest way is to unmount and re-mount the corresponding mount point (this assumes the file system is not in use and has an entry in /etc/fstab). For example,

# umount /home
# mount /home

2. Clear Cache of the OS


On Linux, you can drop the OS caches with the following command, run as root.
# sync && echo 3 > /proc/sys/vm/drop_caches

The first part of the command writes dirty buffers to disk, and the second part tells the kernel to drop its clean caches immediately. Dirty pages cannot be dropped, which is why sync comes first.

The number written to drop_caches selects what is dropped:
1 - Free pagecache.
2 - Free dentries and inodes.
3 - Free pagecache, dentries, and inodes.
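If the benchmark itself is a C program, the same two steps can be done from code. This is a rough sketch only: drop_caches() is a hypothetical helper name, and the control file path is passed in as a parameter so the function can be tried on an ordinary file. On a real system the path would be /proc/sys/vm/drop_caches and the program must run as root.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700  /* for sync() */
#endif
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Flush dirty data, then write `level` (1, 2 or 3) to the control file.
   `path` is normally "/proc/sys/vm/drop_caches"; writing there needs root.
   Returns 0 on success, -1 on failure. */
int
drop_caches(const char *path, int level)
{
  FILE *fp;
  int ok;

  assert(path != NULL);
  if (level < 1 || level > 3)
    return -1;

  sync();  /* dirty pages cannot be dropped, so write them out first */

  if ((fp = fopen(path, "w")) == NULL)
    return -1;
  ok = (fprintf(fp, "%d\n", level) > 0);
  if (fclose(fp) != 0)
    ok = 0;
  return ok ? 0 : -1;
}
```

A benchmark would call drop_caches("/proc/sys/vm/drop_caches", 3) before each timed run and treat a return value of -1 as "the cache was not cleared", most likely because the program is not running as root.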

3. Use Direct I/O for POSIX


When you open a file with the O_DIRECT flag, reads and writes bypass the operating system's I/O buffers and therefore avoid the cache effects at the operating system level. Note that O_DIRECT itself is a Linux extension rather than part of POSIX. Most file systems that provide the POSIX-compatible API also support the O_DIRECT flag, except for some parallel distributed file systems such as PanFS.

Benchmark tools such as IOR (-B), IOzone (-I), and dd (iflag=direct or oflag=direct) have an option for this feature. In your own program, add the flag to the open() call, for example:

/* define _GNU_SOURCE before including <fcntl.h> to get O_DIRECT */
fd = open(filename, O_RDONLY | O_DIRECT);
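Opening the file is only half of the story, because direct I/O generally requires the buffer address and the transfer size to be aligned. The sketch below reads a whole file with O_DIRECT; the function name read_direct, the 4096-byte alignment and the 256 KiB block size are assumptions of mine that should suit most disks, not values required by any standard.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* O_DIRECT is a Linux extension */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGNMENT 4096           /* assumed to satisfy the device's alignment */
#define BLOCK_SIZE (256 * 1024)  /* a multiple of ALIGNMENT */

/* Read the whole file with O_DIRECT and return the number of bytes read,
   or -1 on failure, for example when the file system rejects O_DIRECT. */
long long
read_direct(const char *path)
{
  long long total = 0;
  void *buf = NULL;
  ssize_t n;
  int fd;

  assert(path != NULL);
  if ((fd = open(path, O_RDONLY | O_DIRECT)) == -1)
    return -1;

  /* The buffer must be aligned, so plain malloc() is not enough. */
  if (posix_memalign(&buf, ALIGNMENT, BLOCK_SIZE) != 0) {
    close(fd);
    return -1;
  }

  while ((n = read(fd, buf, BLOCK_SIZE)) > 0)
    total += n;

  free(buf);
  close(fd);
  return (n == -1) ? -1 : total;
}
```

Some file systems refuse O_DIRECT at open() time, so a benchmark should check for -1 instead of assuming the direct read happened.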

4. Clear Cache for a Specific File


If you want to clear the cache for specific files only, you can use the following program, clearcache.c. It asks the OS to drop the cached pages of every file given on the command line.

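/* clearcache.c - mrcuongnv */

#define _XOPEN_SOURCE 600  /* for posix_fadvise() */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Ask the kernel to drop the cached pages of one file.
   Returns 0 on success, 1 on failure. */
int
clear_file_cache(const char *filename)
{
  int fd, rs;

  printf("%s", filename);
  if ((fd = open(filename, O_RDONLY)) == -1) {
    printf(" --> %s (%d)\n", strerror(errno), errno);
    return 1;
  }

  /* offset 0 and len 0 cover the whole file.
     posix_fadvise() returns the error number directly; it does not set errno. */
  rs = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
  close(fd);
  if (rs != 0) {
    printf(" --> %s (%d)\n", strerror(rs), rs);
    return 1;
  }

  printf(" --> Cleared\n");
  return 0;
}

int
main(int argc, char *argv[])
{
  int i, rs = 0;

  if (argc < 2) {
    fprintf(stderr, "Syntax: %s FILES\n", argv[0]);
    return 1;
  }

  /* Count the files whose cache could not be cleared. */
  for (i = 1; i < argc; i++) {
    rs += clear_file_cache(argv[i]);
  }

  return rs;
}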

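The most important part of the above code is posix_fadvise() with POSIX_FADV_DONTNEED, which tells the OS that the specified data will not be accessed in the near future, so the kernel can free the pages it has cached for that file. Pages that are still dirty may not be freed, so run sync first if the file was written recently.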
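To check whether a file's cache really went away, one option on Linux is to map the file and ask mincore() which pages are resident. This goes a little beyond the program above and is only a sketch: cached_pages() is a hypothetical helper of mine, and the count is a snapshot that can change at any moment.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* for mincore() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return how many pages of the file are resident in the page cache,
   or -1 on error. */
long
cached_pages(const char *path)
{
  long page = sysconf(_SC_PAGESIZE);
  long resident = 0;
  unsigned char *vec;
  struct stat st;
  size_t npages, i;
  void *map;
  int fd;

  assert(path != NULL);
  if ((fd = open(path, O_RDONLY)) == -1)
    return -1;
  if (fstat(fd, &st) == -1) {
    close(fd);
    return -1;
  }
  if (st.st_size == 0) {
    close(fd);
    return 0;  /* nothing to map */
  }

  npages = ((size_t)st.st_size + page - 1) / page;
  map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);  /* the mapping stays valid after close() */
  if (map == MAP_FAILED)
    return -1;

  vec = malloc(npages);
  if (vec != NULL && mincore(map, (size_t)st.st_size, vec) == 0) {
    for (i = 0; i < npages; i++)
      resident += vec[i] & 1;  /* low bit set means the page is in memory */
  } else {
    resident = -1;
  }

  free(vec);
  munmap(map, (size_t)st.st_size);
  return resident;
}
```

Calling cached_pages() before and after clearcache on the same file should show the count dropping toward zero if the pages were clean.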
