Improving EFS Performance

Jenade Moodley

As discussed in part 1 of this blog post, the latency that occurs with EFS is per-file-operation latency arising from the nature of the NFS protocol, specifically its handling of metadata. To improve the performance of your application, you first need to understand its workload, i.e. whether it performs mainly read or write operations. There are a few ways to determine this, which are discussed in the NFS troubleshooting blog post; however, an easy way to see what your EC2 instance is doing against your EFS file system is the mountstats command.

$ mountstats /mnt/efs
Stats for 127.0.0.1:/ mounted on /mnt/efs:
...

NFS byte counts:
  applications read 404148 bytes via read(2)
  applications wrote 1966484148 bytes via write(2)
  applications read 0 bytes via O_DIRECT read(2)
  applications wrote 0 bytes via O_DIRECT write(2)
  client read 0 bytes via NFS READ
  client wrote 1966484148 bytes via NFS WRITE

RPC statistics:
  599130 RPC requests sent, 599130 RPC replies received (0 XIDs not found)
  average backlog queue length: 0

GETATTR:
        126739 ops (21%)
        avg bytes sent per op: 313      avg bytes received per op: 240
        backlog wait: 0.005097  RTT: 1.268323   total execute time: 1.311964 (milliseconds)
SETATTR:
        122786 ops (20%)
        avg bytes sent per op: 360      avg bytes received per op: 260
        backlog wait: 0.004838  RTT: 4.192335   total execute time: 4.237853 (milliseconds)
LOOKUP:
        115717 ops (19%)
        avg bytes sent per op: 364      avg bytes received per op: 268
        backlog wait: 0.023964  RTT: 2.415324   total execute time: 2.497403 (milliseconds)
ACCESS:
        72755 ops (12%)
        avg bytes sent per op: 320      avg bytes received per op: 168
        backlog wait: 0.008604  RTT: 1.881768   total execute time: 1.959151 (milliseconds)
REMOVE:
        40000 ops (6%)
        avg bytes sent per op: 346      avg bytes received per op: 116
        backlog wait: 0.003750  RTT: 7.256800   total execute time: 7.280375 (milliseconds)
OPEN:
        30225 ops (5%)
        avg bytes sent per op: 455      avg bytes received per op: 441
        backlog wait: 0.016973  RTT: 5.637519   total execute time: 5.663623 (milliseconds)
CLOSE:
        30006 ops (5%)
        avg bytes sent per op: 336      avg bytes received per op: 176
        backlog wait: 0.010231  RTT: 0.626241   total execute time: 0.643738 (milliseconds)
WRITE:
        30002 ops (5%)
        avg bytes sent per op: 65893    avg bytes received per op: 176
        backlog wait: 0.012932  RTT: 18.157989  total execute time: 18.181355 (milliseconds)
RENAME:
        30000 ops (5%)
        avg bytes sent per op: 528      avg bytes received per op: 152
        backlog wait: 0.007867  RTT: 5.331867   total execute time: 5.347367 (milliseconds)
READDIR:
        867 ops (0%)
        avg bytes sent per op: 342      avg bytes received per op: 32059
        backlog wait: 0.003460  RTT: 6.455594   total execute time: 6.489043 (milliseconds)
OPEN_NOATTR:
        5 ops (0%)
        avg bytes sent per op: 389      avg bytes received per op: 275
        backlog wait: 0.000000  RTT: 2.800000   total execute time: 2.800000 (milliseconds)
SERVER_CAPS:
        3 ops (0%)
        avg bytes sent per op: 316      avg bytes received per op: 164
        backlog wait: 0.000000  RTT: 0.333333   total execute time: 0.333333 (milliseconds)
FSINFO:
        2 ops (0%)
        avg bytes sent per op: 312      avg bytes received per op: 152
        backlog wait: 0.000000  RTT: 0.500000   total execute time: 0.500000 (milliseconds)
LOOKUP_ROOT:
        1 ops (0%)
        avg bytes sent per op: 184      avg bytes received per op: 380
        backlog wait: 0.000000  RTT: 0.000000   total execute time: 0.000000 (milliseconds)
PATHCONF:
        1 ops (0%)
        avg bytes sent per op: 308      avg bytes received per op: 116
        backlog wait: 0.000000  RTT: 0.000000   total execute time: 0.000000 (milliseconds)
SECINFO_NO_NAME:
        1 ops (0%)
        avg bytes sent per op: 172      avg bytes received per op: 104
        backlog wait: 0.000000  RTT: 5.000000   total execute time: 6.000000 (milliseconds)

As you can see, my EFS is handling a write-heavy workload, and you can also see which RPC calls are made most often against the EFS, the majority of which are GETATTR and SETATTR. These calls retrieve and set metadata attributes, indicating that my workload is also metadata-heavy.
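If you prefer a live view rather than cumulative counters, the nfsiostat utility (shipped with the nfs-utils/nfs-common packages) can report read and write throughput and per-operation RTT for an NFS mount at a fixed interval. A minimal example, assuming the EFS is mounted at /mnt/efs:

$ nfsiostat 5 /mnt/efs

This prints updated read/write statistics for the mount every 5 seconds, which makes it easy to watch how the mix of operations changes while your application is running.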

Depending on the type of workload you have, i.e. read-intensive or write-intensive, there are different options you can configure to enhance the performance of your application on EFS.

Improving read operation performance

As we discussed, the latency experienced on EFS is due to the metadata access sequence of the NFS protocol. We cannot change how files are accessed from EFS; however, we know that the local disk is faster than EFS. Therefore, if we want to improve access time for files, we can cache them on the local disk, effectively reducing latency by taking the NFS protocol out of the equation for repeated reads. This can be done using a file system cache (FS-Cache), which was created for this very purpose. A popular file system cache we can use is cachefilesd. To show the effect of using a cache, we can run a test in which we access a file before and after enabling the cache and observe the performance improvement.

Setting up cachefilesd

We first need to install cachefilesd using the standard package manager of your distribution.

Amazon Linux, CentOS, RHEL
$ sudo yum install cachefilesd -y
Ubuntu
$ sudo apt-get update
$ sudo apt-get install cachefilesd

After installing the package, we need to start the cachefilesd daemon.

Systemd
$ sudo systemctl start cachefilesd
Upstart
$ sudo service cachefilesd start
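If you want the cache daemon to start automatically after a reboot, you can also enable it at boot time. A small addition, using the same init systems as above:

Systemd
$ sudo systemctl enable cachefilesd
Upstart
$ sudo chkconfig cachefilesd on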

We then need to instruct the NFS client to use the cache when accessing the EFS by specifying the fsc mount option when mounting the file system.

Using the EFS Mount Helper
$ sudo mount -t efs -o fsc <EFS_ID> <MOUNT_POINT>
Using the NFS client
$ sudo mount -t nfs4 -o fsc,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport <EFS_DNS>:/ <MOUNT_POINT>
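To make the cached mount persist across reboots, you can also add the fsc option to the corresponding /etc/fstab entry. A minimal sketch using the EFS mount helper, with fs-12345678 as a placeholder for your own file system ID and /mnt/efs as the mount point:

fs-12345678:/ /mnt/efs efs _netdev,fsc 0 0

After mounting, you can verify that the NFS client has caching enabled for the mount by checking the FSC column in /proc/fs/nfsfs/volumes:

$ cat /proc/fs/nfsfs/volumes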

Testing the cache

To test the cache, we first need to create a small file on the EFS which we will be accessing.

$ dd if=/dev/urandom of=/mnt/efs/cache_file bs=4k count=128

To ensure that the cache is not in effect, we can first clear the page cache, dentries, and inodes using /proc/sys/vm/drop_caches.

$ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

We can then test an uncached access to the file we initially created.

$ time dd if=/mnt/efs/cache_file of=$HOME/local_file bs=4k count=128
128+0 records in
128+0 records out
524288 bytes (524 kB) copied, 0.0176215 s, 29.8 MB/s

real    0m0.027s
user    0m0.002s
sys     0m0.000s

The first access to the file from EFS took 0.027 seconds, at a speed of 29.8 MB/s. If we access this file again, it will be served from the cache.

$ time dd if=/mnt/efs/cache_file of=$HOME/local_file bs=4k count=128
128+0 records in
128+0 records out
524288 bytes (524 kB) copied, 0.00155109 s, 338 MB/s

real    0m0.009s
user    0m0.002s
sys     0m0.000s

This time, accessing the file took 0.009 seconds at a speed of 338 MB/s, 3 times faster than the initial access. While these times may seem small, remember that the latency occurs per file operation, so the difference grows drastically for larger numbers of files in deeper directory structures.
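To confirm that data is actually landing in the on-disk cache (and not just in the kernel's page cache), you can check the size of the cache directory and, if your kernel exposes them, the FS-Cache statistics. This assumes the default cache location of /var/cache/fscache:

$ sudo du -sh /var/cache/fscache
$ cat /proc/fs/fscache/stats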

Limitations and Caveats

By design, the operating system checks the cache before accessing the EFS. This means that, should a file not be present, the cache is still searched before the request goes to the EFS. This can cause performance issues if the cache is very large and new files are accessed more frequently than older ones. It can be mitigated by configuring additional cachefilesd options such as cache culling, which discards objects from the cache that have been used less recently than others. This is explained further in the cachefilesd man pages.
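Culling is controlled by the space and file-count thresholds in /etc/cachefilesd.conf. As a rough sketch, these are the default values shipped with the package; raising the cull thresholds makes culling kick in earlier, keeping the cache smaller and lookups cheaper:

dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%
frun 10%
fcull 7%
fstop 3%

The b* values refer to the percentage of free blocks on the cache partition and the f* values to the percentage of free files (inodes): culling starts when free space falls below the cull threshold, and no new cache entries are allocated once it falls below the stop threshold. See cachefilesd.conf(5) for the exact semantics.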

Improving write operation performance

While using cachefilesd can speed up access to an existing file, it does not help much with write operations, as the file may not yet exist (in the case of new files) and the NFS protocol still needs to access the metadata to perform updates such as updating the last-modified time and the allocated block count. To work around this limitation, we can take advantage of the distributed nature of EFS.

EFS is designed to be accessed by thousands of clients simultaneously, and emphasis is placed on the aggregate throughput of data written to the file system from all clients. This means that EFS is better suited to parallel workloads; indeed, it is with parallel workloads that EFS really shines. As such, we can improve write performance by running our commands in parallel. This can be done using two methods:

  1. Use multiple clients (i.e. use multiple EC2 instances). 
  2. Use a tool such as GNU parallel or fpsync to run the commands in parallel.

The first option is the recommended one, as it may be difficult to re-architect your application to execute operations in parallel. However, regular commands that are run on multiple files, such as cp, mv, or rsync, benefit greatly when configured to operate on files in parallel. By default, these commands work serially, i.e. they perform the action on one file at a time. Performing the operation on multiple files simultaneously greatly improves performance. To demonstrate the results, as well as the ease of using a parallel utility, we can perform the test below using GNU parallel.

Installing GNU Parallel

The commands to install GNU parallel differ per distribution. For RPM-based distributions (RHEL, CentOS, Amazon Linux), it is available in the EPEL repository; for Ubuntu, it is available in the default repositories. The installation commands are as follows:

RHEL 7
$ sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ sudo yum install parallel -y
Amazon Linux, RHEL 6
$ sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
$ sudo yum install parallel -y
Amazon Linux 2
$ sudo amazon-linux-extras install epel
$ sudo yum install parallel -y
CentOS 6, CentOS 7
$ sudo yum install epel-release
$ sudo yum install parallel -y
Ubuntu
$ sudo apt-get update
$ sudo apt-get install parallel

Running commands in parallel

We first need to create a large number of small files on our local disk. In my test, I created 10000 files. For reference, I am using a standard t2.large EC2 instance running Amazon Linux 2, with an 8 GB General Purpose SSD (gp2) volume.

$ mkdir /tmp/efs; for each in $(seq 1 10000); do SUFFIX=$(mktemp -u _XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX); sudo dd if=/dev/zero of=/tmp/efs/${SUFFIX} bs=64k count=1; done

We can then copy these files to the EFS with rsync using the standard serial method, and measure the time it takes to copy the files.

$ cd /tmp/efs; time sudo find -L . -maxdepth 1 -type f -exec rsync -avR '{}' /mnt/efs/ \;
...
real    9m11.494s
user    0m34.186s
sys     0m10.303s

The results show that it took more than 9 minutes to copy 10000 files to the EFS using rsync serially. If we purge the destination directory and run rsync in parallel instead:

$ cd /tmp/efs; time find -L . -maxdepth 1 -type f | sudo parallel rsync -avR {} /mnt/efs/
...
real    5m41.274s
user    1m1.279s
sys     0m37.556s

This time the copy took just over 5 minutes, effectively cutting the copy time almost in half. By default, GNU parallel runs as many parallel jobs as there are CPU cores; as the t2.large instance has 2 cores, this explains why the time was almost halved. However, GNU parallel lets you specify the number of jobs with the --jobs flag. If we set --jobs to 0, it runs as many jobs in parallel as possible, limited by system resources rather than the number of cores.

$ cd /tmp/efs; time find -L . -maxdepth 1 -type f | sudo parallel --jobs 0 rsync -avR {} /mnt/efs/
...
real    3m25.578s
user    1m2.320s
sys     0m45.163s

Using as many jobs as possible, we reduced the copy time to about a third of what the serial method took. You can also tune GNU parallel commands to suit your needs; you can find a more detailed GNU parallel tutorial here.
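For example, you can pin the job count to a specific number or express it relative to the number of CPU cores. The values below are only illustrative; the right setting depends on your instance size and how much parallel write activity your workload can tolerate:

Run exactly 8 rsync processes at a time
$ cd /tmp/efs; time find -L . -maxdepth 1 -type f | sudo parallel --jobs 8 rsync -avR {} /mnt/efs/
Run 2 jobs per CPU core
$ cd /tmp/efs; time find -L . -maxdepth 1 -type f | sudo parallel --jobs 200% rsync -avR {} /mnt/efs/

If you would rather not compose find and parallel yourself, fpsync (mentioned earlier, shipped as part of the fpart package) wraps the same idea of splitting a copy into parallel rsync workers. A rough sketch, assuming fpsync is installed and using illustrative values for the worker count and files per batch:

$ sudo fpsync -n 8 -f 1000 /tmp/efs/ /mnt/efs/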

Conclusion

Knowing your application's usage patterns can greatly assist you in configuring your EFS. Used correctly, the suggestions above can help you get the full potential out of EFS; used incorrectly, for example implementing cachefilesd for an application that only accesses recently created data on the EFS, they can cause further performance degradation. There are additional caching options you can look at, such as PHP OPcache for PHP-based websites, or putting a CDN such as CloudFront in front of your site; however, these are website-specific caching mechanisms. You need to identify a good mix of services and caching mechanisms to improve your application's performance on EFS, and I hope this blog post gives you a good starting point for improving your experience with EFS.
