AWS EFS Overview
Amazon Elastic File System (EFS) is a scalable file system for Linux-based workloads that can be used alongside other AWS cloud services or on-premises resources. AWS EFS is one of three main storage solutions that Amazon offers.
It is designed to allow thousands of Amazon EC2 instances to share parallel access to files for high aggregate throughput and IOPS, with a maximum performance of 10 GB/s and 500k IOPS. EFS is a fully managed service that you can integrate easily and control through a standard file system interface while AWS handles the deployment, patching, and maintenance of the underlying infrastructure. It includes Standard and Infrequent Access storage classes to optimize storage costs and is pay-for-use, with no need to provision storage ahead of time.
EFS can use existing security infrastructures and file access can be controlled with POSIX permissions, Amazon VPC, or AWS IAM. It meets a range of compliance requirements without the need for customization.
Use Cases
EFS is most useful in cases of:
- Big Data Analytics—due to the scalability of size and parallel shared access
- Web Serving and Content Management—due to adherence to the standardized directory structure, naming conventions and permissions
- Lift-and-Shift—due to familiarity and compatibility of structure and standard file system interface
Optimizing EFS Performance
How you get the most out of AWS EFS depends on your purposes, but the following tips apply to most situations.
#1. Monitor Your Metrics
It is difficult to know whether your configuration is optimized if you are not monitoring it, and harder still to fix a problem you cannot measure. By default, EFS sends metric data to Amazon CloudWatch every minute. You can access this data through the CloudWatch console, the CLI, or the API.
Whatever method you choose, use the data to verify the performance you are getting and analyze it to optimize your settings. In particular, you should be monitoring your burst credit balance as sudden or consistent drops can forewarn of stalled operations and reduced productivity.
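As a rough illustration of what watching the burst credit balance can look like, here is a minimal sketch that linearly extrapolates when the balance would reach zero. It assumes you have already pulled `(timestamp, balance)` samples from CloudWatch (for example, from the `BurstCreditBalance` metric in the `AWS/EFS` namespace); the fetching step is left out, and the function names and the one-hour warning threshold are hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple

# (unix timestamp in seconds, burst credit balance in bytes)
Sample = Tuple[float, float]


def seconds_until_depleted(samples: List[Sample]) -> Optional[float]:
    """Linearly extrapolate when the balance reaches zero.

    Uses only the first and last sample. Returns None when there are
    fewer than two samples, no time has elapsed, or the balance is flat
    or rising.
    """
    if len(samples) < 2:
        return None
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        return None
    rate = (b1 - b0) / elapsed  # bytes per second; negative when draining
    if rate >= 0:
        return None
    return b1 / -rate


def should_alert(samples: List[Sample], warn_within_seconds: float = 3600.0) -> bool:
    """True when the extrapolated depletion falls inside the warning window."""
    remaining = seconds_until_depleted(samples)
    return remaining is not None and remaining <= warn_within_seconds
```

A straight-line fit over two points is deliberately crude. It is enough to catch a steady drain, but a real alert would probably smooth over more samples.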
#2. Enable Asynchronous Write
Enabling asynchronous writes buffers pending write operations on your EC2 instance before they are written to EFS. This reduces latency by avoiding a round trip from client to EFS for each write operation.
#3. Maximize Your Limits
Since EFS doesn’t use instances, the only way to control I/O limits is by selecting a performance mode: either General Purpose, which gives low latency and high horizontal scalability, or Max I/O, which handles larger data transfer volumes in exchange for higher latency. The default setting is General Purpose, and this is the one Amazon recommends for most cases.
New EFS volumes start with a 0.5 MB/s transfer rate and 7.2 minutes of burst credits good for 100 MB/s, which is probably not enough for your needs. The only way to increase these numbers is to increase the amount of data stored on a volume.
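The relationship between stored data and throughput can be sketched with some simple arithmetic. The example below assumes that baseline throughput scales linearly with stored data at about 0.05 MB/s per GB, and that burst credits accrue at the baseline rate over a 24-hour day. Both are assumptions made for illustration, chosen because they reproduce the 0.5 MB/s and 7.2 minute figures for a 10 GB volume; check the current EFS documentation for the actual rates.

```python
# Assumption for illustration only, not an official figure.
ASSUMED_BASELINE_MBPS_PER_GB = 0.05
# Burst rate quoted in the text above.
BURST_RATE_MBPS = 100.0
SECONDS_PER_DAY = 86400


def baseline_mbps(stored_gb: float,
                  per_gb: float = ASSUMED_BASELINE_MBPS_PER_GB) -> float:
    """Baseline throughput in MB/s for a volume holding stored_gb of data."""
    return stored_gb * per_gb


def burst_minutes_per_day(stored_gb: float,
                          burst_mbps: float = BURST_RATE_MBPS,
                          per_gb: float = ASSUMED_BASELINE_MBPS_PER_GB) -> float:
    """Minutes of bursting that one day of accrued credits would pay for."""
    credits_mb = baseline_mbps(stored_gb, per_gb) * SECONDS_PER_DAY
    return credits_mb / burst_mbps / 60


def gb_needed_for(target_mbps: float,
                  per_gb: float = ASSUMED_BASELINE_MBPS_PER_GB) -> float:
    """Stored data (GB) needed to reach a target baseline throughput."""
    return target_mbps / per_gb
```

Under these assumptions, `gb_needed_for(5.0)` suggests that about 100 GB of stored data would be needed for a 5 MB/s baseline.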
Rather than wait until your volume grows naturally, you can force this increase by writing large or numerous dummy files to each new volume you add. These take up storage space, which you can free up later if needed, but they are the quickest way to get the boost. When you create backups of your volumes, exclude these files to avoid paying for unnecessary storage.
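One way to create such a padding file is sketched below. The function name and the chunk size are hypothetical choices for this example; the file is filled with zero bytes that are actually written out, not left as a sparse hole.

```python
import os


def write_dummy_file(path: str, size_bytes: int,
                     chunk_bytes: int = 1024 * 1024) -> int:
    """Write size_bytes of zero bytes to path in chunks; return bytes written."""
    if size_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("size_bytes must be >= 0 and chunk_bytes must be > 0")
    chunk = bytes(chunk_bytes)  # a reusable buffer of zeros
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk_bytes, size_bytes - written)
            f.write(chunk[:n])
            written += n
        # Push the data out of local buffers before returning.
        f.flush()
        os.fsync(f.fileno())
    return written
```

For instance, `write_dummy_file("/mnt/efs/.padding/pad-0001.bin", 10 * 1024**3)` would write a 10 GiB file, assuming a hypothetical mount point of `/mnt/efs` and an existing `.padding` directory. Keeping padding files under one recognizable directory makes them easier to exclude from backups and to delete later.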
#4. Avoid Running Applications
EFS is not good for application deployment or managing codebases; it is not designed for the large file read volumes or fast speed that these require. A better solution is to use tools that deploy code to local filesystems or containers. EFS is meant to act as massively shared storage, so stick to using it for media assets, exported data files, asynchronous logs, etc.
#5. Use Multiple Volumes
You can maximize EFS performance by moving your latency-sensitive operations onto additional volumes. Each volume gets a separate throughput cap and its own burst credits, which protects those operations from being slowed down by work that is not time-sensitive. Be aware that this will still not be as fast as locally mounted files, but it will provide more consistent performance than leaving the workloads mixed.
#6. Select Proper Mount Options
In general, you should not be adjusting the default mount options that AWS sets. If you are sure that you’ll get better performance if you do, and have verified this through testing and benchmarking, take care to mount using the DNS name. Doing so will ensure that your data is mounted in the same Availability Zone as your EC2 instance, and help you avoid surcharges.
EFS supports both Network File System (NFS) version 4.0 and 4.1 protocols. You’ll get better performance from NFSv4.1, so use it if possible, and consider increasing the size of your NFS client’s read/write buffers when mounting your file system to speed things up.
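To make these options concrete, the sketch below assembles (but does not run) a mount command that uses the file system's DNS name, NFSv4.1, and 1 MiB read/write buffers. The file system ID, region, and mount point in the usage note are placeholders, and the helper names are hypothetical. Treat the buffer size as a starting point to benchmark, not a recommendation for every workload.

```python
from typing import List


def efs_dns_name(file_system_id: str, region: str) -> str:
    """DNS name of an EFS file system in the given region."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def build_mount_command(file_system_id: str, region: str, mount_point: str,
                        buffer_bytes: int = 1048576) -> List[str]:
    """Build an argument list for mounting EFS over NFSv4.1.

    The list is only constructed here; running it (for example with
    subprocess.run) requires root privileges and an NFS client.
    """
    options = ",".join([
        "nfsvers=4.1",            # prefer NFSv4.1 over 4.0
        f"rsize={buffer_bytes}",  # read buffer size in bytes
        f"wsize={buffer_bytes}",  # write buffer size in bytes
    ])
    source = f"{efs_dns_name(file_system_id, region)}:/"
    return ["mount", "-t", "nfs4", "-o", options, source, mount_point]
```

Calling `build_mount_command("fs-12345678", "us-east-1", "/mnt/efs")` yields the arguments for `mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs`.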
When mounting, keep in mind that EFS file systems can be attached to thousands of EC2 instances concurrently. The more instances you mount a file system on, the higher your aggregate throughput.
#7. Optimize Your Backups
EFS does not offer snapshot capabilities, so you will need to back up your volumes using AWS Backup, through orchestration with Lambda, or via third-party integration. Whichever you choose, configure it carefully: you can easily consume your burst credits if you don’t rate-limit your copying or if you copy large amounts of data all at once.
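If you script your own copies, rate limiting can be as simple as pacing each chunk against a bytes-per-second cap. The following is a minimal sketch of that idea, not a description of how AWS Backup or any particular tool behaves; the function name, the chunk size, and the injectable `clock` and `sleep` parameters are choices made for this example.

```python
import time
from typing import BinaryIO, Callable


def rate_limited_copy(src: BinaryIO, dst: BinaryIO, max_bytes_per_sec: float,
                      chunk_bytes: int = 256 * 1024,
                      clock: Callable[[], float] = time.monotonic,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Copy src to dst, keeping average throughput at or below the cap.

    Returns the number of bytes copied.
    """
    if max_bytes_per_sec <= 0 or chunk_bytes <= 0:
        raise ValueError("max_bytes_per_sec and chunk_bytes must be positive")
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_bytes)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest moment at which `copied` bytes are allowed under the cap.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return copied
```

Passing `clock` and `sleep` as parameters keeps the pacing logic testable without real delays. In normal use the defaults apply, and you would pass two file objects opened in binary mode.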
Wrap Up
EFS is one of several services offered by AWS to meet your file storage needs. If you want to be able to make use of a host of EC2 instances or are looking to share files across a wide network, it can be a good solution for you.
If you decide that it is, make sure you are getting the most out of the services you’re purchasing by carefully configuring your set-up and monitoring its status. The optimizations covered here can get you started and help ensure your productivity and that of your team.