There are three OS-level changes you can put in place to tweak the performance of your EAE node beyond the built-in settings provided through the documentation.
This document focuses on Jive's best practices for tuning EAE for higher performance. Filesystem tuning has been omitted: EAE nodes are assumed to be running on NFS mounts, and in general block I/O is not the limiting factor in EAE stability.
- vm.max_map_count - default is 200000, set in /etc/sysctl.conf at initial system install
- fs.file-max - default is auto-tuned by the OS
- nofile - soft 100000 / hard 200000 for all users, set in /etc/security/limits.conf
Note that additional actions need to be taken, and processes restarted, for updated values to take effect: run "sysctl -p" to reload the new settings from /etc/sysctl.conf, then restart the EAE processes.
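Once "sysctl -p" has been run as root and you have logged back in (so the limits files are re-read by PAM), the active values can be confirmed without root:

```shell
# Confirm the live kernel and per-session values after a reload + re-login
cat /proc/sys/vm/max_map_count    # active max_map_count
cat /proc/sys/fs/file-max         # active system-wide open-file max
ulimit -Sn                        # soft nofile for this session
ulimit -Hn                        # hard nofile for this session
```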
Details - Settings
vm.max_map_count - the number of memory maps a single process can have open at once. It doesn't restrict the size of those mapped areas, just the number. The system will accept any arbitrary number for this value, even if you can't physically allocate that much memory on the system. Assuming the default setting, and assuming you're mapping the smallest amount per area (a single 4KB memory page), that gives a single process the ability to map 200000 * 4096 bytes of memory, which works out to ~800MB. If you have 4GB of memory on your EAE node then this is conservative.
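The same worst-case figure can be computed on a live system; this sketch reads the current setting and assumes the standard 4KB page size:

```shell
# Memory a single process could map if every map were a single 4KB page:
# max_map_count * page size, reported in MB
echo "$(( $(cat /proc/sys/vm/max_map_count) * 4096 / 1024 / 1024 )) MB"
```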
You can see how many maps a particular process has open by running a line count:
# wc -l /proc/<pid>/maps
Where <pid> is the PID of the EAE service on the eae node.
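A minimal wrapper around that line count; the PID defaults to the current shell, and on an EAE node you would pass the EAE service PID instead (the "eae" process name in the usage comment is an assumption -- substitute your actual service name):

```shell
#!/bin/sh
# Count the memory maps a process currently holds.
# Usage: countmaps.sh [pid]   e.g. countmaps.sh "$(pgrep -o eae)"
pid=${1:-$$}
wc -l < "/proc/$pid/maps"
```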
fs.file-max - the absolute maximum # of open files the system can have at once. On a 2.6 series kernel this is dynamically set at boot to roughly 10% of total memory size. So for a 4GB system, that would be ~400K. You can see the current values by running the following command:
# cat /proc/sys/fs/file-nr
Where the first number is the # of allocated file handles, the second you can ignore for this purpose, and the third is the system-wide max # of open files. You can manually set the max to any arbitrary number, although it is recommended to keep it in bounds so that you do not set a value that exceeds what the configured memory can support.
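The three fields split cleanly in the shell, which is handy for monitoring scripts; the variable names here are just illustrative:

```shell
# /proc/sys/fs/file-nr: allocated handles, a field to ignore here, system max
read -r allocated ignored max < /proc/sys/fs/file-nr
echo "allocated=$allocated max=$max"
```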
nofile - the number of files any one user can have open at once across all logins + processes. This is typically set in /etc/security/limits.d/<file>, which will override anything in /etc/security/limits.conf. This value cannot exceed 1M binary (1048576), or you will have issues with the system. This typically manifests as being unable to log in to the machine by any method, because PAM cannot parse values larger than that during its limit checks and things start to fail. The "hard" limit is typically set to 2 x the "soft" limit, but they can be identical. Recommended default is 100K/200K for soft and hard limits, respectively.
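The recommended 100K/200K defaults would look like this in /etc/security/limits.conf (the `*` wildcard applies to all users, per the "all users" recommendation above):

```
# /etc/security/limits.conf -- recommended EAE defaults
*    soft    nofile    100000
*    hard    nofile    200000
```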
Things to avoid
- Setting fs.file-max manually - let the system determine it; it uses 10% of memory as a baseline for a reason
- Setting nofile larger than 1M binary, or larger than fs.file-max
- Setting vm.max_map_count to a value that would let a single process map more than (total system memory - 2GB)
General best practices
- hard nofile should always be < fs.file-max. Leaving *at least* a 2K gap so the system can still open files when fully loaded is wise
- soft nofile should always be <= hard nofile
- vm.max_map_count should leave at least 2GB of free system memory even if the process maxed out the maps
- If you're increasing nofile to 500K+, or vm.max_map_count to 1M+, monitor the system to ensure you don't need to add more CPU resources
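The first two relationships above can be checked mechanically on a host; this is a sketch, and it skips the check when the hard limit reports "unlimited":

```shell
#!/bin/sh
# Sanity-check nofile against fs.file-max per the best practices above
file_max=$(cat /proc/sys/fs/file-max)
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
if [ "$soft" != "unlimited" ] && [ "$hard" != "unlimited" ]; then
    [ "$soft" -le "$hard" ] || \
        echo "WARN: soft nofile ($soft) exceeds hard nofile ($hard)"
    [ $((hard + 2048)) -le "$file_max" ] || \
        echo "WARN: hard nofile ($hard) leaves < 2K gap below fs.file-max ($file_max)"
fi
```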
Note: values are the maximum (decimal, not binary) recommended
Total system memory | vm.max_map_count | nofile
4GB                 | 500K             | 400K
6GB                 | 1M               | 600K
8GB                 | 1.5M             | 800K
10GB                | 2M               | 1M
For anything larger than 10GB, you can increase vm.max_map_count further if necessary, but do not raise the nofile settings past 1M decimal.
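The sizing values above can be mapped from installed memory; a sketch only, which rounds up to the next tier when a host falls between sizes and caps nofile at 1M decimal for anything above 10GB:

```shell
#!/bin/sh
# Suggest vm.max_map_count / nofile ceilings from installed memory (decimal)
mem_kb=$(awk '/MemTotal/ {print $2; exit}' /proc/meminfo)
mem_gb=$(( (mem_kb + 1048575) / 1048576 ))   # round up to whole GB
if   [ "$mem_gb" -le 4 ]; then mmc=500000;  nof=400000
elif [ "$mem_gb" -le 6 ]; then mmc=1000000; nof=600000
elif [ "$mem_gb" -le 8 ]; then mmc=1500000; nof=800000
else                           mmc=2000000; nof=1000000
fi
echo "vm.max_map_count=$mmc nofile=$nof"
```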