We have a Hadoop cluster with 265 Linux RHEL machines.
Of these 265 machines, 230 are data nodes with the HDFS filesystem.
Each data node has 128G of memory, and we run many Spark applications on these machines.
Last month we added additional Spark applications, so the processes now take more memory on the data-node machines.
We noticed that cache memory is a very important part, and when more processes are running on the machines, the right conclusion is to add more RAM.
Since we can't upgrade the memory to 256G in the next 5-6 months, we are thinking about how to improve the performance of the RHEL machines, and the memory cache in particular, as much as possible.
From our experience, the memory cache is very important for application stability.
One option is to clear the RAM cache and buffers as follows.
1. Clear PageCache only.
# sync; echo 1 > /proc/sys/vm/drop_caches
2. Clear dentries and inodes.
# sync; echo 2 > /proc/sys/vm/drop_caches
3. Clear PageCache, dentries and inodes.
# sync; echo 3 > /proc/sys/vm/drop_caches
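Before choosing any of the three options, it may help to check how much memory is actually held as cache on a data node. The following is a minimal sketch that reads /proc/meminfo; the function names meminfo_kb and cache_summary are our own (hypothetical), and it assumes the kernel reports the MemAvailable field (a missing field is treated as 0).

```shell
#!/bin/bash
# Print the value (in kB) of one field from a meminfo-style file.
# Usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    local field="$1"
    local file="${2:-/proc/meminfo}"
    awk -v f="${field}:" '$1 == f { print $2; exit }' "$file"
}

# Print total memory, cache + buffers, and available memory on one line.
# Usage: cache_summary [FILE]
cache_summary() {
    local file="${1:-/proc/meminfo}"
    local total cached buffers avail
    total=$(meminfo_kb MemTotal "$file")
    cached=$(meminfo_kb Cached "$file")
    buffers=$(meminfo_kb Buffers "$file")
    avail=$(meminfo_kb MemAvailable "$file")
    echo "total_kb=${total:-0} cache_kb=$(( ${cached:-0} + ${buffers:-0} )) available_kb=${avail:-0}"
}

# Report on the local machine when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    cache_summary
fi
```

Running this before and after one of the drop_caches commands shows how much cache was released, and how quickly it fills up again once the Spark jobs resume reading from disk.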
and run them from cron as follows (from https://www.wissenschaft.com.ng/blog/how-to-clear-ram-memory-cache-buffer-and-swap-space-on-linux/):
#!/bin/bash
# Note: we are using "echo 3", but it is not recommended in production; use "echo 1" instead.
# Flush dirty pages to disk first, then write to drop_caches (must run as root).
sync; echo 3 > /proc/sys/vm/drop_caches
Set execute permission on the clearcache.sh file.
# chmod 755 clearcache.sh
Now you may call the script whenever you need to clear the RAM cache.
Now set a cron job to clear the RAM cache every day at 2am. Open crontab for editing.
# crontab -e
Append the below line, save and exit to run it at 2am daily.
0 2 * * * /path/to/clearcache.sh
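To judge whether the nightly run changes anything, the script could log the Cached value before and after the drop. This is only a sketch under our own assumptions: the function name drop_caches_logged and its optional file arguments are hypothetical and not part of the linked article, and the default level is 1 (page cache only), following the note in the script above.

```shell
#!/bin/bash
# Drop caches and print one log line with the Cached value before and after.
# Usage: drop_caches_logged LEVEL [DROP_FILE] [MEMINFO_FILE]
#   LEVEL        1, 2 or 3 (same meaning as for /proc/sys/vm/drop_caches)
#   DROP_FILE    defaults to /proc/sys/vm/drop_caches (writing needs root)
#   MEMINFO_FILE defaults to /proc/meminfo
drop_caches_logged() {
    local level="${1:-1}"
    local drop_file="${2:-/proc/sys/vm/drop_caches}"
    local meminfo="${3:-/proc/meminfo}"
    local before after

    # Reject anything other than the three documented levels.
    case "$level" in
        1|2|3) ;;
        *) echo "invalid level: $level" >&2; return 1 ;;
    esac

    before=$(awk '$1 == "Cached:" { print $2; exit }' "$meminfo")
    sync    # write dirty pages to disk so more of the cache can be freed
    echo "$level" > "$drop_file" || return 1
    after=$(awk '$1 == "Cached:" { print $2; exit }' "$meminfo")

    echo "$(date '+%F %T') level=$level cached_before_kb=$before cached_after_kb=$after"
}
```

Calling drop_caches_logged 1 as the last line of clearcache.sh, and redirecting the output in the crontab entry to a log file, would keep one line per night for comparison.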
But since we are talking about production data-node machines, I am not sure that the above settings are safe, or whether they give any relief until we can increase the memory from 128G to 256G.
Can we get your ideas about what I wrote,
and whether "Clear RAM Memory Cache" is the right temporary solution until the memory upgrade?