One of the important things in a production Kafka cluster is the page cache.
Here is a good explanation of what the page cache is:
PageCache is a typical read/write cache. The operating system uses the free physical memory to cache files. This cache is called PageCache. When an application program writes a file, the operating system writes data to the PageCache first. After the data is successfully written to the PageCache, the writing is complete for user code.
The operating system then asynchronously updates the data to the file on disk. When an application program is reading a file, the operating system first looks for the data in PageCache. If the data is found, the operating system directly returns the data. If the data cannot be found (a cache miss, which for memory-mapped files shows up as a page fault), the operating system reads the data from the disk file into PageCache and returns the data to the application program.
After data is written to PageCache, it is not written to disk at the same time. There is a delay in the process. The operating system guarantees that it will synchronize the data to disk even if the application quits unexpectedly. However, if the server suddenly loses power, any data that has not yet been flushed is lost.
The read and write cache design is inherently unreliable and sacrifices data durability for performance. Of course, the application program can invoke system calls such as sync or fsync to force the operating system to immediately synchronize the cached data to the disk file. However, the synchronization is very slow, and the benefit of the cache is lost.
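To make the difference between a buffered write and a forced flush concrete, here is a minimal sketch using dd. The scratch file from mktemp is only for illustration, and the 16 MiB size is an arbitrary choice. The first dd returns as soon as the data is in the page cache; the second uses conv=fsync so that dd calls fsync() on the output file before it exits.

```shell
#!/bin/sh
# Scratch file for the demo (hypothetical; use a path on the disk you care about).
f=$(mktemp)

# Buffered write: returns once the data sits in the page cache as dirty pages.
dd if=/dev/zero of="$f" bs=1M count=16 2>/dev/null

# Durable write: conv=fsync makes dd fsync() the output file before exiting,
# so the data has been handed to the disk when the command returns.
dd if=/dev/zero of="$f" bs=1M count=16 conv=fsync 2>/dev/null

# Both writes produce the same file; only the durability guarantee differs.
size=$(wc -c < "$f")
echo "$size"

rm -f "$f"
```

Timing the two dd commands on a busy disk (for example with time) usually shows the cost of the forced flush.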
The Linux kernel counters and thresholds that are relevant to the page cache are:
cat /proc/vmstat | egrep "dirty|writeback"
nr_dirty 50376
nr_writeback 4673
nr_writeback_temp 0
nr_dirty_threshold 1746633
nr_dirty_background_threshold 1982726
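The values in /proc/vmstat are page counts, not settings. As far as I understand, the settings that drive them are the vm.dirty_* sysctls under /proc/sys/vm. The sketch below converts a page count to MiB and prints those tunables; the helper name pages_to_mib is made up for this example, and the 4096-byte page size in the usage note is an assumption about a typical x86_64 host.

```shell
#!/bin/sh
# Convert a page count (as reported in /proc/vmstat) to MiB.
# $1 = number of pages, $2 = page size in bytes (defaults to this host's).
pages_to_mib() {
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( $1 * $page_size / 1024 / 1024 ))
}

# Current amount of dirty data, read from the live counter.
if [ -r /proc/vmstat ]; then
    dirty=$(awk '$1 == "nr_dirty" { print $2 }' /proc/vmstat)
    if [ -n "$dirty" ]; then
        echo "dirty now: $(pages_to_mib "$dirty") MiB"
    fi
fi

# The writeback tunables themselves (these are the adjustable parameters).
for p in dirty_background_ratio dirty_ratio \
         dirty_background_bytes dirty_bytes \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    if [ -r "/proc/sys/vm/$p" ]; then
        echo "vm.$p = $(cat "/proc/sys/vm/$p")"
    fi
done
```

For example, pages_to_mib 50376 4096 prints 196, so the nr_dirty value above would be roughly 196 MiB of dirty data if the page size is 4 KiB.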
I am wondering what the best way is to tune the page cache kernel parameters,
or where I can find an article that explains the best practices for setting them.