Score:1

Is Linux or something else eating my RAM? Memory usage doesn't add up


I've read quite a few topics on RAM usage here on serverfault and on linuxatemyram.com, but none seem to provide insight into what's happening on my machine.

    top - 17:42:31 up 8 days, 10:23,  3 users,  load average: 1.16, 1.14, 1.19
Tasks: 344 total,   1 running, 343 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.3 us,  2.4 sy,  0.0 ni, 83.4 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
MiB Mem :  15888.2 total,    600.2 free,  14782.2 used,    505.9 buff/cache
MiB Swap:   8192.0 total,   6647.9 free,   1544.1 used.    673.3 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
3959357 howie     20   0 9290724   3.6g 166112 S  44.2  23.2 817:31.05 java
1479873 howie     20   0   20.7g   4.8g  12140 S  14.6  31.1 881:40.81 java
   1513 root      20   0       0      0      0 S   4.7   0.0 395:04.70 napi/eth%d-385
   1516 root      20   0       0      0      0 S   0.3   0.0  48:38.33 napi/eth%d-386
   2548 unifi     20   0 7867536 526756   4416 S   0.3   3.2  36:21.41 java
   2713 root      20   0  493312   2216   1312 S   0.3   0.0   3:25.22 X
   3285 sddm      20   0 1326628  14908   4932 S   0.3   0.1  10:37.51 .sddm-greeter-w
1239489 root      20   0       0      0      0 I   0.3   0.0   0:00.35 kworker/2:2-events
1332415 howie     20   0   11232   2844   2152 S   0.3   0.0   0:01.09 top
      1 root      20   0  168780   5732   3132 S   0.0   0.0   7:48.99 systemd

My top processes together use about 55-60% of memory. With 16 GB installed (15.5 GiB usable) I would expect about 6 GB to be free or available, yet top and free report only about 541 MB-1 GB free. (The top output was captured a few minutes later...)

              total        used        free      shared  buff/cache   available
Mem:          15888       14477         915          34         495         983
Swap:          8191        1544        6647

Does anybody have suggestions on how to find out whether something is eating my RAM?

cat /proc/meminfo
MemTotal:       16269540 kB
MemFree:          469556 kB
MemAvailable:     681988 kB
Buffers:               0 kB
Cached:           579676 kB
SwapCached:       188676 kB
Active:          7583844 kB
Inactive:        2554720 kB
Active(anon):    7248276 kB
Inactive(anon):  2346652 kB
Active(file):     335568 kB
Inactive(file):   208068 kB
Unevictable:       20576 kB
Mlocked:               0 kB
SwapTotal:       8388604 kB
SwapFree:        6809504 kB
Dirty:               364 kB
Writeback:             0 kB
AnonPages:       9559824 kB
Mapped:           419488 kB
Shmem:             36040 kB
KReclaimable:      79620 kB
Slab:             770456 kB
SReclaimable:      79620 kB
SUnreclaim:       690836 kB
KernelStack:       16696 kB
PageTables:        62508 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    16523372 kB
Committed_AS:   14439856 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      306496 kB
VmallocChunk:          0 kB
Percpu:             2512 kB
AnonHugePages:      2048 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:    12745260 kB
DirectMap2M:     3907584 kB

dmesg shows the oom-killer at work (just one of multiple entries shown...)

dmesg | grep oom-killer
[174151.082274] dockerd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0

Thanks

https://www.linuxatemyram.com/
Howard Ching Chung: I think there might be something else going on. I'm confused why memory used is so high while the top processes combined don't even come close to the total memory in use.
Please, read it. "Linux is borrowing unused memory for disk caching." "If your applications want more memory, they just take back a chunk that the disk cache borrowed."
Your `dockerd invoked oom-killer` is a red herring, too; that means a container hit its memory limit, not the host's. https://docs.docker.com/config/containers/resource_constraints/#limit-a-containers-access-to-memory
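
(A quick, hedged way to confirm this from the kernel log on a reasonably recent kernel; the OOM report names the constraint and the cgroup:)

    # constraint=CONSTRAINT_MEMCG / "Memory cgroup out of memory" -> a container limit was hit
    # constraint=CONSTRAINT_NONE  / "Out of memory:"              -> the whole host ran out
    dmesg -T | grep -E 'oom-kill:constraint|Memory cgroup out of memory|Out of memory:'
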
Benyamin Limanto: @HowardChingChung Better check the memory usage with ps_mem: https://github.com/pixelb/ps_mem
Howard Ching Chung: Thanks both. Seems like Java uses a lot of shared memory. I would have expected, after reading linuxatemyram.com, that this memory would fall under 'available' when using free -m.
Score:0

meminfo

Yes, your meminfo reports 9.1 GB AnonPages of 15.5 GB total.

Small amounts of Shmem and file pages. This is consistent with an application doing big private allocations, and indeed you are running some Java. Note that Java does not tend to directly allocate shared memory for JVM purposes. Contrast this with databases, which often have shared memory segments and of course do lots of file I/O.
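
A quick, hedged way to see where the anonymous memory sits per process, assuming a kernel new enough to expose /proc/<pid>/smaps_rollup (roughly 4.14+):

    # resident anonymous memory per process, largest first (root to read all processes)
    sudo awk '/^Anonymous:/ {print FILENAME, $2, $3}' /proc/[0-9]*/smaps_rollup |
        sort -k2 -nr | head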

No, you do not get all of the remainder to use. MemAvailable is the most useful estimate of this, and yours is relatively low at 0.7 GB, or 4% of MemTotal. Upstream's notes on what this field is:

MemAvailable

An estimate of how much memory is available for starting new applications, without swapping. Calculated from MemFree, SReclaimable, the size of the file LRU lists, and the low watermarks in each zone. The estimate takes into account that the system needs some page cache to function well, and that not all reclaimable slab will be reclaimable, due to items being in use. The impact of those factors will vary from system to system.

So: actually free memory, plus a guess at easy-to-reclaim pages from file cache and kernel objects. Easy stuff. New memory allocations beyond this size may trigger terribly slow direct reclaim, or anger the oom killer.
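
Related: if the kernel was built with PSI (pressure stall information, roughly 4.20+), you can see whether allocations are already paying for reclaim:

    # the "some"/"full" averages are the share of time tasks stalled waiting on memory
    cat /proc/pressure/memory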

Some very difficult to reclaim items from meminfo are SUnreclaim, KernelStack, and PageTables, totaling maybe 0.8 GB. Not unreasonable; I tend to assume a non-trivially sized host needs at least a GB or two for the kernel.

Still leaves 5 GB or so not easily bucketed. I assume it was in use at some point, things changed like processes terminating, and the kernel has yet to finish the work of reclaiming it as free.
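
For reference, a rough accounting of the posted meminfo (hedged: fields overlap a little, e.g. Shmem is counted inside Cached, and some kernel or driver allocations are not listed at all):

    MemFree         469,556 kB
    AnonPages     9,559,824 kB
    Cached          579,676 kB   (includes Shmem 36,040 kB)
    SwapCached      188,676 kB
    Slab            770,456 kB   (SUnreclaim 690,836 kB)
    KernelStack      16,696 kB
    PageTables       62,508 kB
    -----------------------------
    accounted    11,647,392 kB  ~ 11.1 GiB
    MemTotal     16,269,540 kB  ~ 15.5 GiB
    unaccounted   4,622,148 kB  ~  4.4 GiB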

Capacity planning estimates

Do some rough, back-of-the-napkin estimates of how much memory this application can use, based on your experience with it and approximately how big its working set is.

Java's memory use and the knobs to limit it are well documented. Here a Red Hat developer blog takes a stab at it. Heap is most of it, but there is a bit more: JVM memory = Heap memory + Metaspace + CodeCache + (ThreadStackSize * Number of Threads) + DirectByteBuffers + Jvm-native

Also note the suggestion to round up. "production heap size should be at least 25% to 30% higher than the tested maximum to allow room for overhead."
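
As a sketch only, these are the standard HotSpot flags corresponding to the terms in that formula; the sizes and "app.jar" below are placeholders, not recommendations:

    # cap each bucket explicitly (values are examples)
    java -Xms4g -Xmx4g \
         -Xss1m \
         -XX:MaxMetaspaceSize=256m \
         -XX:ReservedCodeCacheSize=240m \
         -XX:MaxDirectMemorySize=512m \
         -jar app.jar

    # to measure rather than guess, Native Memory Tracking reports the same buckets
    java -XX:NativeMemoryTracking=summary -jar app.jar
    jcmd <pid> VM.native_memory summary    # <pid> of the running JVM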

Containers

I find it odd that dockerd and other processes are the ones to hit the oom killer, and apparently not the big java processes. Admittedly, the oom killer is guessing based on arbitrary heuristics.

Containers, via cgroups, can set memory use limits and tuning on different groups. Cgroups also enable accurate reporting of memory use per group; a good tool on systemd systems is systemd-cgtop -m

Review all memory-related configuration for containers; for Docker, see the resource constraints documentation. (Although at least one thing on that Docker page is incorrect: swappiness is not a percentage.)

Reliably not running low on memory host-wide may mean setting per-container memory limits. If they contain, say, some Java threads, set the maximum to more than the estimate from capacity planning.
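
A hedged example with Docker's standard flags; the 6g figure and the names are placeholders, to be replaced with your capacity-planning estimate plus headroom:

    # new container
    docker run -d --name some-java-app --memory=6g --memory-swap=6g some-java-image
    # or adjust a running one
    docker update --memory=6g --memory-swap=6g some-java-app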

Guessing here, but maybe --oom-kill-disable is set on a java container. Under memory pressure, with the big consumer protected, the oom killer might remove other system processes on the host instead. In some ways I would prefer it take out the big java process, as a hint to the administrator that JVM memory might need to be compared to available memory.
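
That is easy to check; the field below is part of Docker's inspect output:

    docker inspect --format '{{.Name}} OomKillDisable={{.HostConfig.OomKillDisable}}' $(docker ps -q)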

Overall, this host could use a bit more memory. It could make do with what it has, but that would require more capacity planning and a closer look at what is running and what limits its resource use.
