Answering my own question as I found a resolution.
The OOM kills were happening even when the free stats looked fine: on a host with 256G of RAM only about 140G was used and around 100G still showed up as free.
[root@serverxx ~]# free -g
total used free shared buff/cache available
Mem: 251 140 108 0 2 108
Swap: 19 6 13
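For anyone hitting the same symptom: free can look healthy while the kernel is already heavily overcommitted, because %commit tracks memory that processes have been promised, not what is resident right now. A quick sanity check, assuming a standard /proc/meminfo, is to compare Committed_AS (what sar reports as kbcommit) against CommitLimit:

grep -E 'CommitLimit|Committed_AS' /proc/meminfo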
The OOM kills were triggered by the high %commit seen in the sar stats, where the kernel starts targeting processes with a high memory footprint to free memory up.
To avoid OOM kills of the guest instances with higher memory footprints, I set the following:
vm.oom_kill_allocating_task=1
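A minimal sketch of how this sysctl can be applied and persisted (the drop-in file name here is just an example, any name under /etc/sysctl.d works):

sysctl -w vm.oom_kill_allocating_task=1
echo 'vm.oom_kill_allocating_task = 1' > /etc/sysctl.d/99-oom.conf
sysctl -p /etc/sysctl.d/99-oom.conf

With this set to 1, the kernel kills the task that triggered the failing allocation instead of scanning for the process with the highest OOM score, so a large qemu-kvm guest is less likely to be picked simply because of its footprint.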
When I ran sar -r, the %commit was far higher than the system could actually satisfy, and from ps I traced it to the cinder-backup container, which kolla-ansible deployments create by default. I had never configured the Cinder backup service; it was just running, and that unconfigured container was gradually eating up all the memory over time, as can be seen in the VSZ column of the ps output.
ps -eo args,comm,pid,ppid,rss,vsz --sort vsz
The VSZ for cinder-backup is extremely high:
COMMAND COMMAND PID PPID RSS VSZ
/usr/libexec/qemu-kvm -name qemu-kvm 1916998 47324 8094744 13747664
/var/lib/kolla/venv/bin/pyt cinder-backup 43689 43544 170999912 870274784
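Since the backup service wasn't actually in use here, the fix was simply to stop the container. Roughly, assuming the usual kolla container name (check docker ps first, the name can differ between releases):

docker ps | grep cinder_backup
docker stop cinder_backup

To keep kolla-ansible from recreating it on the next deploy/reconfigure, the service can also be disabled in /etc/kolla/globals.yml, e.g. enable_cinder_backup: "no" (verify the exact variable for your release).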
Here are the sar stats showing %commit coming back to normal after the backup container was stopped; %commit dropped from 1083.46 to 14.21 after the changes.
02:00:37 PM kbmemfree kbavail kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
03:00:37 PM 48843576 49998184 82890508 62.92 9576 5949348 1427280428 1083.46 75646888 2797388 324
03:10:37 PM 48829248 49991284 82904836 62.93 9576 5956544 1427343664 1083.50 75653556 2804592 116
03:20:22 PM 120198612 121445516 11535472 8.76 9576 6042892 18733688 14.22 4887688 2854704 80
03:30:37 PM 120189464 121444176 11544620 8.76 9576 6050200 18725820 14.21 4887752 2862248 88
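To keep an eye on %commit going forward, the historical data can be pulled back out of sysstat's daily files, e.g. on a RHEL-style layout (path and file naming may differ on other distributions):

sar -r -f /var/log/sa/sa$(date +%d)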