I have an EC2 t2.micro instance that I use as a test instance for a web app.
OS: Ubuntu 22.04.1 LTS
Once in a while, the CPU consumption of the kswapd0 process jumps to nearly 100%, which freezes the instance until it is rebooted.
Note: "/proc/sys/vm/swappiness" is at its default value of 60.
I have read that kswapd0 is the kernel thread that moves memory pages out of physical RAM to swap, so in theory it should only kick in when memory usage is high.
The mystery is that NO process seems to be going on a memory binge. I have logged resource usage with atop, and below are 3 atop snapshots, taken 1 minute apart.
Here is a summary of the snapshots; free memory drops and CPU usage climbs:
atop snapshot1: cpu 5%, free mem 242.9M
atop snapshot2: cpu 38%, free mem 242.5M
atop snapshot3: cpu 99%, free mem 65.4M
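The summary values come from the CPU and MEM lines of each snapshot. For anyone who wants to pull them out of the raw atop text, here is a minimal Python sketch. The helper names `parse_atop_line` and `to_megabytes` are my own (they are not part of atop), and I am assuming that atop's K/M/G suffixes are powers of 1024.

```python
import re


def parse_atop_line(line):
    """Split an atop system line such as 'MEM | tot 966.2M | free 242.9M |'
    into (label, {field: value}), using the '|' separators."""
    cells = [cell.strip() for cell in line.split("|")]
    label = cells[0]
    fields = {}
    for cell in cells[1:]:
        parts = cell.split()
        # Keep only simple 'name value' cells; empty padding cells are skipped.
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return label, fields


def to_megabytes(value):
    """Convert a size string like '65.4M' or '1.1G' to megabytes.
    Assumption: the suffixes K/M/G are 1024-based."""
    match = re.fullmatch(r"([\d.]+)([KMG])", value)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    factor = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}[unit]
    return number * factor


# MEM lines copied (shortened) from snapshots 2 and 3 below.
_, mem2 = parse_atop_line("MEM | tot 966.2M | free 242.5M | cache 304.1M |")
_, mem3 = parse_atop_line("MEM | tot 966.2M | free 65.4M | cache 154.7M |")

drop = to_megabytes(mem2["free"]) - to_megabytes(mem3["free"])
print(f"free memory dropped by {drop:.1f}M between snapshots 2 and 3")
```

With the values from the snapshots, this reports a drop of about 177M in free memory between snapshots 2 and 3, while the cache figure also falls from 304.1M to 154.7M.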
Here are the full snapshots, which show the other variables. In snapshots 2 and 3, the DSK, MEM and CPU lines are in red (color not shown here).
Also, the process list in each snapshot is sorted by memory usage. I would expect to see a culprit process with high memory usage, but no process is consuming more than 8%.
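As a sanity check on that 8% figure, here is a small Python sketch. It assumes that atop's MEM column is each process's resident size (RSIZE) as a share of total physical memory (the "tot" value on the MEM line); the function name `mem_percent` is mine, for illustration only.

```python
def mem_percent(rsize_mb, total_mb):
    """Resident size as a rounded percentage of total physical memory.
    Assumption: this is how atop derives its per-process MEM column."""
    return round(100 * rsize_mb / total_mb)


TOTAL_MB = 966.2  # 'tot' from the MEM line of the snapshots

# RSIZE values (in M) copied from the first snapshot's process list.
for name, rsize in [("gunicorn", 78.0), ("systemd-journa", 44.1), ("postgres", 33.5)]:
    print(f"{name}: {mem_percent(rsize, TOTAL_MB)}%")
```

Under that assumption, the largest process (gunicorn, RSIZE 78.0M) works out to 8% of 966.2M, which matches the MEM column in the snapshots.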
ATOP - ip-172-31-3-66 2022/10/18 00:00:01 ----------------- 1d15h10m16s elapsed
PRC | sys 2m27s | user 6m23s | | | #proc 148 | | #trun 12 | #tslpi 138 | | #tslpu 44 | #zombie 0 | | clones 92804 | | | #exit 13 |
CPU | sys 0% | | user 5% | irq 0% | | | idle 94% | | wait 0% | | steal 0% | guest 0% | | | curf 2.40GHz | |
CPL | avg1 0.01 | | avg5 0.10 | | avg15 0.15 | | | | csw 12835910 | | intr 5959928 | | | | numcpu 1 | |
MEM | tot 966.2M | free 242.9M | cache 302.3M | dirty 1.4M | buff 9.4M | slab 59.9M | slrec 26.0M | | | shmem 138.0M | shrss 0.0M | shswp 0.0M | | | | numnode 1 |
SWP | tot 0.0M | | free 0.0M | | swcac 0.0M | | | | | | | | vmcom 1.1G | | vmlim 483.1M | |
PAG | scan 6306350 | steal 4335e3 | | stall 0 | compact 0 | numamig 0 | | migrate 13e5 | | | | | swin 0 | swout 0 | | oomkill 0 |
PSI | cpusome 4% | memsome 0% | | memfull 0% | iosome 0% | | iofull 0% | cs 0/1/3 | | ms 0/0/0 | mf 0/0/0 | | is 0/0/0 | if 0/0/0 | | |
DSK | xvda | busy 0% | | read 219927 | write 497224 | discrd 0 | | KiB/r 33 | KiB/w 7 | KiB/d 0 | | MBr/s 0.1 | MBw/s 0.0 | avq 1.51 | | avio 0.57 ms |
DSK | xvda1 | busy 0% | | read 219453 | write 497222 | discrd 0 | | KiB/r 33 | KiB/w 7 | KiB/d 0 | | MBr/s 0.1 | MBw/s 0.0 | avq 1.51 | | avio 0.57 ms |
DSK | xvda15 | busy 0% | | read 222 | write 2 | discrd 0 | | KiB/r 42 | KiB/w 0 | KiB/d 0 | | MBr/s 0.0 | MBw/s 0.0 | avq 0.68 | | avio 1.43 ms |
DSK | xvda14 | busy 0% | | read 55 | write 0 | discrd 0 | | KiB/r 46 | KiB/w 0 | KiB/d 0 | | MBr/s 0.0 | MBw/s 0.0 | avq 0.64 | | avio 4.22 ms |
NET | transport | tcpi 1240508 | | tcpo 1316373 | udpi 148741 | udpo 154146 | | tcpao 18388 | tcppo 10357 | tcprs 4413 | | tcpie 294 | tcpor 3684 | udpnp 5399 | | udpie 0 |
NET | network | | ipi 1400059 | ipo 1361002 | | ipfrw 0 | deliv 1400e3 | | | | | | | icmpi 5402 | icmpo 54 | |
NET | lo ---- | | pcki 1152470 | pcko 1152470 | | sp 0 Mbps | si 158 Kbps | so 158 Kbps | | | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
NET | eth0 ---- | | pcki 291231 | pcko 213817 | | sp 0 Mbps | si 8 Kbps | so 14 Kbps | | | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
*** System and Process Activity since Boot *** Rawfile view
PID TID MINFLT MAJFLT VSTEXT VSLIBS VDATA VSTACK LOCKSZ VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/9
30705 - 29496 291 2.7M 10.2M 90.1M 132.0K 0.0K 241.5M 78.0M 71.4M 241.5M 78.0M 0B flask flask 8% gunicorn
30704 - 28672 137 2.7M 10.2M 87.9M 132.0K 0.0K 239.3M 75.6M 69.1M 239.3M 75.6M 0B flask flask 8% gunicorn
170 - 12586 441557 92.0K 8.8M 49.2M 132.0K 0.0K 227.3M 44.1M 34.2M 227.3M 44.1M 0B root root 5% systemd-journa
30741 - 58604 984 5.8M 16.5M 7.1M 132.0K 0.0K 215.5M 33.5M 14.7M 215.5M 33.5M 0B postgres postgres 3% postgres
30739 - 49786 259 5.8M 16.5M 7.7M 132.0K 0.0K 216.1M 32.2M 14.1M 216.1M 32.2M 0B postgres postgres 3% postgres
212 - 5069 13 80.0K 6.2M 18.0M 132.0K 282.5M 282.5M 26.7M 22.7M 282.5M 26.7M 0B root root 3% multipathd
496 - 13376 1966 18.7M 1.8M 166.2M 132.0K 0.0K 710.3M 22.8M 25.3M 710.3M 22.8M 0B root root 2% snapd
30699 - 11046 22 2.7M 5.6M 21.0M 132.0K 0.0K 39.1M 22.0M 16.7M 39.1M 22.0M 0B flask flask 2% gunicorn
658 - 197263 2185 5.8M 16.3M 2.0M 132.0K 0.0K 210.1M 15.6M 4.1M 210.1M 15.6M 0B postgres postgres 2% postgres
92773 - 430 0 63.6M 3.4M 8.7M 132.0K 0.0K 78.0M 14.5M 9.4M 78.0M 14.5M 0B flask flask 2% node
720 - 872 52 5.8M 16.3M 2.2M 132.0K 0.0K 210.3M 14.0M 4.0M 210.3M 14.0M 0B postgres postgres 1% postgres
476 - 3103 56 2.7M 8.4M 10.4M 132.0K 0.0K 32.4M 13.2M 9.6M 32.4M 13.2M 0B root root 1% networkd-dispa
638 - 2551 27 2.7M 11.0M 17.1M 132.0K 0.0K 107.5M 11.0M 8.2M 107.5M 11.0M 0B root root 1% unattended-upg
1 - 156940 747 896.0K 8.8M 19.4M
ATOP - ip-172-31-3-66 2022/10/18 00:00:01 ----------------- 1s elapsed
PRC | sys 0.03s | user 0.02s | | | #proc 149 | | #trun 12 | #tslpi 148 | | #tslpu 44 | #zombie 0 | | clones 12 | | | #exit 0 |
CPU | sys 38% | | user 62% | irq 0% | | | idle 0% | | wait 0% | | steal 0% | guest 0% | | | curf 2.40GHz | |
CPL | avg1 0.01 | | avg5 0.10 | | avg15 0.15 | | | | csw 297 | | intr 84 | | | | numcpu 1 | |
MEM | tot 966.2M | free 242.5M | cache 304.1M | dirty 1.4M | buff 9.4M | slab 60.0M | slrec 26.1M | | | shmem 138.0M | shrss 0.0M | shswp 0.0M | | | | numnode 1 |
SWP | tot 0.0M | | free 0.0M | | swcac 0.0M | | | | | | | | vmcom 1.1G | | vmlim 483.1M | |
PSI | cpusome 11% | memsome 0% | | memfull 0% | iosome 1% | | iofull 0% | cs 0/1/3 | | ms 0/0/0 | mf 0/0/0 | | is 0/0/0 | if 0/0/0 | | |
DSK | xvda | busy 31% | | read 44 | write 3 | discrd 0 | | KiB/r 41 | KiB/w 29 | KiB/d 0 | | MBr/s 1.8 | MBw/s 0.1 | avq 0.80 | | avio 0.85 ms |
DSK | xvda1 | busy 31% | | read 44 | write 3 | discrd 0 | | KiB/r 41 | KiB/w 29 | KiB/d 0 | | MBr/s 1.8 | MBw/s 0.1 | avq 0.80 | | avio 0.85 ms |
PID TID MINFLT MAJFLT VSTEXT VSLIBS VDATA VSTACK LOCKSZ VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/6
30705 - 0 0 2.7M 10.2M 90.1M 132.0K 0.0K 241.5M 78.0M 71.3M 0B 0B 0B flask flask 8% gunicorn
30704 - 0 0 2.7M 10.2M 87.9M 132.0K 0.0K 239.3M 75.6M 69.0M 0B 0B 0B flask flask 8% gunicorn
170 - 2 6 92.0K 8.8M 49.2M 132.0K 0.0K 227.3M 44.1M 34.2M 0B 0B 0B root root 5% systemd-journa
30741 - 0 0 5.8M 16.5M 7.1M 132.0K 0.0K 215.5M 33.5M 14.7M 0B 0B 0B postgres postgres 3% postgres
30739 - 0 0 5.8M 16.5M 7.7M 132.0K 0.0K 216.1M 32.2M 14.1M 0B 0B 0B postgres postgres 3% postgres
212 - 0 0 80.0K 6.2M 18.0M 132.0K 282.5M 282.5M 26.7M 22.6M 0B 0B 0B root root 3% multipathd
496 - 0 0 18.7M 1.8M 166.2M 132.0K 0.0K 710.3M 22.8M 25.3M 0B 0B 0B root root 2% snapd
30699 - 0 0 2.7M 5.6M 21.0M 132.0K 0.0K 39.1M 22.0M 16.7M 0B 0B 0B flask flask 2% gunicorn
92773 - 416 0 63.6M 3.5M 43.0M 132.0K 0.0K 304.3M 20.5M 10.7M 226.3M 5.9M 0B flask flask 2% node
92779 - 454 0 63.6M 3.4M 41.9M 132.0K 0.0K 303.2M 19.6M 9.5M 233.3M 9.7M 0B flask flask 2% node
658 - 0 0 5.8M 16.3M 2.0M 132.0K 0.0K 210.1M 15.6M 4.1M 0B 0B 0B postgres postgres 2% postgres
720 - 0 0 5.8M 16.3M 2.2M 132.0K 0.0K 210.3M 14.0M 4.0M 0B 0B 0B postgres postgres 1% postgres
476 - 0 0 2.7M 8.4M 10.4M 132.0K 0.0K 32.4M 13.2M 9.5M 0B 0B 0B root root 1% networkd-dispa
638 - 0 0 2.7M 11.0M 17.1M 132.0K 0.0K 107.5M 11.0M 8.2M 0B 0B 0B root root 1% unattended-upg
1 - 43 0 896.0K 8.8M 19.4M 1.0M 0.0K 99.8M 10.1M 6.7M 0B 28.0K 0B root root 1% systemd
92751 - 543 0 172.0K 2.5M 4.7M 948.0K 9.7M 9.7M 9.6M 6.6M 1.2M 1.2M 0B root root 1% atop
1178 - 0 0 8.9M 8.0K 103.1M 132.0K 0.0K 1.2G 8.4M 12.8M 0B 0B 0B root root 1% ssm-agent-work
424 - 0 0 308.0K 12.0M 3.5M 132.0K 0.0K 24.8M 8.2M 4.9M 0B 0B 0B systemd- systemd- 1% systemd-resolv
721 - 0 0 5.8M 16.3M 2.0M
ATOP - ip-172-31-3-66 2022/10/18 00:02:01 ----------------- 1m0s elapsed
PRC | sys 59.57s | user 0.37s | | | #proc 145 | | #trun 2 | #tslpi 115 | | #tslpu 73 | #zombie 0 | | clones 4 | | | #exit 0 |
CPU | sys 99% | | user 0% | irq 1% | | | idle 0% | | wait 0% | | steal 0% | guest 0% | | | curf 2.40GHz | |
CPL | avg1 23.95 | | avg5 8.69 | | avg15 3.26 | | | | csw 121213 | | intr 73529 | | | | numcpu 1 | |
MEM | tot 966.2M | free 65.4M | cache 154.7M | dirty 0.0M | buff 0.1M | slab 58.9M | slrec 24.8M | | | shmem 138.0M | shrss 0.0M | shswp 0.0M | | | | numnode 1 |
SWP | tot 0.0M | | free 0.0M | | swcac 0.0M | | | | | | | | vmcom 1.4G | | vmlim 483.1M | |
PAG | scan 70955e3 | steal 1948e3 | | stall 0 | compact 0 | numamig 0 | | migrate 0 | | | | | swin 0 | swout 0 | | oomkill 0 |
PSI | cpusome 25% | memsome 99% | | memfull 83% | iosome 100% | | iofull 0% | cs 27/23/12 | | ms 99/83/31 | mf 82/70/26 | | is 100/84/31 | if 0/0/0 | | |
DSK | xvda | busy 100% | | read 57692 | write 69 | discrd 0 | | KiB/r 65 | KiB/w 5 | KiB/d 0 | | MBr/s 61.0 | MBw/s 0.0 | avq 22.51 | | avio 1.04 ms |
DSK | xvda1 | busy 100% | | read 57692 | write 69 | discrd 0 | | KiB/r 65 | KiB/w 5 | KiB/d 0 | | MBr/s 61.0 | MBw/s 0.0 | avq 22.51 | | avio 1.04 ms |
NET | transport | tcpi 0 | | tcpo 0 | udpi 8 | udpo 8 | | tcpao 0 | tcppo 0 | tcprs 0 | | tcpie 0 | tcpor 0 | udpnp 0 | | udpie 0 |
NET | network | | ipi 8 | ipo 8 | | ipfrw 0 | deliv 8 | | | | | | | icmpi 0 | icmpo 0 | |
NET | lo ---- | | pcki 8 | pcko 8 | | sp 0 Mbps | si 0 Kbps | so 0 Kbps | | | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
NET | eth0 ---- | | pcki 1 | pcko 1 | | sp 0 Mbps | si 0 Kbps | so 0 Kbps | | | coll 0 | mlti 0 | erri 0 | erro 0 | drpi 0 | drpo 0 |
PID TID MINFLT MAJFLT VSTEXT VSLIBS VDATA VSTACK LOCKSZ VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/7
30705 - 729 3985 2.7M 10.2M 90.1M 132.0K 0.0K 241.5M 74.4M 69.6M 0B 132.0K 0B flask flask 8% gunicorn
30704 - 697 4022 2.7M 10.2M 87.9M 132.0K 0.0K 239.3M 72.0M 67.2M 0B -0.4M 0B flask flask 7% gunicorn
92777 - 396 2827 2.7M 10.1M 42.1M 132.0K 0.0K 69.6M 43.8M 41.5M 0B 700.0K 0B flask flask 5% python
92780 - 642 2963 2.7M 10.1M 42.1M 132.0K 0.0K 69.6M 43.6M 41.5M 0B 772.0K 0B flask flask 5% python
92776 - 23 2432 2.7M 10.1M 42.4M 132.0K 0.0K 69.6M 43.3M 40.9M 4.3M 772.0K 0B flask flask 4% python
92786 - 622 2998 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 43.0M 41.0M 0B -0.4M 0B flask flask 4% python
92782 - 715 2977 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 42.7M 41.0M 0B -0.1M 0B flask flask 4% python
92783 - 583 2985 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 42.6M 41.0M 0B -0.3M 0B flask flask 4% python
92775 - 737 2983 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 42.6M 41.0M 0B 600.0K 0B flask flask 4% python
92767 - 675 2969 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 42.2M 40.6M 0B -0.4M 0B flask flask 4% python
92778 - 238 2779 2.7M 6.1M 42.0M 132.0K 0.0K 60.8M 42.1M 40.8M 0B -0.2M 0B flask flask 4% python
30741 - 0 0 5.8M 16.5M 7.1M 132.0K 0.0K 215.5M 27.5M 12.7M 0B 0B 0B postgres postgres 3% postgres
212 - 0 0 80.0K 6.2M 18.0M 132.0K 282.5M 282.5M 26.7M 22.6M 0B 0B 0B root root 3% multipathd
30739 - 0 0 5.8M 16.5M 7.7M 132.0K 0.0K 216.1M 26.6M 12.2M 0B 0B 0B postgres postgres 3% postgres
30699 - 563 2647 2.7M 5.6M 21.0M 132.0K 0.0K 39.1M 22.4M 16.5M 0B 512.0K 0B flask flask 2% gunicorn
496 - 2 70 18.7M 1.8M 166.2M 132.0K 0.0K 710.3M 16.5M 16.7M 0B 0B 0B root root 2% snapd
658 - 33 441 5.8M 16.3M 2.0M 132.0K 0.0K 210.1M 13.0M 3.8M 0B 300.0K 0B postgres postgres 1% postgres
720 - 47 1995 5.8M 16.3M 2.2M 132.0K 0.0K 210.3M 12.9M 3.7M 0B -0.6M 0B postgres postgres 1% postgres