Looking for advice on how to diagnose high system/kernel space load.
Setup
- Docker CE on CentOS 7 (no SELinux)
- Containerized Python based AI/ML workload.
- Python multi-processing is in use with ((n/2)-2) pool workers
- n => cpu/vcpu cores (1 hyper-thread)
General Issue
- On Bare metal container from our docker image runs as expected.
- On GCP container based on same image runs slow, 3-4 times slow.
- AI Pipelines that originally took 2 days now taking 4/5 days.
Specific Issue
- GCP top -> high system/kernel space load (30-50).
- Bare-metal(BM) top -> low system load (1-10).
- BM Individual Processes at most 100-120 %CPU.
- GCP individual processes 300-500 %CPU.
- BM Load Average 11.85, 32.87, 19.43
- GCP LA 43.82, 32.87, 19.43
What we Tried
- Set full privileged=true on docker run
- Doubled the vCPU on GCP (32->64)
- Used C2 (Compute Intensive) machine type.
Since I'm having no clue on how to diagnose this top-down thus also tried perf to see what is happening in the kernel/system space, following is the output from both systems.
GCP VM 'perf trace'
Summary of events:
python (2551), 6580977 events, 3.9%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
sched_yield 3232286 50396.443 0.002 0.016 68.851 0.69%
futex 58139 23715.636 0.001 0.408 21989.789 92.73%
brk 1643 312.000 0.002 0.190 19.195 13.42%
close 5 35.710 3.630 7.142 18.245 39.07%
openat 4 12.764 2.710 3.191 3.767 7.16%
write 165 9.681 0.003 0.059 6.496 66.95%
ioctl 85 0.605 0.002 0.007 0.028 9.58%
lseek 182 0.460 0.002 0.003 0.004 1.70%
read 10 0.104 0.003 0.010 0.019 15.84%
fstat 4 0.030 0.006 0.007 0.008 6.21%
python (2847), 10240661 events, 6.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2909 269175.495 0.001 92.532 21907.964 13.70%
sched_yield 5119771 63387.192 0.002 0.012 98.500 0.60%
ython (2848), 10952416 events, 6.4%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2944 269333.572 0.001 91.486 21908.135 13.70%
sched_yield 5475608 62191.735 0.002 0.011 54.373 0.56%
python (2849), 10095381 events, 5.9%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2869 269495.562 0.001 93.934 21907.785 13.69%
sched_yield 5046875 63404.111 0.002 0.013 69.027 0.53%
python (2850), 10871629 events, 6.4%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3379 269541.688 0.001 79.770 21908.129 13.71%
sched_yield 5434655 61881.631 0.002 0.011 92.192 0.66%
python (2851), 10471715 events, 6.2%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3267 269665.818 0.001 82.542 21908.306 13.70%
sched_yield 5234762 62440.899 0.002 0.012 70.764 0.59%
python (2852), 11048544 events, 6.5%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3075 269845.352 0.001 87.755 21908.402 13.69%
sched_yield 5523384 61269.561 0.002 0.011 77.207 0.58%
python (2853), 10802913 events, 6.4%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3411 269995.415 0.001 79.154 21907.717 13.70%
sched_yield 5400121 61586.390 0.000 0.011 63.498 0.63%
python (2854), 10580227 events, 6.2%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3237 270029.557 0.001 83.420 21907.737 13.69%
sched_yield 5289038 62372.649 0.002 0.012 50.329 0.56%
python (2855), 11269016 events, 6.6%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3309 270088.861 0.001 81.623 21908.143 13.69%
sched_yield 5633482 61010.474 0.002 0.011 59.756 0.54%
python (2856), 11194619 events, 6.6%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3355 270348.851 0.001 80.581 21917.988 13.68%
sched_yield 5596230 61122.634 0.002 0.011 52.892 0.60%
python (2857), 11236749 events, 6.6%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2559 273140.199 0.001 106.737 21916.745 13.53%
sched_yield 5618107 61171.202 0.002 0.011 74.198 0.59%
python (2858), 10328444 events, 6.1%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2759 270424.625 0.001 98.015 21916.130 13.66%
sched_yield 5163708 62490.561 0.002 0.012 95.956 0.65%
python (2859), 11507002 events, 6.8%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3065 270481.725 0.001 88.249 21910.849 13.67%
sched_yield 5752618 60412.759 0.002 0.011 96.043 0.69%
python (2860), 11013249 events, 6.5%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3254 270691.944 0.001 83.187 21912.965 13.67%
sched_yield 5505249 60965.211 0.002 0.011 69.026 0.62%
python (2861), 11793518 events, 6.9%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 2851 270700.999 0.001 94.949 21911.187 13.66%
sched_yield 5896062 60165.571 0.002 0.010 69.030 0.69%
python (2862), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (2863), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (2864), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (2865), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
TOP (GCP VM)
top - 11:09:17 up 3 min, 1 user, load average: 35.10, 12.57, 4.53
Tasks: 337 total, 15 running, 322 sleeping, 0 stopped, 0 zombie
%Cpu(s): 52.0 us, 43.1 sy, 0.0 ni, 4.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
GiB Mem : 28.1 total, 24.3 free, 3.3 used, 0.6 buff/cache
GiB Swap: 0.0 total, 0.0 free, 0.0 used. 24.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2501 root 20 0 2955128 169656 7612 R 380.5 0.6 3:24.55 python
2502 root 20 0 2959472 176016 7612 R 307.6 0.6 3:17.58 python
2495 root 20 0 2959088 176400 7608 R 288.4 0.6 3:35.93 python
2496 root 20 0 2933612 151452 7600 R 262.9 0.5 3:27.08 python
2494 root 20 0 2933356 153248 7604 R 253.3 0.5 3:22.47 python
2489 root 20 0 2949156 166916 7652 R 250.7 0.6 3:13.78 python
2500 root 20 0 2941704 153928 7632 R 225.5 0.5 3:24.54 python
2499 root 20 0 2950132 164576 7620 R 210.6 0.6 3:14.79 python
2493 root 20 0 2936304 151568 7608 R 177.2 0.5 3:28.18 python
2490 root 20 0 2949248 169748 7596 R 170.9 0.6 3:16.57 python
2497 root 20 0 2932684 156128 7600 R 167.9 0.5 3:13.12 python
2491 root 20 0 2942860 165400 7608 R 149.0 0.6 3:17.05 python
2492 root 20 0 2947632 163008 7604 R 117.2 0.6 3:20.01 python
2498 root 20 0 2935784 153348 7600 R 76.2 0.5 3:14.54 python
2943 root 20 0 162376 2560 1588 R 0.7 0.0 0:00.08 top
1 root 20 0 191028 4024 2592 S 0.0 0.0 0:01.68 systemd
Bare Metal 'perf trace'
Summary of events:
python (113694), 88758 events, 2.7%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3174 13767.708 0.001 4.338 13745.177 99.84%
sched_yield 39353 130.937 0.002 0.003 3.972 3.91%
brk 1132 91.005 0.002 0.080 4.163 15.54%
write 101 1.928 0.002 0.019 0.673 34.59%
lseek 110 0.310 0.001 0.003 0.032 12.49%
openat 2 0.234 0.112 0.117 0.122 4.49%
ioctl 36 0.162 0.002 0.005 0.010 7.15%
read 6 0.033 0.003 0.005 0.008 16.09%
fstat 2 0.008 0.004 0.004 0.004 2.01%
close 3 0.007 0.002 0.002 0.003 5.55%
python (114304), 3210496 events, 97.3%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 3640 97379.595 0.000 26.753 13930.078 20.68%
sched_yield 1599811 3995.576 0.001 0.002 15.957 0.91%
python (114305), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114306), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114307), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114308), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114309), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114310), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114311), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114312), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114313), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114314), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114315), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114316), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114317), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114318), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114319), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114320), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114321), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114322), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114323), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114324), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114325), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114326), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114327), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114328), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114329), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114330), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114331), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114332), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114333), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114334), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114335), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114336), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114337), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114338), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114339), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114340), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
python (114341), 1 events, 0.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- --------- ------
futex 1 0.000 0.000 0.000 0.000 0.00%
Apparently too many events in GCP, specifically futex
and sched_yield
, both clearly taking lot of time in the system space.
Thanks