My Server Info
AWS Ubuntu 18.04
MongoDB 4.2.17
4 Cores
16 GB RAM
This server runs only MongoDB, and I have allocated 10 GB of memory to the WiredTiger cache:
wiredTiger:
  engineConfig:
    cacheSizeGB: 10
But mongod gets killed by the kernel OOM killer every few days.
Syslog:
Oct 19 18:25:16 kernel: [8693043.690043] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mongod.service,task=mongod,pid=26154,uid=111
Oct 19 18:25:16 kernel: [8693043.690370] Out of memory: Killed process 26154 (mongod) total-vm:17824108kB, anon-rss:15602768kB, file-rss:0kB, shmem-rss:0kB, UID:111 pgtables:32288kB oom_score_adj:0
Oct 19 18:25:16 kernel: [8693044.284593] oom_reaper: reaped process 26154 (mongod), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
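To put the kernel numbers next to the cache setting, here is a minimal Python sketch that parses the "Out of memory" line above and converts the kB fields to GiB. The helper names (parse_oom_fields, kb_to_gib) are made up for illustration, and comparing against cacheSizeGB as if it were GiB is an assumption on my part.

```python
import re

# The "Out of memory" line copied from the syslog excerpt above.
OOM_LINE = (
    "Out of memory: Killed process 26154 (mongod) total-vm:17824108kB, "
    "anon-rss:15602768kB, file-rss:0kB, shmem-rss:0kB, UID:111 "
    "pgtables:32288kB oom_score_adj:0"
)

# cacheSizeGB from the mongod config; treated as GiB here (assumption).
CACHE_SIZE_GIB = 10


def kb_to_gib(kb: int) -> float:
    """Convert a kernel-reported kB value (1 kB = 1024 bytes) to GiB."""
    return kb / (1024 * 1024)


def parse_oom_fields(line: str) -> dict:
    """Return every 'name:<digits>kB' field of the OOM line as {name: kB}."""
    return {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}


fields = parse_oom_fields(OOM_LINE)
anon_rss_gib = kb_to_gib(fields["anon-rss"])
excess_gib = anon_rss_gib - CACHE_SIZE_GIB

print(f"mongod anon-rss when killed: {anon_rss_gib:.2f} GiB")
print(f"above the configured cache:  {excess_gib:.2f} GiB")
```

So mongod was holding roughly 14.9 GiB of anonymous memory when it was killed, almost 5 GiB more than the 10 GB cache limit, on a machine with 16 GB in total.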
In the MongoDB log I found several slow operations, including "serverStatus was very slow":
2022-10-19T17:00:00.117+0000 I INDEX [conn41] index build: done building index _id_ on ns game.small_discard_stream_20221020
2022-10-19T17:00:00.140+0000 I INDEX [conn41] index build: starting on game.small_discard_stream_20221020 properties: { v: 2, key: { game_id: 1 }, name: "game_id_1", ns: "game.small_discard_stream_20221020", background: true } using method: Hybrid
2022-10-19T17:00:00.140+0000 I INDEX [conn41] build may temporarily use up to 200 megabytes of RAM
2022-10-19T17:00:00.142+0000 I INDEX [conn41] index build: collection scan done. scanned 1 total records in 0 seconds
2022-10-19T17:00:00.143+0000 I INDEX [conn41] index build: inserted 1 keys from external sorter into index in 0 seconds
2022-10-19T17:00:00.145+0000 I INDEX [conn41] index build: done building index game_id_1 on ns game.small_discard_stream_20221020
2022-10-19T17:06:02.695+0000 I COMMAND [conn5068] command admin.$cmd command: isMaster { ismaster: 1, $clusterTime: { clusterTime: Timestamp(1666199156, 26), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "admin", $readPreference: { mode: "primary" } } numYields:0 reslen:696 locks:{} protocol:op_msg 448ms
2022-10-19T17:06:02.695+0000 I COMMAND [conn41] command game.small_discard_stream_20221020 command: listIndexes { listIndexes: "small_discard_stream_20221020", cursor: {}, lsid: { id: UUID("a6c6fd81-d9f3-400e-b7ab-41fa83ececd7") }, $clusterTime: { clusterTime: Timestamp(1666199161, 39), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "game", $readPreference: { mode: "primaryPreferred" } } numYields:0 reslen:454 locks:{ ReplicationStateTransition: { acquireCount: { w: 1 } }, Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } }, Mutex: { acquireCount: { r: 1 } } } storage:{} protocol:op_msg 494ms
2022-10-19T17:06:02.695+0000 I COMMAND [conn5066] command admin.$cmd command: isMaster { ismaster: 1, $clusterTime: { clusterTime: Timestamp(1666199157, 51), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "admin", $readPreference: { mode: "primary" } } numYields:0 reslen:696 locks:{} protocol:op_msg 448ms
2022-10-19T17:06:15.571+0000 I COMMAND [ftdc] serverStatus was very slow: { after basic: 1, after asserts: 1, after connections: 1, after electionMetrics: 3, after extra_info: 3, after flowControl: 3, after freeMonitoring: 3, after globalLock: 3, after locks: 3, after logicalSessionRecordCache: 3, after network: 3, after opLatencies: 3, after opReadConcernCounters: 3, after opcounters: 3, after opcountersRepl: 3, after oplogTruncation: 9, after repl: 107, after scramCache: 121, after security: 121, after storageEngine: 160, after tcmalloc: 193, after trafficRecording: 193, after transactions: 206, after transportSecurity: 215, after twoPhaseCommitCoordinator: 219, after wiredTiger: 682, at end: 3549 }
2022-10-19T17:09:14.535+0000 I COMMAND [conn7423] command admin.$cmd command: isMaster { ismaster: 1, $clusterTime: { clusterTime: Timestamp(1666199158, 15), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "admin", $readPreference: { mode: "primary" } } numYields:0 reslen:696 locks:{} protocol:op_msg 159574ms
After that the machine becomes unresponsive: neither SSH nor ping can reach it until the server is restarted. In fact, this server does not hold much data or handle many queries.
How can I prevent mongod from running out of memory? Does anyone have any ideas?
Thanks!