Diagnosing a Redis database failure (OOM-killed)

Asked by: xtyy · Asked: 11/15/2023 · Updated: 11/15/2023 · Views: 14

Q:

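How do I debug an OOM killer event? I have a Bun-Redis database with 100 WebSocket connections that store real-time data directly into the database, with a 10-day data eviction policy. The systemd unit reports: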

A process of this unit has been killed by the OOM killer.
Killing process 77506 (HeapHelper) with signal SIGKILL.
Failed with result 'oom-kill'.
Consumed 1d 22h 10min 6.378s CPU time.

~ ❯ free -h
               total        used        free      shared  buff/cache   available
Mem:            27Gi       7.4Gi        18Gi        11Mi       1.1Gi        19Gi
127.0.0.1:6379> MEMORY STATS
 1) "peak.allocated"
 2) (integer) 4547468208
 3) "total.allocated"
 4) (integer) 4547247816
 5) "startup.allocated"
 6) (integer) 1069176
 7) "replication.backlog"
 8) (integer) 0
 9) "clients.slaves"
10) (integer) 0
11) "clients.normal"
12) (integer) 1928
13) "cluster.links"
14) (integer) 0
15) "aof.buffer"
16) (integer) 0
17) "lua.caches"
18) (integer) 0
19) "functions.caches"
20) (integer) 184
21) "db.0"
22) 1) "overhead.hashtable.main"
    2) (integer) 241912
    3) "overhead.hashtable.expires"
    4) (integer) 88
23) "overhead.total"
24) (integer) 1313288
25) "keys.count"
26) (integer) 4408
27) "keys.bytes-per-key"
28) (integer) 1031347
29) "dataset.bytes"
30) (integer) 4545934528
31) "dataset.percentage"
32) "99.99462890625"
33) "peak.percentage"
34) "99.99514770507813"
35) "allocator.allocated"
36) (integer) 4547537240
37) "allocator.active"
38) (integer) 4547805184
39) "allocator.resident"
40) (integer) 4695977984
41) "allocator-fragmentation.ratio"
42) "1.000058889389038"
43) "allocator-fragmentation.bytes"
44) (integer) 267944
45) "allocator-rss.ratio"
46) "1.0325812101364136"
47) "allocator-rss.bytes"
48) (integer) 148172800
49) "rss-overhead.ratio"
50) "1.0022660493850708"
51) "rss-overhead.bytes"
52) (integer) 10641408
53) "fragmentation"
54) "1.0350526571273804"
55) "fragmentation.bytes"
56) (integer) 159392128
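
The raw byte counts above are easier to compare with the `free -h` output once they are converted to GiB. A minimal Python sketch, with the values copied from the reply above (the `stats` dict and the `to_gib` helper are illustrative names, not part of Redis):

```python
# Selected fields copied from the MEMORY STATS reply above (all in bytes).
stats = {
    "peak.allocated": 4547468208,
    "total.allocated": 4547247816,
    "dataset.bytes": 4545934528,
    "allocator.resident": 4695977984,
}
keys_count = 4408  # "keys.count" from the same reply

GIB = 1024 ** 3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper for this sketch)."""
    return n_bytes / GIB


for name, value in stats.items():
    print(f"{name:20s} {to_gib(value):6.2f} GiB")

# Average dataset bytes per key, in MiB. This is only an average;
# individual keys may differ widely in size.
avg_mib_per_key = stats["dataset.bytes"] / keys_count / 1024 ** 2
print(f"average per key      {avg_mib_per_key:6.2f} MiB")
```

With about 4,400 keys, this works out to roughly 1 MiB per key on average, which matches the `keys.bytes-per-key` field in the reply.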

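It looks like my peak memory allocation tops out at about 4.5 GB, and I have plenty of memory to spare. After the OOM kill, I started the Redis database again, and this is what I saw: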

5:09:46.782 * <search> Loading event starts
5:09:46.782 * Loading RDB produced by version 255.255.255
5:09:46.782 * RDB age 53411 seconds
5:09:46.782 * RDB memory usage when created 23946.19 Mb
5:09:53.947 * Done loading RDB, keys loaded: 4408, keys expired: 0.
5:09:53.947 # <search> Skip background reindex scan, redis version contains loaded event.
5:09:53.947 * <search> Loading event ends
5:09:53.947 * DB loaded from disk: 7.166 seconds
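
One figure in the log above may be worth comparing against the others: the RDB file records how much memory Redis was using when the snapshot was written. A rough Python sketch of that comparison, assuming the `Mb` in the log line means MiB and taking the 27 GiB total from the `free -h` output (the regex and variable names are illustrative):

```python
import re

# Log line copied from the startup output above.
log_line = "5:09:46.782 * RDB memory usage when created 23946.19 Mb"

# Figures taken from the free -h and MEMORY STATS output in the question.
total_ram_gib = 27.0                        # "Mem: total" from free -h
after_restart_gib = 4547247816 / 1024 ** 3  # "total.allocated" after restart

match = re.search(r"RDB memory usage when created ([\d.]+) Mb", log_line)
if match is None:
    raise ValueError("log line does not contain the RDB memory figure")

# Assumption: the logged value is in MiB (bytes / 1024 / 1024).
rdb_created_gib = float(match.group(1)) / 1024

print(f"at RDB creation : {rdb_created_gib:.1f} GiB")
print(f"after restart   : {after_restart_gib:.1f} GiB")
print(f"share of RAM    : {rdb_created_gib / total_ram_gib:.0%}")
```

If the assumption holds, Redis was using roughly 23 GiB when the snapshot was taken, which is most of the machine's 27 GiB, while the numbers captured after the restart show only about 4.2 GiB. Note that `peak.allocated` starts over when the server restarts, so it does not reflect usage before the kill.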

How can I start diagnosing this problem?

Linux Redis Bun OOM


A: No answers yet.