OOM killer kills app while there is still memory available

Asked by: Pawel Rutka Asked: 11/8/2023 Updated: 11/8/2023 Views: 50

Q:

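We have a problem where our application, OUR_APP1, is killed by the OOM killer, and we cannot work out how to debug it. Every report from the OOM killer and from top suggests there is still plenty of memory in the system. The platform is aarch64. Can you point me to what I should debug? (I can instrument the kernel if needed.)

Swap is disabled. Sometimes the kill happens right after startup and sometimes after a long run, but in the latter case I often see kswapd running. It looks as if, even when things work, the system is balancing on the edge. Here is a log: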

[ 2861.660652] systemd-journal invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=-250
[ 2861.660658] CPU: 0 PID: 159 Comm: systemd-journal Tainted: G        W  O      5.10.162-g999 #1
[ 2861.660661] Hardware name: XYZ
[ 2861.660665] Call trace:
[ 2861.660683]  dump_backtrace+0x0/0x1d8
[ 2861.660688]  show_stack+0x20/0x30
[ 2861.660693]  dump_stack+0xd0/0x12c
[ 2861.660698]  dump_header+0x50/0x1a8
[ 2861.660701]  oom_kill_process+0x9c/0x190
[ 2861.660704]  out_of_memory+0x33c/0x3cc
[ 2861.660711]  __alloc_pages_slowpath.constprop.0+0xaec/0xba0
[ 2861.660714]  __alloc_pages_nodemask+0x2a0/0x320
[ 2861.660718]  __get_free_pages+0x24/0x60
[ 2861.660724]  proc_pid_cmdline_read+0x1d4/0x450
[ 2861.660729]  vfs_read+0xb4/0x1a8
[ 2861.660733]  ksys_read+0x74/0x100
[ 2861.660736]  __arm64_sys_read+0x24/0x30
[ 2861.660745]  el0_svc_common.constprop.0+0x80/0x1d0
[ 2861.660749]  do_el0_svc+0x2c/0x98
[ 2861.660753]  el0_svc+0x20/0x30
[ 2861.660756]  el0_sync_handler+0xb0/0xb8
[ 2861.660760]  el0_sync+0x180/0x1c0
[ 2861.660763] Mem-Info:
[ 2861.660770] active_anon:9341 inactive_anon:133327 isolated_anon:0
             active_file:544 inactive_file:762 isolated_file:0
             unevictable:0 dirty:0 writeback:0
             slab_reclaimable:3745 slab_unreclaimable:6130
             mapped:54794 shmem:73108 pagetables:1103 bounce:0
             free:45704 free_pcp:314 free_cma:38416
[ 2861.660777] Node 0 active_anon:37364kB inactive_anon:533308kB active_file:2176kB inactive_file:3048kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:219176kB dirty:0kB writeback:0kB shmem:292432kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 245760kB writeback_tmp:0kB kernel_stack:2688kB all_unreclaimable? no
[ 2861.660806] DMA free:182816kB min:22528kB low:28160kB high:33792kB reserved_highatomic:12288KB active_anon:37364kB inactive_anon:533308kB active_file:2536kB inactive_file:3408kB unevictable:0kB writepending:0kB present:2097148kB managed:1060296kB mlocked:0kB pagetables:4412kB bounce:0kB free_pcp:1260kB local_pcp:672kB free_cma:153664kB
[ 2861.660816] lowmem_reserve[]: 0 0 0 0
[ 2861.660826] DMA: 1864*4kB (UMEHC) 829*8kB (UMEHC) 237*16kB (UMEHC) 126*32kB (UMEC) 42*64kB (MEC) 22*128kB (UMEC) 10*256kB (UMC) 21*512kB (UC) 27*1024kB (UC) 0*2048kB 28*4096kB (C) = 183064kB
[ 2861.660864] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 2861.660868] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB
[ 2861.660894] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2861.660897] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB
[ 2861.660900] 74443 total pagecache pages
[ 2861.660904] 0 pages in swap cache
[ 2861.660907] Swap cache stats: add 0, delete 0, find 0/0
[ 2861.660909] Free swap  = 0kB
[ 2861.660911] Total swap = 0kB
[ 2861.660914] 524287 pages RAM
[ 2861.660917] 0 pages HighMem/MovableOnly
[ 2861.660919] 259213 pages reserved
[ 2861.660921] 131072 pages cma reserved
[ 2861.660923] 0 pages hwpoisoned
[ 2861.660925] Tasks state (memory values in pages):
[ 2861.660928] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2861.660945] [    158]   999   158      921      133    40960        0             0 rpcbind
[ 2861.660949] [    159]     0   159    10467     2777    90112        0          -250 systemd-journal
[ 2861.660954] [    179]     0   179     3388      309    53248        0         -1000 systemd-udevd
[ 2861.660959] [    220]   991   220    20274      281    57344        0             0 systemd-timesyn
[ 2861.660964] [    225]     0   225    37718       35    57344        0             0 rngd
[ 2861.660967] [    230]     0   230      551       46    40960        0             0 atd
[ 2861.660971] [    234]     0   234      662      142    40960        0             0 crond
[ 2861.660976] [    238]   997   238     1125      149    45056        0          -900 dbus-daemon
[ 2861.660980] [    242]     0   242    20559      187    57344        0             0 fast_developmen
[ 2861.660984] [    249]     0   249      517      107    40960        0             0 syslogd
[ 2861.660988] [    271]   993   271     1898      269    53248        0             0 systemd-network
[ 2861.660993] [    277]     0   277     1753      237    49152        0             0 systemd-logind
[ 2861.660997] [    295]   992   295     1788      165    49152        0             0 systemd-resolve
[ 2861.661001] [    302]   995   302     1209      129    49152        0             0 avahi-daemon
[ 2861.661005] [    303]   995   303     1178       60    49152        0             0 avahi-daemon
[ 2861.661009] [    304]   996   304      784      203    45056        0             0 rpc.statd
[ 2861.661013] [    306] 64371   306      801       31    45056        0             0 ninfod
[ 2861.661017] [    307] 61563   307      476       78    40960        0             0 rdisc
[ 2861.661022] [    309]     0   309      640       23    40960        0             0 xinetd
[ 2861.661025] [    310]     0   310      714      110    45056        0             0 phc2sys
[ 2861.661030] [    311]     0   311      731      140    45056        0             0 ptp4l
[ 2861.661033] [    312]     0   312      509       75    40960        0             0 agetty
[ 2861.661038] [    313]     0   313     1133      161    45056        0             0 login
[ 2861.661041] [    316]     0   316     2148      242    49152        0             0 systemd
[ 2861.661046] [    317]     0   317     2685      418    53248        0             0 (sd-pam)
[ 2861.661050] [    322]     0   322      913      232    40960        0             0 sh
[ 2861.661053] [    329]     0   329     1766      348    45056        0             0 amsr_vector_fs_
[ 2861.661058] [    330]     0   330   127615    51832   598016        0             0 OUR_APP3
[ 2861.661062] [    331]     0   331   489733    86090  1626112        0             0 OUR_APP1
[ 2861.661065] [    332]     0   332     9268      334    69632        0             0 ABCD
[ 2861.661069] [    455]     0   455   260403    31150  1105920        0             0 OUR_APP2
[ 2861.661073] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/autosar_adaptive_em.service,task=OUR_APP1,pid=331,uid=0
[ 2861.661195] Out of memory: Killed process 331 (OUR_APP1) total-vm:1958932kB, anon-rss:138568kB, file-rss:976kB, shmem-rss:204816kB, UID:0 pgtables:1588kB oom_score_adj:0
[ 2861.680183]
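
One thing that may be worth checking in the report above is how much of the "free" figure a GFP_KERNEL request can actually use. The sketch below is only a rough estimate, under the assumption that for an ordinary unmovable allocation the kernel discounts free CMA pages and the high-atomic reserve before comparing against the watermark. The `ZONE_LINE` string is an abbreviated copy of the `DMA free:...` line from the log, and the helper names are made up for illustration.

```python
import re

# Abbreviated copy of the "DMA free:..." zone line from the report (values in kB).
ZONE_LINE = (
    "DMA free:182816kB min:22528kB low:28160kB high:33792kB "
    "reserved_highatomic:12288KB free_cma:153664kB"
)

def parse_kb_fields(line):
    """Return {field_name: value_in_kB} for every 'name:123kB' token in line."""
    return {
        name: int(value)
        for name, value in re.findall(r"(\w+):(\d+)[kK]B", line)
    }

def usable_free_kb(fields):
    """Free memory left once CMA and the high-atomic reserve are discounted.

    Assumption: the request is an unmovable GFP_KERNEL allocation, so free
    pages inside CMA areas and the high-atomic reserve are not available to it.
    """
    return fields["free"] - fields["free_cma"] - fields["reserved_highatomic"]

fields = parse_kb_fields(ZONE_LINE)
usable = usable_free_kb(fields)
print(f"usable free: {usable} kB, min watermark: {fields['min']} kB")
print("below min watermark" if usable < fields["min"] else "above min watermark")
```

With the numbers from this log the estimate is 182816 - 153664 - 12288 = 16864 kB, which is below the 22528 kB min watermark. If that assumption holds for this kernel, most of the reported free memory sits in the CMA region (131072 pages reserved) and may not help this kind of allocation.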
linux-kernel oom

Comments

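0 votes commanderdata 11/8/2023
Do you have any details on how the memory cgroup controller (memcg) is limiting memory use on the system? While a process is running, you can find its memory limits (and a lot of other useful information) under /proc/{pid}. If this is a C/C++ application and you suspect a memory leak, Valgrind is a very useful tool.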
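
As a small illustration of the /proc/{pid} suggestion, the sketch below pulls the `Vm*` and `Rss*` counters out of `/proc/<pid>/status`. The helper names and the `SAMPLE` excerpt are hypothetical and only show the expected line format. On a system without /proc the reader function simply returns an empty dict.

```python
import os
import re

def parse_status_kb(text):
    """Extract the 'Vm*' and 'Rss*' lines of /proc/<pid>/status as {name: kB}."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(Vm\w+|Rss\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields

def read_status_kb(pid):
    """Read memory counters for one process; returns {} if they cannot be read."""
    path = f"/proc/{pid}/status"
    try:
        with open(path) as handle:
            return parse_status_kb(handle.read())
    except OSError:
        return {}

# Hypothetical excerpt (made-up numbers), used only to show the format.
SAMPLE = "Name:\tdemo\nVmPeak:\t    2048 kB\nVmRSS:\t     512 kB\nRssAnon:\t     256 kB\nThreads:\t4\n"
print(parse_status_kb(SAMPLE))

# Counters for the current process; pass the PID of the application instead.
print(read_status_kb(os.getpid()))
```

Sampling these counters periodically for the application's PID would show whether its resident set grows over time, which is one way to narrow down a suspected leak before reaching for Valgrind.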

A: No answers yet