How to properly use the libdwarf information to get the local variable location

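Question:

Preface: I apologize for the lengthy setup to my question. I did it to make sure this post is self-contained and includes all the necessary information I have found.

My question relates to this great article by Mr. Eli Bendersky: https://eli.thegreenplace.net/2011/02/07/how-debuggers-work-part-3-debugging-information

I will use the following input code for my question: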

#include <stdio.h>
void do_stuff(int my_arg)
{
    int my_local = my_arg + 2;
    int i;

    for (i = 0; i < my_local; ++i)
        printf("i = %d\n", i);
}
int main()
{
    do_stuff(2);
    return 0;
}

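The code above is compiled with gcc -g tracedprog2.c -o tracedprog2

In addition, I will use the libdwarf examples shared at https://github.com/timsnyder/libdwarf-code/tree/3e75142a5d8938466e00a942c41a04f69510915d, which can be built easily with the following steps in order to reproduce my findings (this is not required; I am just sharing it in case someone is looking for it):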

cd libdwarf-code
mkdir build && cd build
cmake -DBUILD_DWARFEXAMPLE=TRUE ..
make -j4
# built binaries will be available in the directory: $HOME/libdwarf-code/build/src/bin/dwarfexample

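The question is as the title states: how do I use the information collected through libdwarf to get the location of a local variable?

As Mr. Bendersky's post says, the first thing to do is to get the DWARF information by running objdump --dwarf=info ./tracedprog2, which outputs the following (I only include the useful parts):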

<1><8a>: Abbrev Number: 5 (DW_TAG_subprogram)                                                                                                                
    <8b>   DW_AT_external    : 1                                                                                                                              
    <8b>   DW_AT_name        : (indirect string, offset: 0x29): do_stuff                                                                                      
...                                                                                                                          
    <92>   DW_AT_low_pc      : 0x1135
    <9a>   DW_AT_high_pc     : 0x43
    <a2>   DW_AT_frame_base  : 1 byte block: 9c         (DW_OP_call_frame_cfa)
    <a4>   DW_AT_GNU_all_tail_call_sites: 1
...
 <2><b3>: Abbrev Number: 7 (DW_TAG_variable)
    <b4>   DW_AT_name        : (indirect string, offset: 0x0): my_local
...
    <bb>   DW_AT_type        : <0x57>
    <bf>   DW_AT_location    : 2 byte block: 91 68      (DW_OP_fbreg: -24)

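My understanding is that, to figure out the location of a local variable, several pieces of information are needed (shown as opcodes):

Frame base from libdwarf: DW_OP_call_frame_cfa
Offset of the local variable from libdwarf: DW_OP_fbreg

Now this is where things get really tricky. After reading the DWARF standard (https://dwarfstd.org/doc/DWARF5.pdf), I found that it states: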

The DW_OP_call_frame_cfa operation pushes the value of the CFA, obtained from the Call Frame Information (see Section 6.4 on page 171)

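This is where the frame1 binary from the dwarfexample directory shared above (https://github.com/timsnyder/libdwarf-code/tree/3e75142a5d8938466e00a942c41a04f69510915d/src/bin/dwarfexample) comes in: it tries to parse this CFA information into a human-readable format.

Running ./frame1 tracedprog2 produces output like the following (the program parses the Common Information Entry (CIE) information referenced by each Frame Description Entry (FDE)). Below is the frame information for the function do_stuff, since that is the focus of this question. I found a better way to output the data: readelf -w ./tracedprog2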

00000088 000000000000001c 0000005c FDE cie=00000030 pc=0000000000001135..0000000000001178
  DW_CFA_advance_loc: 1 to 0000000000001136
  DW_CFA_def_cfa_offset: 16
  DW_CFA_offset: r6 (rbp) at cfa-16
  DW_CFA_advance_loc: 3 to 0000000000001139
  DW_CFA_def_cfa_register: r6 (rbp)
  DW_CFA_advance_loc: 62 to 0000000000001177
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop

According to the description in the DWARF5 document,

15. DW_CFA_def_cfa takes two unsigned LEB128 arguments representing a
register number and an offset. The required action is to define the
current CFA rule to use the provided register and offset.
16. DW_CFA_def_cfa_register takes a single unsigned LEB128 argument
representing a register number. The required action is to define the
current CFA rule to use the provided register (but to keep the old
offset).
17. DW_CFA_def_cfa_offset takes a single unsigned LEB128 argument
representing an offset. The required action is to define the current CFA
rule to use the provided offset (but to keep the old register).

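The important information seems to be the values of DW_CFA_def_cfa and DW_CFA_def_cfa_register, which I think may be the frame base I am looking for.

So, to get the location of the variable my_local, I think this is what needs to be done:

First, the CFA is defined as RSP + 8, as seen in DW_CFA_def_cfa. Next is DW_CFA_offset, which is cfa - 16, which makes it RSP - 8? From there, there is DW_CFA_def_cfa_offset: 16, which seems to indicate that I need to add it like so, RSP - 8 + 16, to make it RSP + 8. Then, DW_CFA_def_cfa_register: r6 (rbp) changes RSP to RBP, so now it is RBP + 8. From here, I add the DW_OP_fbreg: -24 of the my_local variable to get RBP - 0x10. However, from what I see in objdump, it is -0x14(%rbp),%eax.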

0000000000001135 <do_stuff>:
    1135:       55                      push   %rbp
    1136:       48 89 e5                mov    %rsp,%rbp
    1139:       48 83 ec 20             sub    $0x20,%rsp
    113d:       89 7d ec                mov    %edi,-0x14(%rbp)
    1140:       8b 45 ec                mov    -0x14(%rbp),%eax
    1143:       83 c0 02                add    $0x2,%eax
    1146:       89 45 f8                mov    %eax,-0x8(%rbp)

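I believe I was able to find all the information needed to calculate the location of the local variable, but it seems I am missing something somewhere. Can anyone tell me what I might have missed? Thanks in advance.

Tags: c, variables, debugging, dwarf

0 votes Jay 5/31/2023

One thing I may have forgotten to account for in the calculation is DW_CFA_advance_loc... so perhaps that is the last offset (4) needed to make it $0x14? However, even after reading the DWARF document, I am not sure what this value actually represents.

Answer:

0 votes Jay 6/2/2023 #1

So it seems I misunderstood what objdump -S ./tracedprog2 was showing me, which led to the wrong conclusion I stated above.

For example, dumping with the DWARF information shows the disassembly plus source code, as below: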

void do_stuff(int my_arg)                                                                                                                                     
{                                                                                                                                                             
    1135:       55                      push   %rbp                                                                                                           
    1136:       48 89 e5                mov    %rsp,%rbp                                                                                                      
    1139:       48 83 ec 20             sub    $0x20,%rsp                                                                                                     
    113d:       89 7d ec                mov    %edi,-0x14(%rbp)
    int my_local = my_arg + 2;                                                 
    1140:       8b 45 ec                mov    -0x14(%rbp),%eax
    1143:       83 c0 02                add    $0x2,%eax
    1146:       89 45 f8                mov    %eax,-0x8(%rbp)
    int i;                                                                     
                                                                               
    for (i = 0; i < my_local; ++i)
    1149:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
    1150:       eb 1a                   jmp    116c <do_stuff+0x37>
        printf("i = %d\n", i);                                                 
    1152:       8b 45 fc                mov    -0x4(%rbp),%eax
    1155:       89 c6                   mov    %eax,%esi
    1157:       48 8d 3d a6 0e 00 00    lea    0xea6(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
    115e:       b8 00 00 00 00          mov    $0x0,%eax
    1163:       e8 c8 fe ff ff          callq  1030 <printf@plt>
    for (i = 0; i < my_local; ++i)

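As you can see, directly below the my_local source line is

1140: 8b 45 ec mov -0x14(%rbp),%eax

which led me to believe that -0x14(%rbp) was the offset I was trying to compute. That slot actually holds my_arg (stored there by the mov %edi,-0x14(%rbp) at 113d); it is only loaded here to compute my_arg + 2.

Now I think I have a good idea of what is happening after reading a number of other sources, which took a while to find (I cite them below in case anyone wants to verify my answer).

Long story short, let me expand on the information shown above to see whether I can clarify my understanding and how I was able to find the solution: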

00000000 0000000000000014 00000000 CIE                                                                                                                        
  Version:               1                                                                                                                                    
  Augmentation:          "zR"                                                                                                                                 
  Code alignment factor: 1                                                                                                                                    
  Data alignment factor: -8                                                                                                                                   
  Return address column: 16                                                                                                                                   
  Augmentation data:     1b                                                                                                                                   
  DW_CFA_def_cfa: r7 (rsp) ofs 8                                                                                                                              
  DW_CFA_offset: r16 (rip) at cfa-8                                                                                                                           
  DW_CFA_undefined: r16 (rip) 
...
00000088 000000000000001c 0000005c FDE cie=00000030 pc=0000000000001135..0000000000001178
  DW_CFA_advance_loc: 1 to 0000000000001136
  DW_CFA_def_cfa_offset: 16
  DW_CFA_offset: r6 (rbp) at cfa-16
  DW_CFA_advance_loc: 3 to 0000000000001139
  DW_CFA_def_cfa_register: r6 (rbp)

Contents of the .debug_loc section:

    Offset   Begin            End              Expression
    00000000 0000000000001178 0000000000001179 (DW_OP_breg7 (rsp): 8)
    00000014 0000000000001179 000000000000117c (DW_OP_breg7 (rsp): 16)
    00000028 000000000000117c 000000000000118c (DW_OP_breg6 (rbp): 16)
    0000003c 000000000000118c 000000000000118d (DW_OP_breg7 (rsp): 8)
    00000050 <End of list>
    00000060 0000000000001135 0000000000001136 (DW_OP_breg7 (rsp): 8)
    00000074 0000000000001136 0000000000001139 (DW_OP_breg7 (rsp): 16)
    00000088 0000000000001139 0000000000001177 (DW_OP_breg6 (rbp): 16)
    0000009c 0000000000001177 0000000000001178 (DW_OP_breg7 (rsp): 8)
    000000b0 <End of list>

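The additional .debug_loc information above can be found by compiling with DWARF2 instead of the default DWARF5, for example with gcc's -gdwarf-2 option (source: https://blog.tartanllama.xyz/writing-a-linux-debugger-variables/).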

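OK, so first, the CFA register is initially set to rsp + 8 (source: https://lists.dwarfstd.org/pipermail/dwarf-discuss/2010-August/000915.html).

Then, when we reach the frame of do_stuff at address 0x1135, we insert a new row into the FDE table at 0x1136 (so this is what the value 1 represents). Now, because of the DW_CFA_def_cfa_offset statement, the offset for this column will be 16.

What does this mean? Instead of rsp + 8 as we saw before, from the row starting at 0x1136 onward it will be rsp + 16.

So, next, we create a new row after adding 3 to the current address (i.e. at 0x1139), and from here on we define the CFA register to be rbp. Since the offset has not changed in the meantime, all this means is that from 0x1139 onward it will be rbp + 16 instead of rsp + 16.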

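Basically, that is it: the frame base we are looking for to compute the location of the local variable my_local is rbp + 16. Looking at the contents of the .debug_loc section now, the result seems consistent with my explanation above.

Now, going back to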

 <2><b3>: Abbrev Number: 7 (DW_TAG_variable)
    <b4>   DW_AT_name        : (indirect string, offset: 0x0): my_local
...
    <bb>   DW_AT_type        : <0x57>
    <bf>   DW_AT_location    : 2 byte block: 91 68      (DW_OP_fbreg: -24)

and its DW_OP_fbreg: -24 value, you simply add it to the frame base we found, which means rbp + 16 - 24 = rbp - 8, and that is equivalent to the mov %eax, -0x8(%rbp).

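Now that I have looked at Mr. Bendersky's post again, this seems to agree with his answer, but I somehow missed it. Apparently, starting with DWARF version 5, .debug_loc does not seem to be included by default, which misled me into chasing the wrong conclusion at first.

I hope this solution is correct (it makes sense to me). If it is not, please let me know, since I am still unsure about DWARF (it is very complicated for a newbie like me).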
