How to use malloc and free in 64-bit NASM?

Asked by: RTC222 Asked: 2/8/2018 Last edited by: Michael Petch, RTC222 Updated: 5/21/2021 Views: 3688

Q:

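In 64-bit NASM, I allocate an 8000-byte memory block with malloc() from the C library and, when I am finished with it, deallocate it by calling free().

My research has turned up a lot of conflicting information about how to do this in 64-bit NASM. Much of it is for 32-bit code, where the calling convention is different, or it is C or C++ rather than NASM.

I think I have the malloc part right, but I'm not sure about the free part. I'm posting this question because I don't want to test it and end up with a memory block that is allocated but never freed.

So my two questions are simple:
(1) Do I have this right for 64-bit NASM?
(2) Is the syntax the same for Windows and Linux?

I'm showing only the malloc and free parts of my program: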

extern malloc
extern free

push rdi

; Allocate the memory buffer
mov rdi,8000
call malloc
mov [array_pointer],rax ;array_pointer is initialized in .data

; Code that uses the buffer goes here.  

; Free the memory buffer
push rdi
call free
add rsp,8

pop rdi
ret
assembly malloc nasm x86-64 free

Comments

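3 votes Michael Petch 2/8/2018
malloc returns a pointer in RAX. You need to move RAX to RDI (or move the address stored at array_pointer to RDI) to free it, since the first parameter is passed in RDI, as with every function that conforms to the 64-bit System V ABI. You also don't need the push rdi and add rsp, 8 around call free. Doing that will mess up stack alignment.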
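3 votes Michael Petch 2/8/2018
The C library is standardized. You can find all the functions and their parameters on the cppreference site. The memory functions can be found here: en.cppreference.com/w/c/memory . Whether called from C or from assembly, free takes a single parameter (the pointer) and has no return value. malloc takes a single parameter (the size in bytes) and returns a pointer (in RAX). The index to the C library on cppreference is here: en.cppreference.com/w/c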
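3 votes Michael Petch 2/8/2018
Generally what I do (with no stack frame, and without worrying about unwinding) is subtract 40 bytes + (the number of bytes needed for local variables, rounded up to the nearest multiple of 16) from RSP at the start of the function. That means that in one instruction I realign to a 16-byte boundary, allocate all the local space, and allocate the shadow space, all in one go. The advantage is that at any point in my function I don't need to worry about alignment (or shadow space), because I have ensured that everything is allocated and the stack is aligned at the start. My local variables then start at RSP+32.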
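3 votes Michael Petch 2/8/2018
Incorrect. malloc returns a 16-byte aligned address, so instructions that require aligned access through that pointer (such as SSE) won't fault. However, you still need to ensure the stack is 16-byte aligned before calling malloc and free (and any other conforming function that uses the Windows 64-bit calling convention), because those functions may use CPU instructions that need proper stack alignment to work correctly. If a function happens not to fail with a misaligned stack, don't assume that will remain the case in the future.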
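3 votes Michael Petch 2/8/2018
I often see people say "it works without proper alignment, so it's good enough". Then their code unexpectedly starts crashing, they come looking for help wondering why, and we have to go back to saying "alignment matters, and for a reason".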
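
A minimal sketch of the prologue pattern described in the comments above, applied to the routine from the question. It assumes the Windows x64 convention, 16 bytes of local variables, and a function entered through a normal call; the names alloc_then_free and LOCALS are hypothetical and do not come from the thread.

```nasm
; Sketch only: Windows x64, no frame pointer, no unwind information.
extern malloc
extern free

LOCALS equ 16                   ; bytes of locals, already a multiple of 16 (assumed size)

section .text
global alloc_then_free          ; hypothetical name
alloc_then_free:
    sub  rsp, 40 + LOCALS       ; 32 shadow + 8 realign + locals, in one instruction

    mov  ecx, 8000              ; first argument: size in bytes
    call malloc                 ; pointer (or NULL) comes back in rax
    mov  [rsp + 32], rax        ; first local sits just above the shadow space

    ; Code that uses the buffer goes here.

    mov  rcx, [rsp + 32]        ; first argument: pointer to free
    call free                   ; free(NULL) is defined to do nothing

    add  rsp, 40 + LOCALS       ; undo the single adjustment
    ret
```

On entry RSP is 8 bytes away from a 16-byte boundary because of the pushed return address, so subtracting 56 here leaves it aligned for both calls.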

A:

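-7 votes old_timer 2/8/2018 #1

Assembly language has no standard library. So this is not really an assembly language question; it has to be: "I have a set of libraries that conform to this calling convention, or were built by compiler X at some version with such and such settings, and I want to link with and use those libraries from assembly language." First just write the code in that language, then compile with save-temps or compile to assembly, and start from that output. Or disassemble such code to discover the calling convention, and compare it with what you find when you read up on the calling convention for this target platform with this compiler.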

If it is a system call and you want to do that directly, and not a library call then you need to read up on the system call interface for this platform and operating system, no reason to assume any two are the same (Linux, BSD, Windows, etc). Nor that major versions of each are the same although they probably are...

Then write your code to conform to whichever you found.

3 votes paxdiablo 5/21/2021 #2

Let's start with Windows x64. A single integer-sized parameter (as given to both malloc and free) is passed in the rcx register and an integer return value is put into the rax register.

The basic rule is to use rcx, rdx, r8, and r9 for the first four integer parameters and the stack for any others. Non-integer parameters complicate things a little but, since there are none of those in the malloc or free calls, I won't cover it here. If you need more information, Microsoft has a good article over at X64 Calling Convention.

Hence simple code for allocating and immediately freeing a block would be something like, with AT&T syntax given in parentheses after comment, if different:

mov  rcx, 1000          ; Allocate a block (mov $1000, %rcx).
call malloc             ; Allocate, address returned in rax.

mov  rcx, rax           ; Address needed in rcx (mov %rax, %rcx).
call free               ; And free it.

Note that this example, and the one below, simply illustrate register usage; there are other things you need to consider as well, such as shadow space and alignment requirements.
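
To make the shadow-space and alignment point concrete, here is a minimal sketch of the same two calls wrapped in a function. It assumes the function is entered through a normal call, so rsp is 8 bytes off 16-byte alignment on entry. The name alloc_and_free is hypothetical.

```nasm
; Sketch only: Windows x64 calling convention.
extern malloc
extern free

section .text
global alloc_and_free           ; hypothetical name
alloc_and_free:
    sub  rsp, 40                ; 32 bytes of shadow space + 8 to realign rsp to 16

    mov  rcx, 1000              ; first argument: size in bytes
    call malloc                 ; pointer (or NULL) returned in rax

    mov  rcx, rax               ; first argument: pointer to free
    call free                   ; free(NULL) is defined to do nothing

    add  rsp, 40                ; release shadow space and padding
    ret
```

Without the sub rsp, 40 the callee would be allowed to write into the 32 bytes above its own return address, which in this function would include the return address of alloc_and_free itself.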


Linux uses a different approach (though still with registers for efficiency). It uses the System V AMD64 ABI and, for this case, you'll find rax still used for the return value but rdi used for the argument.

This ABI draws its integer register set from { rdi, rsi, rdx, rcx, r8, r9 } with any extra parameters passed on the stack.

So the code change for Linux would be fairly simple, using rdi instead of rcx:

mov  rdi, 1000          ; Allocate a block (mov $1000, %rdi).
call malloc             ; Allocate, address returned in rax.

mov  rdi, rax           ; Address needed in rdi (mov %rax, %rdi).
call free               ; And free it.
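
For a version that can actually be assembled and linked on Linux, the following is a minimal sketch of a complete program under the System V AMD64 ABI. The file name demo.asm, the build commands in the comments, and the value stored in the buffer are assumptions chosen for illustration. System V has no shadow space, but rsp must still be 16-byte aligned at each call.

```nasm
; Sketch only: System V AMD64 (Linux), linked against the C library.
; Assumed build commands:
;   nasm -f elf64 demo.asm -o demo.o
;   gcc -no-pie demo.o -o demo
extern malloc
extern free

section .text
global main
main:
    push rbx                    ; save callee-saved rbx; the push also aligns rsp to 16

    mov  edi, 8000              ; first argument: size in bytes
    call malloc                 ; pointer (or NULL) returned in rax
    test rax, rax
    jz   .fail                  ; allocation failed
    mov  rbx, rax               ; rbx survives the call to free

    mov  qword [rbx], 42        ; use the buffer (arbitrary example store)

    mov  rdi, rbx               ; first argument: pointer to free
    call free

    xor  eax, eax               ; exit status 0
    pop  rbx
    ret
.fail:
    mov  eax, 1                 ; exit status 1
    pop  rbx
    ret
```

Keeping the pointer in a callee-saved register such as rbx avoids reloading it from memory, because malloc and free are free to overwrite rax, rdi and the other caller-saved registers.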

Raymond Chen (of The Old New Thing fame) has a series on calling conventions that you may find interesting, starting here.

Comments

2 votes Peter Cordes 5/21/2021
Don't forget that Windows x64 has shadow space, i.e. functions can step on the 32 bytes above their return address. If you used this with the code in the question, it could step on your own return address, since you omitted sub rsp, 32 / add rsp, 32 before / after. (Or 40 if you remove the otherwise useless push/pop of RDI - Windows and SysV both require 16-byte stack alignment. malloc/free specifically are unlikely to fail from a misaligned stack, but in general that's an important part of calling functions.) Very likely there's a duplicate somewhere...
0 votes Peter Cordes 5/21/2021
One of Agner Fog's guides (agner.org/optimize/#manuals) is about calling conventions, and the differences between them on different OSes.