Assembly floating point math

Question:

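I'm writing a compiler that targets x86_64 assembly, and I'm trying to implement floating-point math. I quickly ran into problems, because floating point is difficult in assembly, so I tried practicing with it, but I haven't had any luck.

In this program, I want to add two floats together and compare the sum with another float, printing a message if they are equal, but nothing is printed.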

section .data
    float1: dd 3.14
    float2: dd 5.72
    cmp_float: dd 8.86
    msg: db "Is equal!", 10, 0
    msg_len equ $-msg

section .bss
    result: resd 1

section .text
    global _start

_start:
    fld dword [float1]  ; Push float1 onto the FPU stack
    fld dword [float2]  ; Push float2; ST(0) = float2, ST(1) = float1
    faddp               ; ST(1) += ST(0), then pop: the sum is now in ST(0)

    fcomp dword [cmp_float] ; Compare ST(0) with cmp_float, then pop

    fstsw ax            ; Store FPU status word in AX register
    sahf                ; Copy AH into FLAGS (C3 -> ZF, C2 -> PF, C0 -> CF)

    je .equal
    jmp .exit
.equal:
    mov rax, 1
    mov rdi, 1
    mov rsi, msg
    mov rdx, msg_len
    syscall
.exit:
    mov rax, 60
    mov rdi, 0
    syscall
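
One likely reason nothing is printed, assuming the x87 unit runs at its usual extended precision: faddp keeps the exact sum of the two single-precision inputs in ST(0), and that sum is 2^-22 (about 2.4e-7) larger than the single-precision value stored for 8.86, so fcomp reports "greater than" rather than "equal". The C sketch below reproduces the arithmetic, with a double standing in for the wider x87 register. The helper names sum_wide, sum_single, and nearly_equal are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Sum of two floats computed in double. For 3.14f and 5.72f the result is
   exact, which mimics what the x87 register holds after faddp. */
double sum_wide(float a, float b) {
    return (double)a + (double)b;
}

/* The same sum rounded to single precision, as storing it to a dword
   (or adding with addss) would do. */
float sum_single(float a, float b) {
    return (float)sum_wide(a, b);
}

/* Hypothetical helper: equality within a relative tolerance. */
bool nearly_equal(double x, double y, double rel_tol) {
    assert(rel_tol >= 0.0);
    double diff = x > y ? x - y : y - x;
    double ax = x < 0 ? -x : x;
    double ay = y < 0 ? -y : y;
    double scale = ax > ay ? ax : ay;
    return diff <= rel_tol * scale;
}
```

With these helpers, sum_wide(3.14f, 5.72f) == 8.86f is false while sum_single(3.14f, 5.72f) == 8.86f is true. Rounding the sum to 32 bits before comparing, or comparing within a tolerance, would therefore make the branch behave as intended.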

Tags: math, assembly, floating-point

section .data
    number1 dd 2.5          ; First floating-point number
    number2 dd 1.3          ; Second floating-point number
    result_add dd 0.0       ; Variable to store the addition result
    result_sub dd 0.0       ; Variable to store the subtraction result
    result_mul dd 0.0       ; Variable to store the multiplication result
    result_div dd 0.0       ; Variable to store the division result

section .text
    global _start

_start:
    ; Addition
    movss xmm0, [number1]       ; Load number1 into xmm0
    movss xmm1, [number2]       ; Load number2 into xmm1
    addss xmm0, xmm1             ; Add xmm1 to xmm0
    movss [result_add], xmm0    ; Store the result

    ; Subtraction
    movss xmm0, [number1]       ; Load number1 into xmm0
    movss xmm1, [number2]       ; Load number2 into xmm1
    subss xmm0, xmm1             ; Subtract xmm1 from xmm0
    movss [result_sub], xmm0    ; Store the result

    ; Multiplication
    movss xmm0, [number1]       ; Load number1 into xmm0
    movss xmm1, [number2]       ; Load number2 into xmm1
    mulss xmm0, xmm1             ; Multiply xmm0 by xmm1
    movss [result_mul], xmm0    ; Store the result

    ; Division
    movss xmm0, [number1]       ; Load number1 into xmm0
    movss xmm1, [number2]       ; Load number2 into xmm1
    divss xmm0, xmm1             ; Divide xmm0 by xmm1
    movss [result_div], xmm0    ; Store the result

    ; ... Rest of your code

    ; Exit the program
    mov rax, 60
    mov rdi, 0
    syscall

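You want subps and movss for scalar operations, like movss xmm0, [number1] / subss xmm0, [number2] / movss [result_sub], xmm0. Look at compiler output for working examples (How to remove "noise" from GCC/clang assembly output?). movups is a 16-byte load or store, overwriting the 12 bytes past the end of result_sub. Your code only happens to work because you write the results in the order they are laid out in memory, so your out-of-bounds writes only step on locations you were already going to write.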
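
To see why the order of the stores matters, here is a small C model of the memory layout. It is only an illustration, not the code under discussion: store16 and store4 are hypothetical stand-ins for a 16-byte movups store and a 4-byte movss store, the junk in the upper lanes is represented by -1.0f, and the three guard slots are an assumption about what follows result_div in memory.

```c
#include <assert.h>
#include <string.h>

enum { R_ADD, R_SUB, R_MUL, R_DIV, NRESULTS, NSLOTS = NRESULTS + 3 };

/* slot[0..3] model result_add..result_div; the 3 guard slots stand in for
   whatever memory follows result_div (an assumption for this demo). */
typedef struct { float slot[NSLOTS]; } Results;

/* Model of a 16-byte store (movups): lane 0 is the wanted value, the three
   upper lanes are junk (-1.0f here) that spills into the next three slots. */
void store16(Results *r, int i, float value) {
    assert(i >= 0 && i + 4 <= NSLOTS);
    float lanes[4] = { value, -1.0f, -1.0f, -1.0f };
    memcpy(&r->slot[i], lanes, sizeof lanes);
}

/* Model of a 4-byte scalar store (movss): touches only slot i. */
void store4(Results *r, int i, float value) {
    assert(i >= 0 && i < NSLOTS);
    r->slot[i] = value;
}

/* Writes four placeholder results with 16-byte stores, either in layout
   order (reversed == 0) or in the opposite order (reversed != 0). */
Results fill16(int reversed) {
    Results r = {{0}};
    const float value[NRESULTS] = { 10.0f, 20.0f, 30.0f, 40.0f };
    for (int k = 0; k < NRESULTS; k++) {
        int i = reversed ? NRESULTS - 1 - k : k;
        store16(&r, i, value[i]);
    }
    return r;
}
```

In this model, fill16(0) leaves all four results intact but clobbers the guard slots, while fill16(1) leaves only the first result intact because each later store overwrites the results above it. Using store4 for every write avoids both problems.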

0 votes Peter Cordes 6/22/2023

Also, use default rel so that [number1] uses the more efficient RIP-relative addressing mode.

0 votes joshjkk 6/22/2023

@PeterCordes Got it, thanks for the clarification.

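0 votes Peter Cordes 6/22/2023

Oops, there was a typo in my first comment. You want subSS (scalar single), not subPS (packed single). You already figured that out, but I'm clarifying for future readers that it was a typo.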

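0 votes Sam Mason 6/23/2023

godbolt.org can be a useful resource for seeing the kind of code existing compilers generate (with nice links to the machine code documentation). godbolt.org/z/dqMh3Wa5Y is from a comment I wrote recently, and it gets GCC to use SSE instructions.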
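
As a starting point for that kind of experiment, a C function like the sketch below can be pasted into a compiler explorer. It is not the code behind the link above, and the struct and function names are made up for illustration. For x86-64 targets, GCC and clang typically compile float arithmetic like this to the scalar SSE instructions addss, subss, mulss, and divss.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the four result variables in the assembly answer. */
typedef struct {
    float sum, diff, prod, quot;
} FloatResults;

/* Computes a+b, a-b, a*b and a/b in single precision. */
void float_ops(float a, float b, FloatResults *out) {
    assert(out != NULL);
    out->sum = a + b;
    out->diff = a - b;
    out->prod = a * b;
    out->quot = a / b;
}
```

Compiling with optimization enabled and with -DNDEBUG (which disables the assert) should keep the generated assembly short enough to read instruction by instruction.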
