提问人:D0SBoots 提问时间:1/29/2023 更新时间:1/30/2023 访问量:88
计算这个“简单”浮点函数的确切逆函数
Compute the exact inverse of this "simple" floating-point function
问:
我有以下功能:
float int_to_qty(unsigned x) {
const float MAX = 8.5f;
const float MIN = .001f;
return ((MAX-MIN) / (float)(1<<24)) * x + MIN;
}
这会编译(在 x86 上使用合理的选项)为以下内容:
.LCPI0_0:
.long 0x3507fbe7 # float 5.06579852E-7
.LCPI0_1:
.long 0x3a83126f # float 0.00100000005
int_to_qty: # @int_to_qty
mov eax, edi
cvtsi2ss xmm0, rax
mulss xmm0, dword ptr [rip + .LCPI0_0]
addss xmm0, dword ptr [rip + .LCPI0_1]
ret
我认为该程序集是该函数的“规范”版本:将 int 转换为浮点数,以 32 位精度乘以一个常量,以 32 位精度添加另一个常量,这就是结果。
我想找到这个函数的确切逆函数。具体来说,一个将通过以下测试的函数:unsigned qty_to_int(float qty)
int test() {
for (unsigned i = 0; i < (1 << 24); ++i) {
float qty = int_to_qty(i);
if (int_to_qty(qty_to_int(qty)) != qty) {
return 0;
}
}
return 1;
}
笔记:
- 在 4 ≤ < 8 范围内,返回值主要相差 1 ulp,这就是具有挑战性的原因。
int_to_qty(x)
- 在 8 ≤ < 8.5 范围内,该函数不再是一对一的。在这种情况下,任何一个答案对于反答案都是可以的,它不必始终是最低或最高的。
int_to_qty(x)
答:
0赞
D0SBoots
1/30/2023
#1
经过长时间的搏斗,我终于想出了一个通过测试的解决方案。(在 Rust 中,但翻译成 C 很简单。
pub fn qty_to_int(qty: f64) -> u32 {
const MAX: f32 = 8.5;
const MIN: f32 = 0.001;
let size_inv = f64::from(1<<24) / f64::from(MAX - MIN);
// We explicitly shrink the precision to f32 and then pop back to f64, because we *need* to
// perform that rounding step to properly reverse the addition at the end of int_to_qty.
// We could do the whole thing at f32 precision, except that our input is f64 so the
// subtraction needs to be done at f64 precision.
let fsqueezed: f32 = (qty - f64::from(MIN)) as f32;
// The squeezed subtraction is a one-to-one operation across most of our range. *However*,
// in the border areas where our input is just above an exponent breakpoint, but
// subtraction will bring it below, we hit an issue: The addition in int_to_qty() has
// irreversibly lost a bit in the lowest place! This causes issues when we invert the
// multiply, since we are counting on the error to be centered when we round at the end.
//
// To solve this, we need to re-bias the error by subtracting 0.25 ulp for these cases.
// Technically this should be applied for ranges like [2.0,2.001) as well, but they
// don't need it since they already round correctly (due to having more headroom).
let adj = if qty > 4.0 && qty < 4.001 {
0.5 - 2.0 / 8.499
} else if qty > 8.0 && qty < 8.001 {
0.5 - 4.0 / 8.499
} else {
0.5
};
// Multiply and round, taking into account the possible adjustments.
let fresult = f64::from(fsqueezed) * size_inv + adj;
unsafe { fresult.to_int_unchecked() }
}
评论