提问人:Yeshe Tenley 提问时间:5/17/2020 最后编辑:Yeshe Tenley 更新时间:5/22/2020 访问量:661
求满足浮点不等式的最小整数
Find smallest integer that satisfies floating point inequality equation
问:
我正在寻找一种快速算法,该算法可以找到最小的整数 N,该算法将满足以下不等式,其中 、 、 和 是数字(使用 IEEE-754 binary32 格式):s
q
u
p
float
s > q + u * p / (N - 1)
其中 N 可以是任何由有符号 32 位整数表示的正整数。转换为 后,所有算术都在 中计算。(N - 1)
float
float
其他限制包括:
- 0 < < 1.
p
- -1 ≤ ≤ 1.
q
q
<s
.- 0 < .
u
我很难弄清楚如何以一种稳健的方式做到这一点,以正确处理浮点舍入错误和比较。这是我对一个不快的解决方案的糟糕尝试,甚至不健壮,因为我无法确定最小值:SOME_AMOUNT
int n = std::max(1.0f, floorf((u * p / (s - q)) - 1.0f));
// Floating point math might require to round up by some amount...
for (int i = 0; i < SOME_AMOUNT; ++i)
if (!(q + (u * p / (n + 1)) < second))
++n;
你可以在上面看到我使用基本代数计算的公式。for 循环是我试图解释浮点舍入错误的粗略方法。我正在用蛮力检查它,如下所示:n
int nExact = 0;
bool found = false;
for (; nExact < SOME_BIG_NUMBER; ++nExact) {
if (q + (u * p / (nExact + 1)) < second) {
found = true;
break;
}
}
assert(found);
assert(n == nExact);
任何浮点大师在 C++ 中都有相当快的答案吗?
坦率地说,如果有人能给出上面“SOME_AMOUNT”上限的理论证明,我会相当高兴......
答:
为了安全起见,我们可以首先得到一个更大的可能值(上限)和一个更小的可能值(下限),然后将其简化为我们的实际答案,这样它就会比只迭代数字更准确、更快。
通过解决我们得到的不平等,
N > u * p / (s - q) + 1
获取上限
因此,您将首先使用整数找到一个最大猜测的答案。我们将增加分子和整数铸分母
int UP = (int)(u * p + 1); // Increase by one
int D = (int)(s - q); // we don't increase this because it would cause g to decrease, which we don't want
float g = UP / (float)D + 1; // we again float cast D to avoid integer division
int R = (int)(g + 1); // Now again increase g
/******** Or a more straight forward approach ********/
int R = (int)(((int)(u*p+1))/(s-q) + 1 + 1)
// Add rounding-off error here
if(R + 128 < 0) R = 2147483647; // The case of overflow
else R += 128;
这是您的最大答案(上限)。
获得下限
和以前一样,但这次我们将增加分母和整数转换分子
int UP = (int)(u * p); // will automatically decrease
int D = (int)(s - q + 1); // we increase this because it would cause g to decrease, which we want
float g = UP / (float)D + 1; // we again float cast D to avoid integer division
int L = (int)g; // Integer cast, will automatically decrease
/******** Or a more straight forward approach ********/
int L = (int)(((int)(u*p))/(s-q+1) + 1)
// Subtract rounding-off error
if(L - 128 <= 1 ) L = 2; // N cannot be below 2
else L -= 128;
这是您的最低答案(下限)。
注意:整数转换的原因是为了减少我们的样本空间。如果您觉得如此,可以省略它。
消除可能的数字并获得正确的数字
for (int i = L; i <= R; ++i){
if ((s > q + u*p/(i-1))) break; // answer would be i
}
N = i; // least number which satisfies the condition
如果边界间隙 (R-L) 很大,则可以使用二叉搜索更快地完成此操作。至于差值为 2^n 的数范围,只需 n 步即可减少。
// we know that
// lower limit = L;
// upper limit = R;
// Declare u, p, q, s in global space or pass as parameters to biranySearch
int binarySearch(int l, int r)
{
if(l==r) return l;
if (r > l) {
int mid = l + (r - l) / 2;
bool b = (s > q + (p*u)/(mid-1));
if (b==true){
// we know that numbers >= mid will all satisfy
// so our scope reduced to [l, mid]
return binarySearch(l, mid);
}
// If mid doesn't satisfy
// we know that our element is greater than mid
return binarySearch(mid+1, r);
}
}
int main(void)
{
// calculate lower bound L and upper bound R here using above methods
int N = binarySearch(L, R);
// N might have rounding-off errors, so check for them
// There might be fluctuation of 128 [-63 to 64] so we will manually check.
// To be on safe side I will assume fluctuation of 256
L = N-128 > 2 ? N-128 : 2;
R = N+128 < 0 ? 2147483647 : N+128;
for(int i=L; i<=R; ++i){
if( s > q + u * p / ((float)i - 1)) {
break;
}
}
cout << i << endl;
}
它主要是一个概念,但它既快速又安全。唯一的问题是我还没有测试过它,但它应该可以工作!
评论
[2^29+1, 2^30+3] => [536870913, 1073741827]
这是解决方案的开始。一些注意事项:
- 它是用 C 语言,而不是 C++。
- 它假定 IEEE-754 算术,四舍五入到最接近。
- 它不处理不等式要求 N 超出从 2 到 的边界的情况。
INT_MAX
- 我没有测试太多。
该代码首先使用浮点运算来估计不等式变化的边界的位置,忽略舍入误差。它测试不等式,看看它是否需要增加或减少候选值。然后,它遍历连续的整数值以查找边界。我的感觉是这需要几次迭代,但我还没有完全分析它。float
这会产生一个整数值的最小值,当用于代替分母时,该整数值满足不等式。然后,代码找到舍入到该的最小值,并且应该是满足不等式的最小值。float
N-1
int N
N-1
float
N
int
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
// Test the inequality.
static int Test(float s, float q, float u, float p, int N)
{
return s > q + (float) (((float) (u * p)) / (N-1));
}
int main(void)
{
float s = 1;
float q = 0;
float u = 0x1p30, p = 1;
/* Approximate the desired denominator (N-1) -- would be exact with real
arithmetic but is subject to rounding errors.
*/
float D = floorf(u*p/(s-q));
// Test which side of the boundary where the inequality changes we are on.
if (Test(s, q, u, p, (int) D + 1))
{
// We are above the boundary, decrement find the boundary.
float NextD = D;
do
{
D = NextD;
// Decrement D by the greater of 1 or 1 ULP.
NextD = fminf(D-1, nexttowardf(D, 0));
}
while (Test(s, q, u, p, (int) NextD + 1));
}
else
// We are below the boundary, increment to find the boundary.
do
// Increment D by the greater of 1 or 1 ULP.
D = fmaxf(D+1, nexttowardf(D, INFINITY));
while (!Test(s, q, u, p, (int) D + 1));
// Find the distance to the next lower float, as an integer.
int distance = D - nexttowardf(D, 0);
/* Find the least integer that rounds to D. If the distance to the next
lower float is less than 1, then D is that integer. Otherwise, we want
either the midpoint between the D and the next lower float or one more
than that, depending on whether the low bit of D in the float
significand is even (midpoint will round to it, so use midpoint) or odd
(midpoint will not round to it, so use one higher).
(int) D - distance/2 is the midpoint.
((int) D / distance) & 1 scales D to bring the low bit of its
significand to the one’s position and tests it, producing 0 if it is
even and 1 if it is odd.
*/
int I = distance == 0 ? (int) D
: (int) D - distance/2 + (((int) D / distance) & 1);
// Set N to one more than that integer.
int N = I+1;
printf("N = %d.\n", N);
if (Test(s, q, u, p, N-1) || !Test(s, q, u, p, N))
{
fprintf(stderr, "Error, solution is wrong.\n");
exit(EXIT_FAILURE);
}
}
评论
s > q + u * p / (N - 1)
N
N
p
q
u
s
if()
else
s > q + u * p / (N - 1)
s > q
u
q
u * q / (2-1)
s > q + u * q / (2-1)
u
p
u * q
x
s > q + x / (N-1)
N
N
float
float
s-q
float
float n
s > q + x/n
(s-q)/x
n
N
N-1