Asked by: Jin  Asked: 10/9/2022  Last edited by: philipxyJin  Updated: 11/2/2022  Views: 163
Denormal Numbers in IEEE std Floating point standard
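Q:
I think it is right that in single-precision floating point, in the normal case, the minimum value (in absolute value) would be
$1.0 \times 2^{-126}$
But in the denormal case (when the exponent is 000...0), a smaller value can be represented as
$2^{-23} \times 2^{-127} = 2^{-150}$
where only one bit in the fraction is 1, which is $2^{-23}$.
So I thought it can represent a smaller number.
But I don't understand the meaning of "Allow for gradual underflow, with diminishing precision"
for denormal numbers.
I thought gradual underflow means the represented number gets closer and closer to "0",
and the precision doesn't change...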
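A:
> $2^{-23} \times 2^{-127} = 2^{-150}$
This is incorrect. In the IEEE-754 binary32 format, exponent codes 1 to 254 are biased by 127, so an exponent code e represents the exponent E = e - 127. This bias does not apply to exponent codes 0 or 255. Exponent code 0 represents the exponent -126, the same as code 1. It further indicates that the leading bit of the significand is zero. (Exponent code 255 represents infinities and NaNs.)
Therefore the smallest representable positive number has the smallest significand ($2^{-23}$) and the smallest exponent (-126), and its value is $2^{-23} \cdot 2^{-126} = 2^{-149}$.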
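As a rough illustration of the decoding rules above, here is a minimal Python sketch. The helper names `decode_binary32`, `value`, and `hardware_value` are made up for this example; `hardware_value` assumes the platform's C `float` is IEEE-754 binary32, which is what `struct` relies on.

```python
import struct

def decode_binary32(bits: int):
    """Split a 32-bit pattern into (sign, exponent E, significand)."""
    sign = bits >> 31
    code = (bits >> 23) & 0xFF              # 8-bit exponent code e
    fraction = bits & 0x7FFFFF              # 23-bit trailing significand field
    if code == 255:
        raise ValueError("exponent code 255 encodes infinities and NaNs")
    if code == 0:
        exponent = -126                     # same exponent as code 1
        significand = fraction / 2**23      # leading significand bit is 0
    else:
        exponent = code - 127               # bias of 127 for codes 1..254
        significand = 1 + fraction / 2**23  # leading significand bit is 1
    return sign, exponent, significand

def value(bits: int) -> float:
    """Value of the bit pattern according to the manual decoding."""
    sign, exponent, significand = decode_binary32(bits)
    return (-1) ** sign * significand * 2.0 ** exponent

def hardware_value(bits: int) -> float:
    """Let the platform reinterpret the same bit pattern as a binary32."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

smallest_subnormal = 0x00000001  # exponent code 0, fraction field 1
smallest_normal = 0x00800000     # exponent code 1, fraction field 0

print(decode_binary32(smallest_subnormal))
print(value(smallest_subnormal) == 2.0 ** -149)
print(value(smallest_subnormal) == hardware_value(smallest_subnormal))
print(value(smallest_normal) == 2.0 ** -126)
```

Comparing the manual decoding against `hardware_value` is a quick way to check that exponent code 0 really uses the exponent -126 rather than -127.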
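> But I don't understand the meaning of "Allow for gradual underflow, with diminishing precision" for denormal numbers.
In the normal range of a floating-point format with base b, a number is represented with a sign, a significand of p (for precision) base-b digits in the half-open interval [1, b), and a power of b, $b^E$, where E is an integer in some interval specified for the format.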
Abrupt underflow occurs if there are no other non-zero finite numbers in the format: There is no non-zero number lower than the smallest normal number. This is abrupt because the format goes directly from full precision with the lowest normal exponent to no precision with zero. Further, if we subtract two small numbers, say subtracting $1.0000_2 \cdot 2^{-126}$ from $1.0001_2 \cdot 2^{-126}$ in the binary32 format without subnormals, then the computed result is 0 because there is no closer representable value to the real-number-arithmetic result, which would be $0.0001_2 \cdot 2^{-126} = 2^{-130}$.
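The subtraction example can be imitated in Python. This is only a sketch: `flush_to_zero` is a hypothetical helper that crudely models a binary32-like format with no subnormals by returning zero for anything below the smallest normal number. A real format would round to the nearest representable value, which for $2^{-130}$ is also zero.

```python
MIN_NORMAL = 2.0 ** -126  # smallest positive normal binary32 value

def flush_to_zero(x: float) -> float:
    """Hypothetical model of a format that has no subnormal numbers."""
    return 0.0 if abs(x) < MIN_NORMAL else x

x = (1 + 2 ** -4) * MIN_NORMAL   # 1.0001 (binary) times 2^-126
y = 1.0 * MIN_NORMAL             # 1.0000 (binary) times 2^-126

# Python floats are binary64, so this difference is computed exactly.
exact_difference = x - y
abrupt_result = flush_to_zero(exact_difference)

print(x != y)                           # the operands are different
print(exact_difference == 2.0 ** -130)  # real-arithmetic result
print(abrupt_result)                    # what the subnormal-free format returns
```

So in the subnormal-free model, two distinct numbers can have a computed difference of zero.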
Specifying subnormal numbers in the format gives gradual underflow. After the normal numbers with the exponent -126 and full-precision 24-bit significands, we have numbers in $[2^{-127}, 2^{-126})$ with 23-bit significands, then numbers in $[2^{-128}, 2^{-127})$ with 22-bit significands, then numbers in $[2^{-129}, 2^{-128})$ with 21-bit significands, and so on until the number $2^{-149}$ with a 1-bit significand. With these in the format, the floating-point format has the property that x != y guarantees x-y != 0.
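The same subtraction can be tried against real binary32 values. In this sketch, `to_binary32` and `significand_bits` are illustrative names, not library functions. Rounding through `struct` assumes the platform's C `float` is IEEE-754 binary32 with subnormals enabled. The difference is first computed in binary64, which is exact for these operands.

```python
import struct

def to_binary32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def significand_bits(exponent: int) -> int:
    """Significand bits for binary32 values in [2^exponent, 2^(exponent+1))."""
    if exponent < -149:
        raise ValueError("no positive binary32 values below 2^-149")
    if exponent >= -126:
        return 24                       # normal range: full precision
    return 24 - (-126 - exponent)       # one bit fewer per binade below 2^-126

x = to_binary32((1 + 2 ** -4) * 2.0 ** -126)  # 1.0001 (binary) times 2^-126
y = to_binary32(2.0 ** -126)                  # 1.0000 (binary) times 2^-126
difference = to_binary32(x - y)

print(x != y)
print(difference == 2.0 ** -130)  # representable as a subnormal, so not zero
for e in (-126, -127, -128, -129, -149):
    print(e, significand_bits(e))
```

The loop reproduces the 24, 23, 22, 21, ..., 1 bit counts listed above. That shrinking count is the "diminishing precision": the spacing between neighbouring subnormals stays at $2^{-149}$ while the values themselves get smaller.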
Notes
These are called subnormal numbers. A number is subnormal if it is below the normal range of the format; it is below ("sub") the normal numbers. A representation of a number is denormal if it is not in the normal format. The normal format uses significands in [1, b). The IEEE-754 binary formats have only one representation of each number, but the decimal formats may have multiple representations of some numbers. For example, 370 might be represented as $3.70 \cdot 10^{2}$, which has its significand in [1, 10), or it might be represented as $0.37 \cdot 10^{3}$, which does not have its significand in the normal interval. So $0.37 \cdot 10^{3}$ has a denormalized significand even though the value it represents, 370, is in the range where there are normal representations for numbers.
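Python's `decimal` module gives a loose analogy for the 370 example. It is not an IEEE-754 interchange format, and it stores an integer coefficient with an exponent rather than a significand in [1, 10). It still shows one value having more than one representation, so treat this as an illustration only.

```python
from decimal import Decimal

a = Decimal("3.70E2")   # stored as coefficient 370, exponent 0
b = Decimal("0.37E3")   # stored as coefficient 37, exponent 1

print(a == b)           # both represent the number 370
print(a.as_tuple())     # sign, digits, exponent of the first representation
print(b.as_tuple())     # a different digits/exponent pair for the same value
```

The two objects compare equal as numbers while `as_tuple()` shows that their stored digits and exponents differ.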
The normal interval used for significands is arbitrary. It may be chosen as $[1, b)$, $[1/b, 1)$, or $[b^{p-1}, b^{p})$ as desired, with corresponding adjustments to the exponent interval.