Denormal Numbers in IEEE std Floating point standard

Question:

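I think it is correct that, in single-precision floating point and the normal case, the minimum value (in absolute value) is

1.0 × 2⁻¹²⁶

But in the denormal case (when the exponent field is 000...0), a smaller value can be represented as

2⁻²³ × 2⁻¹²⁷ = 2⁻¹⁵⁰

Only one bit of the fraction is 1, which gives 2⁻²³.

So I think it can represent a smaller number.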

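But I don't understand the meaning of "allow for gradual underflow, with decreasing precision"

in denormal numbers.

I think gradual underflow means the represented number gets closer and closer to "0",

and the precision doesn't change...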

floating-point precision IEEE-754 denormal-numbers

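"I think gradual underflow means the represented number gets closer and closer to '0', and the precision doesn't change." Oh? Why? What is your (reasonable/correct) justification, citing an authoritative document? "Gradual underflow" is a vague term that cannot be taken on its own to mean anything in particular. What reason do you have to think it means something specific in isolation? However, it happens to be a reasonable description of the results you get compared with not using subnormals. Look at the values available (near 0) in each case.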

0 upvotes Jin 10/9/2022

The fraction part is 000...01 and the biased exponent is 000...0. So the 0.000000...1 part is $2^{-23}$ and the exponent part is $2^{-127}$. Sorry, but how can I show LaTeX here...

1 upvote philipxy 10/9/2022

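Please clarify via edits, not comments. PS See "How do comment replies work?" to learn how to notify one non-sole, non-poster commenter about a comment. Posters, sole commenters and followers of a post always get notified. Otherwise, other commenters get no notification. PS "How much research effort is expected of Stack Overflow users?", How to Ask, Help center, Meta Stack Overflow, Meta Stack Exchange. PS en.wikipedia.org/wiki/Subnormal_number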

Answer:

1 upvote Eric Postpischil 10/10/2022 #1

2⁻²³ × 2⁻¹²⁷ = 2⁻¹⁵⁰

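This is not correct. In the IEEE-754 binary32 format, exponent codes 1 to 254 are biased by 127, so an exponent code e represents the exponent E = e−127. This bias does not apply to exponent codes 0 or 255. Exponent code 0 represents the exponent −126, the same as code 1. It further indicates that the leading bit of the significand is zero. (Exponent code 255 represents infinities and NaNs.)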

So the smallest representable positive number has the smallest significand (2⁻²³) and the smallest exponent (−126), and its value is 2⁻²³•2⁻¹²⁶ = 2⁻¹⁴⁹.
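As a rough check of the decoding rules above, here is a minimal Python sketch that decodes a binary32 bit pattern by hand and compares the result with what the standard struct module reports. The names decode_binary32 and bits_to_float are made up for this illustration and do not come from any library.

```python
import struct

def decode_binary32(bits: int) -> float:
    """Decode a 32-bit pattern by hand (illustrative helper, not a library call)."""
    sign = -1.0 if bits >> 31 else 1.0
    e = (bits >> 23) & 0xFF      # 8-bit exponent code
    f = bits & 0x7FFFFF          # 23-bit fraction field
    if e == 255:
        raise ValueError("exponent code 255 encodes infinities and NaNs")
    if e == 0:
        # Subnormal (or zero): exponent is -126 and the leading bit is 0.
        return sign * (f / 2**23) * 2.0**-126
    # Normal: exponent is e - 127 and the leading bit is 1.
    return sign * (1 + f / 2**23) * 2.0**(e - 127)

def bits_to_float(bits: int) -> float:
    """Let struct reinterpret the same 32 bits as a binary32 value."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

smallest_subnormal = decode_binary32(0x00000001)   # fraction 000...01, code 0
smallest_normal = decode_binary32(0x00800000)      # fraction 000...00, code 1

print(smallest_subnormal == 2.0**-149)
print(smallest_normal == 2.0**-126)
print(smallest_subnormal == bits_to_float(0x00000001))
```

Python floats are binary64, which represents every binary32 value exactly, so the comparisons are not affected by rounding.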

But I don't understand the meaning of "allow for gradual underflow, with decreasing precision" in denormal numbers.

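In the normal range of a floating-point format with base b, numbers are represented with a sign, a significand of p (for precision) base-b digits in the half-open interval [1, b), and a power of b, b^E, where E is an integer in some interval specified for the format.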

Abrupt underflow occurs if there are no other non-zero finite numbers in the format: There is no non-zero number lower than the smallest normal number. This is abrupt because the format goes directly from full precision with the lowest normal exponent to no precision with zero. Further, if we subtract two small numbers, say subtracting 1.0000₂•2⁻¹²⁶ from 1.0001₂•2⁻¹²⁶ in the binary32 format without subnormals, then the computed result is 0 because there is no closer representable value to the real-number-arithmetic result, which would be 0.0001₂•2⁻¹²⁶ = 2⁻¹³⁰.
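The subtraction example can be tried out with a small Python sketch. It is only a model: flush_to_zero is a hypothetical helper that treats every nonzero result below the smallest normal number as 0, which is a simplification of a format without subnormals (true round-to-nearest would send results above 2⁻¹²⁷ up to 2⁻¹²⁶ instead). For contrast, to_binary32 (also a name invented here) rounds to real binary32, which does have subnormals.

```python
import struct

MIN_NORMAL = 2.0**-126   # smallest positive normal binary32 value

def to_binary32(v: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack('>f', struct.pack('>f', v))[0]

def flush_to_zero(v: float) -> float:
    """Hypothetical format without subnormals: tiny results become 0."""
    return 0.0 if abs(v) < MIN_NORMAL else v

x = (1 + 2**-4) * 2.0**-126   # 1.0001 (binary) times 2^-126
y = 1.0 * 2.0**-126           # 1.0000 (binary) times 2^-126

exact = x - y                    # 2^-130, computed exactly in binary64
abrupt = flush_to_zero(exact)    # model without subnormals
gradual = to_binary32(exact)     # real binary32, with subnormals

print(x != y, abrupt, gradual == 2.0**-130)
```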

Specifying subnormal numbers in the format gives gradual underflow. After the normal numbers with the exponent −126 and full-precision 24-bit significands, we have numbers in [2⁻¹²⁷, 2⁻¹²⁶) with 23-bit significands, then numbers in [2⁻¹²⁸, 2⁻¹²⁷) with 22-bit significands, then numbers in [2⁻¹²⁹, 2⁻¹²⁸) with 21-bit significands, and so on until the number 2⁻¹⁴⁹ with a 1-bit significand. With these in the format, the floating-point format has the property that x != y guarantees x-y != 0.
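The shrinking significand widths and the x != y property can be spot-checked with a short Python sketch. It relies on the fact that a subnormal bit pattern n (exponent code 0) has the value n•2⁻¹⁴⁹, so the number of significant bits it carries is n.bit_length(). The helper names are invented for this illustration, and the property is only tested on a small sample of values around the smallest normal number, not proven.

```python
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 value."""
    return struct.unpack('>f', struct.pack('>I', bits))[0]

def to_binary32(v: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack('>f', struct.pack('>f', v))[0]

# Significand width available at the bottom of each subnormal interval.
widths = {}
for k in (-127, -128, -129, -149):
    n = 1 << (k + 149)            # bit pattern whose value is 2^k
    widths[k] = n.bit_length()    # significant bits for values in [2^k, 2^(k+1))

# Sample check of "x != y guarantees x - y != 0" near the smallest normal.
values = [bits_to_float(p) for p in range(0x007FFFF0, 0x00800010)]
all_nonzero = all(
    to_binary32(x - y) != 0.0
    for x in values
    for y in values
    if x != y
)

print(widths)
print(all_nonzero)
```

The subtraction is done in binary64 and then rounded to binary32; for these inputs the binary64 difference is exact, so the result matches a direct binary32 subtraction.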

Notes

These are called subnormal numbers. A number is subnormal if it is below the normal range of the format; it is below (“sub”) the normal numbers. A representation of a number is denormal if it is not in the normal format. The normal format uses significands in [1, b). The IEEE-754 binary formats have only one representation of each number, but the decimal formats may have multiple representations of some numbers. For example, 370 might be represented as 3.70•10², which has its significand in [1, 10), or it might be represented as 0.37•10³, which does not have its significand in the normal interval. So 0.37•10³ has a denormalized significand even though the value it represents, 370, is in the range where there are normal representations for numbers.
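As a loose analogy, Python's decimal module shows the same effect of one value having several representations. Note the assumptions: decimal follows the General Decimal Arithmetic specification and stores an integer coefficient with an exponent, so 370 appears as 370•10⁰ or 37•10¹ rather than 3.70•10² or 0.37•10³, and this is not the IEEE-754 decimal interchange encoding itself.

```python
from decimal import Decimal

a = Decimal('370')      # coefficient 370, exponent 0
b = Decimal('3.7E+2')   # coefficient 37, exponent 1

same_value = (a == b)                               # both represent 370
same_representation = (a.as_tuple() == b.as_tuple())

print(a.as_tuple())
print(b.as_tuple())
print(same_value, same_representation)
```

Binary floats in Python offer no counterpart to this, which fits the remark that the binary formats have a single representation per number.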

The normal interval used for significands is arbitrary. It may be chosen as [1, b), [1/b, 1), or [b^(p−1), b^p) as desired, with corresponding adjustments to the exponent interval.
