Defined behaviour for union with 24-bit and 8-bit vars

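Question:

I am trying to find the best way to pack a 24-bit and an 8-bit unsigned integer into 32 bits without needing bit shifts to extract the data. A union immediately came to mind as a simple approach, like this: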

union {
    uint32_t u24;
    uint8_t u8[4]; // use only u8[3]
};

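However, the behaviour of this approach depends on the system's endianness and may be undefined, so I came up with the following, which uses C++20 features (std::endian and constexpr) to detect the system's endianness at compile time: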

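#include <bit>
#include <cstdint>

// Mixed-endian platforms are rejected at compile time.
static_assert(std::endian::native == std::endian::little ||
              std::endian::native == std::endian::big,
              "unsupported endianness");

struct UnionTest {
    union {
        uint32_t u24;
        uint8_t u8[4];
    };

    // Index of the byte holding the top 8 bits of u24 (the byte u24 never uses).
    constexpr uint8_t get_u8_index() const noexcept {
        if constexpr (std::endian::native == std::endian::little) return 3;
        else return 0;
    }
};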

// use like this:
int main() {
    UnionTest test;
    test.u24 = 0xffffff;
    test.u8[test.get_u8_index()] = 0xff;
}

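This may still be a bit verbose, but that is not the issue. I am purely interested in the viability of this approach, assuming we never write a value wider than 24 bits to u24.

Another approach would be to use bit-fields: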

struct UnionTest {
    uint32_t u24 : 24;
    uint32_t u8 : 8;
};

But this may end up occupying 64 bits instead of 32 (although 32 bits should be expected in most cases).

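My question is about A) the viability of the union approach in terms of performance and potential undefined behaviour, and B) the practical difference between the proposed union approach and the use of C++ bit-fields.

Tags: C++, unions, bit-fields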

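Answer:

The C++ language allows access to the byte representation of any object. This is used explicitly to allow byte-wise copying of trivially copyable types. Additionally, once the endianness is known, you can expect a 24-bit value stored in a uint32_t to occupy its three least significant bytes: the three lowest-addressed bytes on a little-endian system and the three highest-addressed bytes on a big-endian one. The remaining byte is free for the 8-bit value. A mask is still needed to read the 24-bit value, but the 8-bit value can be accessed directly, and no shift is ever used.

Here is one possible implementation demonstrating this: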

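#include <bit>
#include <cstdint>
#include <iostream>

namespace {
    // Mixed-endian platforms are not supported.
    static_assert(std::endian::native == std::endian::little ||
                  std::endian::native == std::endian::big,
                  "unsupported endianness");

    // Index in memory of the most significant byte of a uint32_t.
    constexpr uint8_t get_u8_index() noexcept {
        if constexpr (std::endian::native == std::endian::little) return 3;
        else return 0;
    }
}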

class pack_24_8 {
    uint32_t value = 0;  // zero-initialised so that reads are always well-defined

    static constexpr int u8_index = get_u8_index();  // class-scoped constant

public:
    uint8_t get_u8() const {
        return ((const uint8_t*)(&value))[u8_index]; // extract one single byte
    }

    void set_u8(uint8_t c) {
        ((uint8_t*)(&value))[u8_index] = c;  // set one single byte
    }

    uint32_t get_u24() const {
        return value & 0xffffff;      // get the less significant 24 bits
    }

    void set_u24(uint32_t u24) {
        uint8_t u8 = get_u8();    // save the u8 part
        value = u24;
        set_u8(u8);               // and restore it
    }
};

// use like this:
int main() {
    pack_24_8 test;
    test.set_u8(0x5a);
    test.set_u24(0xa5a5a5);

    std::cout << std::hex << (unsigned int) test.get_u8() << " - " <<
        std::hex << test.get_u24() << '\n';

    return 0;
}

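Note: as @Caleth said in the comments, this relies on uint8_t being an alias for unsigned char. AFAIK this is true on every common architecture, but it is not required by the standard...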
