Asked by: Kiro Asked: 11/10/2023 Last edited by: Kiro Updated: 11/10/2023 Views: 118
How to read little-endian integers from a binary file in C++
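Q:
I have been trying to read a little-endian binary file. I want to first find the index of a string in the file, from which I can start reading data. Once I have the index, I want to extract the student information. I have managed to read the strings (first and last name), but I cannot extract the integer (age). Thanks for your help :) Below is a sample of the student-info section of the binary file: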
130000004A4F484E444F45
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
class BinaryData
{
private:
string age;
string firstName;
string lastName;
public:
BinaryData(){}
~BinaryData(){}
void readStudentData(string fileName)
{
//open the file and get its size
ifstream in(fileName.c_str(), ios::in | ios::binary);
in.seekg(0, ios::end);
int fileSize = in.tellg();
in.seekg(0, ios::beg);
//read the entire file into a buffer
string studentBuffer;
studentBuffer.resize(fileSize);
in.read(&studentBuffer[0], studentBuffer.size());
//move to the position where to start reading student info
int index = studentBuffer.find("StudentInfo");
if(index != 0 && index != std::string::npos)
{
//read the student info if the header is found
age = studentBuffer.substr(index, 4); //assume we are dedicating 4 bytes for age
index += 4;
firstName = studentBuffer.substr(index, 10); //assume we are dedicating 10 bytes for first name
index += 10;
lastName = studentBuffer.substr(index, 10); //assume we are dedicating 10 bytes for last name
std::cout<< "The student name is: " << firstName << " " << lastName << ", they are " << age << " years old" << std::endl;
}
in.close();
}
};
The output of my code is:
The student name is: John Doe, they are **(some character)** years old
The expected output is:
The student name is: John Doe, they are 19 years old
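The age data looks like this in the binary file: 13 00 00 00. It should convert to 19, because it is little-endian.
Any help would be appreciated. Thanks!
Answers:
0 votes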
Maarten Bodewes
11/10/2023
#1
Here is integer conversion code that does not depend on the byte order of the system it runs on:
#include <iostream>
#include <string>
#include <cstdint>
#include <stdexcept>
/**
* Converts a string containing binary data into a 64-bit unsigned integer.
* The string is interpreted in little endian format.
* If the string contains fewer than 1 byte or more than 8 bytes, an exception is thrown.
*
* @param binary_data The string containing the binary data.
* @return The 64-bit unsigned integer representation of the binary data.
*/
uint64_t convertToUInt64(const std::string& binary_data) {
if (binary_data.empty()) {
throw std::invalid_argument("Input string must contain at least 1 byte.");
}
if (binary_data.size() > sizeof(uint64_t)) {
throw std::invalid_argument("Input string must contain no more than 8 bytes.");
}
uint64_t value = 0;
// Iterate over the string to construct the integer
for (size_t i = 0; i < binary_data.size(); ++i) {
// Cast each character to an unsigned byte and then shift it to the correct position
value |= static_cast<uint64_t>(static_cast<uint8_t>(binary_data[i])) << (i * 8);
}
return value;
}
You can call it with sample binary data held in a string:
int main() {
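// Binary data in a string, representing the value 19 in 3 bytes (little endian).
// The explicit length keeps the embedded NUL bytes; a plain literal would stop at the first one.
std::string binary_data("\x13\x00\x00", 3);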
// Convert the binary data to a 64-bit unsigned integer
uint64_t value = convertToUInt64(binary_data);
// Output the integer value
std::cout << "The 64-bit unsigned integer value is: " << value << std::endl;
return 0;
}
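I have shown a generic function rather than one that accepts only 3 bytes, in case strings of other sizes are used as well. If you want a smaller integer type, you can check that the value is below that type's maximum and then assign it.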
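The range check mentioned above could look roughly like this. It is only a sketch: narrowChecked is a hypothetical helper name, not part of the answer's code, and it assumes the target is an unsigned integer type.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not from the answer above): narrows a decoded 64-bit
// value to a smaller unsigned integer type, throwing if it does not fit.
template <typename T>
T narrowChecked(std::uint64_t value) {
    static_assert(std::numeric_limits<T>::is_integer &&
                      !std::numeric_limits<T>::is_signed,
                  "T must be an unsigned integer type");
    if (value > std::numeric_limits<T>::max()) {
        throw std::out_of_range("Value does not fit in the target type.");
    }
    return static_cast<T>(value);
}
```

For instance, narrowChecked<std::uint8_t>(convertToUInt64(bytes)) would give a one-byte age, and a value above 255 would be rejected instead of being silently truncated.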
1 vote
selbie
11/10/2023
#2
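A portable way to read little-endian data on two's-complement architectures (which is nearly every modern computer).
Read the 4 bytes (13 00 00 00, as shown) from the character buffer into an integer: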
uint32_t age;
memcpy(&age, studentBuffer.c_str()+index, 4);
index += 4;
If you are on Intel, you are probably done, because Intel processors are little-endian: age will hold the expected value, in this case 0x13 == decimal 19.
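As a self-contained variant of the memcpy read, here is a minimal sketch with a bounds check added. The function name readUInt32Native is an assumption made for illustration; the result matches the file's little-endian value only on a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: copies 4 bytes starting at `offset` into a uint32_t.
// The bytes are taken in the host's native order, so the result equals the
// little-endian value stored in the file only on a little-endian machine.
std::uint32_t readUInt32Native(const std::string& buffer, std::size_t offset) {
    if (offset > buffer.size() ||
        buffer.size() - offset < sizeof(std::uint32_t)) {
        throw std::out_of_range("Not enough bytes left in the buffer.");
    }
    std::uint32_t value = 0;
    std::memcpy(&value, buffer.data() + offset, sizeof(value));
    return value;
}
```

The check matters because memcpy itself does not know where the buffer ends; a header found near the end of a truncated file would otherwise read past it.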
If you want your code to run on both big-endian and little-endian hardware, you can do this:
if (isBigEndian()) {
// swap bytes
uint32_t b1 = (age >> 24) & 0x000000ff;
uint32_t b2 = (age >> 8) & 0x0000ff00;
uint32_t b3 = (age << 8) & 0x00ff0000;
uint32_t b4 = (age << 24) & 0xff000000;
age = b1|b2|b3|b4;
}
where isBigEndian() can be written as follows:
bool isBigEndian() {
uint8_t buffer[4] = {0};
uint32_t t = 1;
memcpy(buffer, &t, 4);
return (buffer[0] == 0);
}
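The pieces above can be folded into a single loader, sketched below. swapBytes32 and loadLittleEndian32 are hypothetical names introduced for illustration; isBigEndian() is repeated only so that the block compiles on its own.

```cpp
#include <cstdint>
#include <cstring>

// Runtime endianness check, repeated from the answer so this block is self-contained.
bool isBigEndian() {
    std::uint8_t buffer[4] = {0};
    std::uint32_t t = 1;
    std::memcpy(buffer, &t, 4);
    return buffer[0] == 0;
}

// Hypothetical helper: reverses the byte order of a 32-bit value.
std::uint32_t swapBytes32(std::uint32_t v) {
    return ((v >> 24) & 0x000000ffu) |
           ((v >> 8)  & 0x0000ff00u) |
           ((v << 8)  & 0x00ff0000u) |
           ((v << 24) & 0xff000000u);
}

// Hypothetical helper: interprets the 4 bytes at `p` as a little-endian
// 32-bit unsigned integer, whatever the byte order of the host.
std::uint32_t loadLittleEndian32(const char* p) {
    std::uint32_t v = 0;
    std::memcpy(&v, p, sizeof(v));
    return isBigEndian() ? swapBytes32(v) : v;
}
```

With this shape the call site becomes age = loadLittleEndian32(studentBuffer.data() + index); and the endianness branch stays out of the parsing code.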
Comments
0 votes
selbie
11/10/2023
Fixed a few typos. Hope this helps.
0 votes
Kiro
11/10/2023
That worked! Thank you for helping me solve this :) I am running on an LE system, so memcpy did the trick!
1 vote
Vlad Feinstein
11/10/2023
#3
A shortcut.
I assume (quite safely, for my taste) that a student's age is a positive number no greater than 255, so it fits in one byte.
Just read that byte.
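A minimal sketch of that shortcut, assuming the field is little-endian so that its first byte is the least significant one. readLowByte is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the first (least significant) byte of a
// little-endian field. Assumes the stored value fits in one byte (0..255).
unsigned int readLowByte(const std::string& buffer, std::size_t offset) {
    if (offset >= buffer.size()) {
        throw std::out_of_range("Offset is past the end of the buffer.");
    }
    // Go through unsigned char so that bytes >= 0x80 do not come out negative.
    return static_cast<unsigned char>(buffer[offset]);
}
```

The field still occupies 4 bytes in the file, so the read position has to advance by 4 afterwards, not by 1.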
Comments
13 00 00 00 is four bytes every time I count it. I don't know where you got three from: age = studentBuffer.substr(index, 3);