Issues with serializing and deserializing C++ libtorch tensors (stringstream -> string -> char* -> stringstream)

Asked by pregenRobot on 7/8/2023, last edited by pregenRobot, updated 7/8/2023, viewed 100 times

Q:

TL;DR:

Saving and loading a tensor with torch::save and torch::load works fine. However, converting the std::stringstream to a std::string and then to a char* so it can be sent over a TCP socket corrupts the message during deserialization. What could be the cause? Am I using these wrong?

Details:

I am working on a project where I need to transfer tensors over a TCP socket, so I need to serialize them. Initially, I had the following code to transmit a tensor:

// CLIENT

  while (this->keep_alive.load()) {
    this_thread::sleep_for(chrono::seconds(5));
    auto tensor = torch::ones({3, 4});
    stringstream stream;
    torch::save(tensor, stream);
    
    string stream_str = stream.str();
    const char* stream_str_char_ptr = stream_str.c_str();
    if(::send(client_fd, stream_str_char_ptr, stream_str.size(), 0) < 0){
      // Handle cases
    }
  }

// SERVER
  while (this->keep_alive.load()) {
    size_t read_so_far = 0;
    while (read_so_far < this->read_size) {
      ssize_t valread =
          ::read(this->new_socket, this->read_buffer + read_so_far,
                 this->read_size - read_so_far);
      if (valread <= 0) break;  // connection closed or read error
      read_so_far += valread;
    };

    torch::Tensor input_tensor;
    stringstream buffer_stream(this->read_buffer);
    torch::load(input_tensor, buffer_stream);
    cout << input_tensor << endl;
  }

Now, this code throws the following error on the server side, which I assume means the serialized stream got corrupted:

libc++abi: terminating with uncaught exception of type c10::Error: istream reader failed: getting the current position.
Exception raised from validate at /Users/runner/work/pytorch/pytorch/pytorch/caffe2/serialize/istream_adapter.cc:32 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) + 81 (0x1070a4ca1 in libc10.dylib)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 98 (0x1070a3342 in libc10.dylib)
frame #2: caffe2::serialize::IStreamAdapter::validate(char const*) const + 124 (0x1231bd7bc in libtorch_cpu.dylib)
frame #3: caffe2::serialize::IStreamAdapter::size() const + 65 (0x1231bd6b1 in libtorch_cpu.dylib)
frame #4: caffe2::serialize::PyTorchStreamReader::init() + 99 (0x1231b81c3 in libtorch_cpu.dylib)
frame #5: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__1::basic_istream<char, std::__1::char_traits<char> >*) + 184 (0x1231b8b68 in libtorch_cpu.dylib)
frame #6: torch::jit::import_ir_module(std::__1::shared_ptr<torch::jit::CompilationUnit>, std::__1::basic_istream<char, std::__1::char_traits<char> >&, c10::optional<c10::Device>, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > >&, bool, bool) + 529 (0x12467e241 in libtorch_cpu.dylib)
frame #7: torch::jit::import_ir_module(std::__1::shared_ptr<torch::jit::CompilationUnit>, std::__1::basic_istream<char, std::__1::char_traits<char> >&, c10::optional<c10::Device>, bool) + 75 (0x12467df6b in libtorch_cpu.dylib)
frame #8: torch::jit::load(std::__1::basic_istream<char, std::__1::char_traits<char> >&, c10::optional<c10::Device>, bool) + 147 (0x124681c73 in libtorch_cpu.dylib)
frame #9: torch::serialize::InputArchive::load_from(std::__1::basic_istream<char, std::__1::char_traits<char> >&, c10::optional<c10::Device>) + 28 (0x124ef814c in libtorch_cpu.dylib)
frame #10: void torch::load<at::Tensor, std::__1::basic_stringstream<char, std::__1::char_traits<char>, std::__1::allocator<char> >&>(at::Tensor&, std::__1::basic_stringstream<char, std::__1::char_traits<char>, std::__1::allocator<char> >&) + 94 (0x106e3de0e in baton)
frame #11: baton::OutputClient::thread_handler() + 369 (0x106e42cb1 in baton)
frame #12: decltype(*(static_cast<baton::OutputClient*>(fp0)).*fp()) std::__1::__invoke<void (baton::OutputClient::*)(), baton::OutputClient*, void>(void (baton::OutputClient::*&&)(), baton::OutputClient*&&) + 105 (0x106e4d7f9 in baton)
frame #13: void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (baton::OutputClient::*)(), baton::OutputClient*, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (baton::OutputClient::*)(), baton::OutputClient*>&, std::__1::__tuple_indices<2ul>) + 62 (0x106e4d73e in baton)
frame #14: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (baton::OutputClient::*)(), baton::OutputClient*> >(void*) + 98 (0x106e4cf42 in baton)
frame #15: _pthread_start + 125 (0x7ff8149e14e1 in libsystem_pthread.dylib)
frame #16: thread_start + 15 (0x7ff8149dcf6b in libsystem_pthread.dylib)

So I tried doing the serialization -> string conversion -> char* conversion entirely on the client side, to make sure the problem was not caused by code on my end:

//CLIENT
  while (this->keep_alive.load()) {
    this_thread::sleep_for(chrono::seconds(5));
    auto tensor = torch::ones({3, 4});
    stringstream stream;
    torch::save(tensor, stream);

    torch::Tensor load_tensor;
    string stream_str = stream.str();
    const char *stream_str_char_ptr = stream_str.c_str();

    stringstream load_stream = stringstream(stream_str_char_ptr);
    torch::load(load_tensor, load_stream);
    cout << load_tensor << endl;
  }

However, this still throws the same error pasted above, which suggests the problem lies in converting the std::stringstream to a std::string and then to a char *. I would appreciate some suggestions on how to fix this.

Both the client and the server run locally on x86_64 macOS.

c++ serialization pytorch stringstream libtorch

Comments

0 votes, user7860670, 7/8/2023
It looks like you are assuming that torch::save(tensor, stream); produces a C-style string, whereas it most likely produces binary data.

A: No answers yet