IO.pipe(Encoding::BINARY, Encoding::BINARY) failing with UndefinedConversionError, but only under Rails

Asked by: David Moles Asked: 2/26/2021 Last edited by: David Moles Updated: 2/26/2021 Views: 257

Q:

I have some code that converts http.rb's chunk-based response body stream into a plain IO:

def stream_response_body(body)
  IO.pipe(Encoding::BINARY, Encoding::BINARY) do |rd, wr|
    t = copying_thread(body, wr)
    yield rd
  ensure
    t.join if t
  end
end

def copying_thread(body, dst)
  Thread.new do
    body.each { |chunk| dst.write(chunk) }
  rescue StandardError => e
    UCBLIT::TIND.logger.error(e)
  ensure
    dst.close
    Thread.exit
  end
end

This works fine when I call it from a command-line script, but when I call it from a Rails controller, it blows up at dst.write(chunk):

  Encoding::UndefinedConversionError ("\xE5" from ASCII-8BIT to UTF-8):
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:106:in `write'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:106:in `block (2 levels) in copying_thread'

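(Both the script and the Rails application run under Ruby 2.7.2 on macOS Catalina.)

I've reduced the reading code to a byte-by-byte read, to make sure the problem isn't caused by some downstream library: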

response = HTTP.get(url, encoding: Encoding::BINARY)
status = response.status
raise(HTTP::ResponseError, status.to_s) unless status.success?

xml_str_io = StringIO.new
xml_str_io.set_encoding(Encoding::BINARY)

stream_response_body(response.body) do |body|
  while (b = body.read(1))
    xml_str_io.putc(b)
  end
end

Why (and where!) is the conversion from ASCII-8BIT to UTF-8 happening? And why only when called from Rails?


Update:

I've tried the following modifications, neither of which helped:

  1. Packing a byte array instead of writing the raw string

    body.each do |chunk|
      byteStr = chunk.bytes.pack('C*')
      dst.write(byteStr)
    end
    
  2. Using putc instead of write

       body.each do |chunk|
         chunk.bytes.each do |b|
           dst.putc(b)
         end
       end
    

Interestingly, in the second case I still see write in the backtrace:

  Encoding::UndefinedConversionError ("\xE5" from ASCII-8BIT to UTF-8):
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `write'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `putc'
    /Users/david/.rvm/gems/ruby-2.7.2/bundler/gems/ucblit-tind-de599ab253cc/lib/ucblit/tind/api/api.rb:108:in `block (3 levels) in copying_thread'

I take it this write failure (and probably the others) happens somewhere in the C code for IO?

Tags: ruby-on-rails, ruby, encoding, io


A:

0 votes Christian Bruckmayer 2/26/2021 #1

Rails sets both default encodings to UTF-8:

Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

https://github.com/rails/rails/blob/291a3d2ef29a3842d1156ada7526f4ee60dd2b59/railties/lib/rails.rb#L22-L23
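
To check how those defaults reach a pipe, one option is to inspect the encoding of each end. This is only a diagnostic sketch: pipe_encodings is a hypothetical helper name, and the values it reports depend on the process it runs in (plain script versus Rails).

```ruby
# Hypothetical helper: report the process-wide defaults and the
# external encoding of each end of a pipe created the same way
# as in the question.
def pipe_encodings
  IO.pipe(Encoding::BINARY, Encoding::BINARY) do |rd, wr|
    {
      default_external: Encoding.default_external,
      default_internal: Encoding.default_internal,
      read_end: rd.external_encoding,
      write_end: wr.external_encoding
    }
  end
end

p pipe_encodings
```

Running this once from the script and once from a Rails console should show whether the write end differs between the two environments.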

I believe you need to set the encoding on the write end of the pipe explicitly, otherwise it will use the default encoding.

read_io, write_io = IO.pipe(Encoding::BINARY, Encoding::BINARY, binmode: true)
write_io.set_encoding(Encoding::BINARY)

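# serialized_object stands in for the binary String being sent.
# IO#write has no encoding: option (extra arguments are written as data).
write_io.write(serialized_object)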
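
Applied to a pipe like the one in the question, the idea might look like the sketch below. It assumes, as far as I can tell from the behaviour described, that the encoding arguments to IO.pipe only affect the read end, so the write end is put into binary mode with IO#binmode. The names binary_pipe and roundtrip_under_utf8_defaults are made up for illustration, and the second method only simulates the Rails defaults so the sketch can run outside Rails.

```ruby
# Sketch: a pipe whose write end is explicitly binary, so writes are
# not transcoded even when Encoding.default_internal is UTF-8.
def binary_pipe
  IO.pipe(Encoding::BINARY, Encoding::BINARY) do |rd, wr|
    wr.binmode # disable encoding conversion on the write end
    yield rd, wr
  end
end

# Illustration only: temporarily mimic the Rails defaults, push some
# non-UTF-8 bytes through the pipe, and return what was read back.
def roundtrip_under_utf8_defaults(bytes)
  saved_external = Encoding.default_external
  saved_internal = Encoding.default_internal
  Encoding.default_external = Encoding::UTF_8
  Encoding.default_internal = Encoding::UTF_8

  binary_pipe do |rd, wr|
    wr.write(bytes)
    wr.close # signal EOF so rd.read returns
    rd.read
  end
ensure
  # Restore the process-wide defaults changed above.
  Encoding.default_external = saved_external
  Encoding.default_internal = saved_internal
end

data = roundtrip_under_utf8_defaults("\xE5\x00\xFF".b)
p data.bytes
```

Calling wr.set_encoding(Encoding::BINARY) instead of wr.binmode, as in the lines above, should serve the same purpose.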