Asked by: ffi23 Asked: 3/28/2023 Updated: 3/29/2023 Views: 489
Use smart_open to download a .gz stream from HTTP and upload it to an S3 bucket
Q:
I want to stream-download a .txt.gz file over HTTP and stream it straight into an S3 bucket. I've gotten this far, but it doesn't work. What am I missing?
from smart_open import open as sopen

chunk_size = 16 * 1024 * 1024
http_url = 'http://someurl'

with sopen(http_url, 'rb', transport_params={'headers': {'Subscription-Key': 'somekey'}}) as fin:
    with sopen('s3://bucket/filename.txt.gz', 'wb') as fout:
        while True:
            buf = fin.read(chunk_size)
            if not buf:
                break
            fout.write(chunk_size)
A:
1 upvote
ffi23
3/29/2023
#1
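It turns out I was making this much more complicated than it needed to be.
I'm not sure, though, whether smart_open is decompressing and recompressing the file under the hood.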
from smart_open import open as sopen

http_url = 'http://someurl'

with sopen(http_url, 'rb', transport_params={'headers': {'Subscription-Key': 'somekey'}}) as fin:
    with sopen('s3://bucket/filename.txt.gz', 'wb') as fout:
        for line in fin:
            fout.write(line)
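
Two notes on the snippets above, offered as a sketch rather than a definitive fix. First, the loop in the question calls fout.write(chunk_size), which passes an integer instead of the bytes in buf, and that is likely why it failed. Second, as far as I understand it, smart_open decides whether to compress or decompress from the file extension by default, so writing to a key ending in .gz would gzip the data again on the way out. Since smart_open 6.0 the compression='disable' argument turns this off, so the bytes pass through untouched. The helper names copy_stream and http_gz_to_s3 below are hypothetical, and the example assumes the HTTP response body is already a gzip file.

```python
import io

CHUNK_SIZE = 16 * 1024 * 1024  # 16 MiB, the chunk size used in the question


def copy_stream(fin, fout, chunk_size=CHUNK_SIZE):
    """Copy fin to fout in fixed-size chunks; return the number of bytes copied."""
    total = 0
    while True:
        buf = fin.read(chunk_size)
        if not buf:
            break
        fout.write(buf)  # write the bytes that were read, not chunk_size
        total += len(buf)
    return total


def http_gz_to_s3(http_url, s3_url, headers=None):
    """Hypothetical helper: stream an already-gzipped HTTP body to S3 unchanged."""
    # Imported here so copy_stream can be tried without smart_open installed.
    from smart_open import open as sopen

    transport_params = {'headers': headers} if headers else None
    # compression='disable' (assumes smart_open >= 6.0) switches off the
    # extension-based compression handling on both the read and write side.
    with sopen(http_url, 'rb', compression='disable',
               transport_params=transport_params) as fin:
        with sopen(s3_url, 'wb', compression='disable') as fout:
            return copy_stream(fin, fout)


# Local check with in-memory streams (no network or S3 access needed).
src = io.BytesIO(b"x" * 100)
dst = io.BytesIO()
copied = copy_stream(src, dst, chunk_size=32)
```

With real endpoints the call would look like http_gz_to_s3('http://someurl', 's3://bucket/filename.txt.gz', headers={'Subscription-Key': 'somekey'}). Reading in fixed-size chunks also avoids iterating a binary gzip stream "line by line", where the line boundaries are arbitrary.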