Error from tabula-java: Error: Error: Header doesn't contain versioninfo

Asked by: mayk.dyasper | Asked: 3/10/2023 | Last edited by: mayk.dyasper | Updated: 3/10/2023 | Views: 135

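Q:

I have a script that parses PDF files. It runs fine on my WSL setup, but after deploying it to CentOS 7 I get the error shown further down.

I am using tabula-py.

Python version: 3.6
Java version: 11

Searching for the error message turned up nothing. Can anyone help?

Here is a code sample: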

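from tabula import read_pdf

# pdf_file holds the path to the PDF being parsed (set elsewhere in the script)
# stream takes a boolean; the string "True" only worked because it is truthy
df = read_pdf(pdf_file, pages="1", stream=True)
print(df)
exit()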

This is the error I get:

Error from tabula-java:
Error: Error: Header doesn't contain versioninfo


Traceback (most recent call last):
  File "pdf_data/pdf_invoice.py", line 389, in <module>
    parsePDFInvoice()
  File "pdf_data/pdf_invoice.py", line 203, in parsePDFInvoice
    df = read_pdf(pdf_file, pages="1", stream="True", guess="False")
  File "/usr/local/lib/python3.6/site-packages/tabula/io.py", line 322, in read_pdf
    output = _run(java_options, kwargs, path, encoding)
  File "/usr/local/lib/python3.6/site-packages/tabula/io.py", line 85, in _run
    check=True,
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', '/usr/local/lib/python3.6/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--pages', '1', '--stream', '--guess', '--format', 'JSON', '/tmp/22e8f8db-c0c6-4f05-9cd0-6a821d0151a0.pdf']' returned non-zero exit status 1.
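
One check that may help narrow this down: the message appears to come from the PDF parser that tabula-java builds on (Apache PDFBox), and it is typically raised when the input does not begin with a `%PDF-` version header. The sketch below tests for that marker near the start of the file. `has_pdf_header` and its `scan_bytes` parameter are hypothetical names used for illustration, not part of tabula-py, and the 1024-byte window is an assumption.

```python
def has_pdf_header(path, scan_bytes=1024):
    """Return True if a '%PDF-' marker appears in the first scan_bytes bytes.

    Hypothetical diagnostic helper, not part of tabula-py. The 1024-byte
    window is an assumption, chosen to tolerate a little leading junk.
    """
    with open(path, "rb") as f:
        head = f.read(scan_bytes)
    return b"%PDF-" in head
```

Calling `has_pdf_header(pdf_file)` on the CentOS machine just before `read_pdf` would show whether the file there is a real PDF. A `False` result would suggest the file is empty, truncated, or something else entirely (for example an HTML error page saved under a `.pdf` name), rather than a problem in tabula itself.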
Tags: python java tabula pdf-parsing tabula-py

A: No answers yet.