Converting PDF to CSV in Python

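Question:

I have a 36-page PDF that I want to convert to CSV. The file has row names and attribute columns. I am using tabula, and the code below converts the first page of the PDF to CSV. But when I try to include more pages, the output gets messy and the conversion does not work properly. How should I change my code so that all 36 pages of the PDF end up in the CSV?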

import tabula

file = "data.pdf"

pdfData = tabula.read_pdf(file, pages=1)

output = tabula.convert_into(file, "converted.csv", output_format="csv", pages=1)

print(output)

python-3.x csv pdf tabula

If it helps, try something like this:

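import pandas as pd
import tabula

file = "data.pdf"
all_data = []
for page in range(1, 37):  # pages 1 through 36
    # read_pdf returns a list of DataFrames by default, one per table found on the page
    tables = tabula.read_pdf(file, pages=page)
    all_data.extend(tables)
# stack the tables from every page into a single DataFrame
final_data = pd.concat(all_data, ignore_index=True)
final_data.to_csv("converted.csv", index=False)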

1 upvote Goku - stands with Palestine 10/12/2023 #2

You can use pages="all":

import tabula

file = "data.pdf"

# pages="all" converts every page; convert_into writes converted.csv itself
tabula.convert_into(file, "converted.csv", output_format="csv", lattice=True, stream=False, pages="all")

Documentation links:

https://tabula-py.readthedocs.io/en/latest/tabula.html

https://pypi.org/project/tabula-py/
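
If the combined CSV still looks messy, one possible cause is that the table header is printed at the top of every PDF page and therefore shows up again and again in the output. That is only a guess about this particular file. Under that assumption, a minimal sketch using the standard csv module could drop the repeated header rows after the conversion. The function name drop_repeated_headers and the output file name cleaned.csv are made up for this illustration and are not part of tabula.

```python
import csv


def drop_repeated_headers(src_path, dst_path):
    """Copy a CSV file, keeping the first row as the header and
    skipping any later row that is identical to it.

    Returns the number of data rows written (the header is not counted).
    Assumption: repeated headers match the first row exactly.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to copy

        writer.writerow(header)
        kept = 0
        for row in reader:
            if row == header:
                continue  # another copy of the header, probably from a new page
            writer.writerow(row)
            kept += 1
    return kept
```

After the conversion has produced converted.csv, calling drop_repeated_headers("converted.csv", "cleaned.csv") would write the cleaned copy. Rows are compared exactly, so a header that tabula extracted slightly differently on some page (extra spaces, split cells) would not be removed by this sketch.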
