Asked by: user22062084 Asked: 6/13/2023 Updated: 6/13/2023 Views: 35
Beautiful Soup only gets header of table
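Q:
I am trying to import the data from the table on this website into a CSV file: http://www.ameren.com/illinois/residential/supply-choice/renewables/interconnection-queue.
I have tried many different solutions that users have posted on this forum in the past, but none of them seem to work. I believe the problem lies in the HTML itself, but I would like to know whether there is a way around it.
I also tried the same code on a different website, where it ran successfully.
This is the code I currently have: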
from bs4 import BeautifulSoup
import requests
from pprint import pprint
page = requests.get('http://www.ameren.com/illinois/residential/supply-choice/renewables/interconnection-queue')
soup = BeautifulSoup(page.content,'html5lib')
table = soup.find("table")
comments = [[td.get_text() for td in row.find_all('td')] for row in table.find_all('tr')]
pprint(comments)
The output is: [[]]
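An output of `[[]]` means that exactly one `<tr>` was found and that it contained no `<td>` cells, which is what a header-only table looks like when the body rows are filled in later by JavaScript. The following is a minimal sketch of that situation using only the standard library; `STATIC_HTML` and `CellCounter` are hypothetical stand-ins, not the actual markup of the page.

```python
from html.parser import HTMLParser

# Hypothetical static markup: a header row only. The real page's HTML is an
# assumption here; the point is that no <td> exists before JavaScript runs.
STATIC_HTML = """
<table id="queue">
  <thead><tr><th>Project ID</th><th>Status</th></tr></thead>
  <tbody></tbody>
</table>
"""


class CellCounter(HTMLParser):
    """Count <tr>, <th> and <td> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {"tr": 0, "th": 0, "td": 0}

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1


counter = CellCounter()
counter.feed(STATIC_HTML)
print(counter.counts)  # {'tr': 1, 'th': 2, 'td': 0}
```

With one row and zero `<td>` cells, a comprehension over `row.find_all('td')` yields a single empty list.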
A:
0 votes
MendelG
6/13/2023
#1
If you inspect the network calls in your browser (press F12 -> Network tab), you will see that the page makes a POST request to:
https://www.ameren.com/api/ameren/InterConnectReport/InterConnectReportSearchResults
You can emulate that request and get all the data as a dict, where you can access the keys and values. You don't actually need BeautifulSoup; requests is enough:
import requests
data = {
    "draw": "1",
    "columns[0][data]": "projectId",
    "columns[0][name]": "",
    "columns[0][searchable]": "true",
    "columns[0][orderable]": "true",
    "columns[0][search][value]": "",
    "columns[0][search][regex]": "false",
    "columns[1][data]": "propertyType",
    "columns[1][name]": "",
    "columns[1][searchable]": "true",
    "columns[1][orderable]": "false",
    "columns[1][search][value]": "",
    "columns[1][search][regex]": "false",
    "columns[2][data]": "totalGeneratorRating",
    "columns[2][name]": "",
    "columns[2][searchable]": "true",
    "columns[2][orderable]": "false",
    "columns[2][search][value]": "",
    "columns[2][search][regex]": "false",
    "columns[3][data]": "currentStatus",
    "columns[3][name]": "",
    "columns[3][searchable]": "true",
    "columns[3][orderable]": "false",
    "columns[3][search][value]": "",
    "columns[3][search][regex]": "false",
    "columns[4][data]": "substationId",
    "columns[4][name]": "",
    "columns[4][searchable]": "true",
    "columns[4][orderable]": "true",
    "columns[4][search][value]": "",
    "columns[4][search][regex]": "false",
    "columns[5][data]": "subQueue",
    "columns[5][name]": "",
    "columns[5][searchable]": "true",
    "columns[5][orderable]": "true",
    "columns[5][search][value]": "",
    "columns[5][search][regex]": "false",
    "columns[6][data]": "feederId",
    "columns[6][name]": "",
    "columns[6][searchable]": "true",
    "columns[6][orderable]": "true",
    "columns[6][search][value]": "",
    "columns[6][search][regex]": "false",
    "columns[7][data]": "feederQueue",
    "columns[7][name]": "",
    "columns[7][searchable]": "true",
    "columns[7][orderable]": "true",
    "columns[7][search][value]": "",
    "columns[7][search][regex]": "false",
    "order[0][column]": "0",
    "order[0][dir]": "asc",
    "start": "0",
    "length": "25",
    "search[value]": "",
    "search[regex]": "false",
    "ProjectId": "",
    "PropertyType": "",
    "Substation": "",
    "Feeder": "",
}
# Send the same form-encoded POST the page makes and decode the JSON reply.
response = requests.post(
    "https://www.ameren.com/api/ameren/InterConnectReport/InterConnectReportSearchResults",
    data=data,
).json()

# One left-aligned, 20-character-wide slot per column.
fmt_string = "{:<20} {:<20} {:<20} {:<20} {:<20} {:<20} {:<20} {:<20}"
columns = [x["data"] for x in response["columns"]]
print(fmt_string.format(*columns))

# Use `row` as the loop variable so the request payload `data` is not shadowed.
for row in response["data"]:
    print(
        fmt_string.format(
            row["projectId"],
            row["propertyType"],
            row["totalGeneratorRating"],
            row["currentStatus"],
            row["substationId"],
            row["subQueue"],
            row["feederId"],
            row["feederQueue"],
        )
    )
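The hand-written payload above repeats the same six fields for each of the eight columns, so it can also be generated. The sketch below rebuilds the same 59 keys from a list of column names; `build_payload` and `COLUMNS` are hypothetical helper names, and whether the server honors other `start` and `length` values for paging is an assumption that has not been checked against the API.

```python
def build_payload(columns, start=0, length=25):
    """Build the form fields shown above from (field_name, orderable) pairs.

    `start` and `length` mirror the keys in the original payload; treating
    them as paging controls is an assumption about the server.
    """
    payload = {"draw": "1"}
    for i, (name, orderable) in enumerate(columns):
        prefix = f"columns[{i}]"
        payload[f"{prefix}[data]"] = name
        payload[f"{prefix}[name]"] = ""
        payload[f"{prefix}[searchable]"] = "true"
        payload[f"{prefix}[orderable]"] = "true" if orderable else "false"
        payload[f"{prefix}[search][value]"] = ""
        payload[f"{prefix}[search][regex]"] = "false"
    payload.update(
        {
            "order[0][column]": "0",
            "order[0][dir]": "asc",
            "start": str(start),
            "length": str(length),
            "search[value]": "",
            "search[regex]": "false",
            "ProjectId": "",
            "PropertyType": "",
            "Substation": "",
            "Feeder": "",
        }
    )
    return payload


# Column names and orderable flags copied from the payload in the answer.
COLUMNS = [
    ("projectId", True),
    ("propertyType", False),
    ("totalGeneratorRating", False),
    ("currentStatus", False),
    ("substationId", True),
    ("subQueue", True),
    ("feederId", True),
    ("feederQueue", True),
]

payload = build_payload(COLUMNS)
print(len(payload))  # 59
```

The resulting dict can be passed as `data=payload` to `requests.post` in place of the literal one.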
Prints:
projectId            propertyType         totalGeneratorRating currentStatus        substationId         subQueue             feederId             feederQueue
DER-04311            Level 2              0.2                  Construction         146195               1                    A27011               1
DER-05094            Level 2              1.14                 Construction         975902               2                    M07235               1
DER-06422            Level 2              2.0                  Construction         322219               2                    Q10702               1
DER-06608            Level 2              2.0                  Construction         565083               3                    B19001               1
DER-06609            Level 2              2.0                  Construction         458366               2                    C00001               1
DER-06709            Level 2              1.25                 Witness Test         520122               1                    B68003               1
DER-06784            Level 2              0.5                  Witness Test         771055               2                    Q85162               1
DER-06786            Level 2              0.5                  Witness Test         890156               2                    L80221               1
DER-06837            Level 2              2.0                  Construction         674509               3                    N60172               1
DER-06866            Level 4              5.0                  Construction         765864               1                    C76002               1
DER-06868            Level 4              5.0                  Construction         505417               2                    J39392               1
DER-06876            Level 4              5.0                  Construction         911349               1                    Y89540               1
DER-06986            Level 2              1.95                 Construction         469483               1                    M05368               1
DER-06998            Level 2              2.0                  Construction         505417               3                    J58380               1
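Since the original goal was a CSV file, the rows can be handed to the standard `csv` module instead of being printed. This is a minimal sketch: `rows_to_csv` and `FIELDS` are hypothetical names, the two sample rows are copied from the output above, and the value types (strings versus numbers) are an assumption because the raw JSON is not shown.

```python
import csv
import io

# Column order taken from the header line of the printed output.
FIELDS = [
    "projectId",
    "propertyType",
    "totalGeneratorRating",
    "currentStatus",
    "substationId",
    "subQueue",
    "feederId",
    "feederQueue",
]


def rows_to_csv(rows, fieldnames=FIELDS):
    """Serialize a list of row dicts to CSV text, ignoring any extra keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Two rows copied from the printed output; value types are assumed.
sample_rows = [
    {"projectId": "DER-04311", "propertyType": "Level 2",
     "totalGeneratorRating": 0.2, "currentStatus": "Construction",
     "substationId": "146195", "subQueue": 1,
     "feederId": "A27011", "feederQueue": 1},
    {"projectId": "DER-05094", "propertyType": "Level 2",
     "totalGeneratorRating": 1.14, "currentStatus": "Construction",
     "substationId": "975902", "subQueue": 2,
     "feederId": "M07235", "feederQueue": 1},
]

csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

With live data, `rows_to_csv(response["data"])` would produce the text to write to a file, for example with `open("queue.csv", "w", newline="")`.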