beautifulsoup: find_all on bs4.element.ResultSet object or list?

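Question:

I apply find_all on a beautifulsoup object and get back something that is a bs4.element.ResultSet object or a list.

I want to run a further find_all on that result, but it is not allowed on a bs4.element.ResultSet object. I can loop through each element of the bs4.element.ResultSet and call find_all there. But can I avoid the loop and convert the result back into a beautifulsoup object?

Here is my code: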

from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, 'html.parser')

type(soup) #bs4.BeautifulSoup

# do find_all on beautifulsoup object
th_all = soup.find_all('th')

# the result is of type bs4.element.ResultSet or similarly list
type(th_all) #bs4.element.ResultSet
type(th_all[0:1]) #list

# now I want to further do find_all
th_all.find_all(text='A') # does not work: raises AttributeError

# can I avoid this need of loop?
for th in th_all:
    th.find_all(text='A') #works

Tags: python html beautifulsoup html-parsing

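Answer:

In general, CSS selectors can help you solve it in a single go, but not everything you can do with find_all() can be done with the select() method. For example, standard CSS selectors have no "text" search. But if, for instance, you had to find all, say, b elements inside th elements, you could do:

soup.select("th b")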
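A minimal sketch of that descendant-selector idea. The markup here is hypothetical (the table in the question has no b elements), and the names html_2, bold_in_th and bold_texts are only illustrative:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two of the header cells wrap their label in <b>.
html_2 = """
<table>
    <thead>
        <tr>
            <th><b>A</b></th>
            <th>B</th>
            <th><b>C</b></th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_2, 'html.parser')

# "th b" matches every <b> that is a descendant of a <th>,
# so no explicit loop over the <th> elements is needed.
bold_in_th = soup.select("th b")
bold_texts = [b.get_text() for b in bold_in_th]
print(bold_texts)  # ['A', 'C']
```

The selector does the nesting in one call, which is why it can replace a find_all followed by a per-element find_all when the second search is by tag rather than by text.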

from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, 'html.parser')

th_all = soup.find_all('th', string='A')  # [<th>A</th>]

texts = [th.string for th in th_all]      # ['A']

To answer the second part of the question:

How do we convert a ResultSet into a BeautifulSoup object?

We can explicitly convert it into one. Then we can call find_all() on it.

th_all = soup.find_all('th')
soup2 = BeautifulSoup('\n'.join(map(str, th_all)), 'html.parser')
soup2.find_all(string='A')   # ['A']

However, since we can already search the elements of the ResultSet directly, this conversion is probably not worthwhile in this case.
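For comparison, here is a rough sketch of searching the ResultSet elements directly, without re-parsing. It still iterates under the hood, but a comprehension keeps it to one expression. The name matches is only illustrative:

```python
from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, 'html.parser')
th_all = soup.find_all('th')

# Call find_all() on each element of the ResultSet and flatten
# the per-element results into a single list.
matches = [hit for th in th_all for hit in th.find_all(string='A')]
print(matches)  # ['A']
```

This avoids serializing the elements to a string and parsing them a second time, and the matched nodes remain attached to the original tree.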
