Selenium VBA: Get the table content that has <div class> elements under <tr> and <td>

Asked by: TCritical  Asked: 8/4/2023  Last edited by: TCritical  Updated: 8/7/2023  Views: 106

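Q:

I want to extract the table content from the HTML below, where the values sit inside <div class> elements. I have tried several approaches but could not find a working solution.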

<table class="ISLogIn AvgTotal" id="invoiceAmountAvgTotalNewDesign" width="100%">
    <tbody><tr>
        <td colspan="8">
            <div class="container invoice-lines-container">
                <div class="list invoice-line-total">
                    <div class="col-xs-4 invoice-line-name">Avg Total</div>
                    <div class="col-xs-6 invoice-line-name"></div>
                    <div class="col-xs-2 invoice-line-value">
                        <strong data-bind="autoNumeric: hasDiscrepancyTotalInvoiceWithDb() ? invoiceAmountDbData().SubTotal : subTotal, autoNumericOptions: $root.moneyNegativeFormat()">135.43</strong>
                    </div>
                </div>
            </div>
        </td>
    </tr>
    <tr>
        <td colspan="8">
            <div class="container invoice-lines-container">
                <div class="list invoice-line-header">
                    <div class="invoice-line-name col-xs-4" data-bind="css: { 'col-xs-3': isEditableAdditionalApprovalCode(), 'col-xs-4': isEditableAdditionalApprovalCode() === false }">
                            <span>Sale </span>
                    </div>
                    <div class="invoice-line-name col-xs-3" data-bind="css: { 'col-xs-2': isEditableAdditionalApprovalCode(), 'col-xs-3': isEditableAdditionalApprovalCode() === false }">
                    </div>
                    <div class="invoice-line-value col-xs-4" data-bind="css: { 'col-xs-6': isEditableAdditionalApprovalCode(), 'col-xs-4': isEditableAdditionalApprovalCode() === false }"></div>
                    <div class="col-xs-1 invoice-line-value">Amount</div>
                </div>
                <div class="list invoice-line-item-total" data-bind="visible: isInternational()" style="display: none;">
                    <div class="col-xs-4 invoice-line-name">Total Sale</div>
                    <div class="col-xs-6 invoice-line-name"></div>
                    <div class="col-xs-2 invoice-line-value">
                    </div>
                </div>
                <div class="list invoice-line-item-total" data-bind="visible: !isInternational()">
                    <div class="col-xs-4 invoice-line-name">Total Sale</div>
                    <div class="col-xs-6 invoice-line-name"></div>
                    <div class="col-xs-2 invoice-line-value">
                        <span data-bind="autoNumeric: hasDiscrepancyTotalInvoiceWithDb() ? invoiceAmountDbData().TotalAmountTax : printTaxAmount(), autoNumericOptions: $root.moneyNegativeFormat()">3.14</span>
                    </div>
                </div>
            </div>
        </td>
    </tr>
    <tr>
        <td colspan="8">
            <div class="container invoice-lines-container">
                <div class="list invoice-line-total">
                    <div class="col-xs-4 invoice-line-name">Total (<span data-bind="text: workInfo.currency">INR</span>)</div>
                    <div class="col-xs-6 invoice-line-name"></div>
                    <div class="col-xs-2 invoice-line-value">
                        <span data-bind="autoNumeric: calculatedTotal(), autoNumericOptions: $root.moneyNegativeFormat()">138.57</span>
                    </div>
                </div>
            </div>
        </td>
    </tr>

</tbody></table>

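Below is my VBA Selenium code, which scrapes all the text of each row into a single cell. I am trying to extract it in table form instead. I am using Chrome and Selenium.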

Option Explicit

Sub table()

Dim d As New ChromeDriver
Dim iM As Object, iMs As Object
Dim S2 As Long, i As Long
    
        d.Get "My URL"
        Application.Wait (Now + TimeValue("00:00:03"))
        With Sh2
            Set iMs = d.FindElementsByCss(".ISLogIn AvgTotal div[class^='list invoice-line']")
            .Activate
            For Each iM In iMs
                S2 = .Cells(Rows.Count, 2).End(xlUp).Row + 1: .Cells(S2, 1).Select
                .Cells(S2, 2) = iM.Text
            Next iM
        End With
        Set iM = Nothing:   Set wOrder = Nothing: Set Sc = Nothing
    
End Sub

Below is the output I am trying to get:

Output required


Please guide me to the correct code. Many thanks.

html vba selenium-webdriver web-scraping

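Comments

0 votes igittr 8/5/2023
Two things: 1) Do you need to qualify your xPath with the table class? If you remove that reference, you will get the data from the sample provided. The title values and the amount values will still need to be separated somehow, though. 2) When computing S2 you look at column 2 but write to column 1, so it will always return the same value.
0 votes TCritical 8/5/2023
@igittr Thanks for your reply. If needed, could you go ahead and remove that reference? I'm stuck. I have made many attempts to get that result. On the second point: yes, I am looking at column 2, because column 1 holds other data...
0 votes TCritical 8/5/2023
@igittr, I changed the code for the second point. It should be column 2. Thanks.
0 votes QHarr 8/5/2023
Use the table element's id. It is also worth noting that there is a built-in method for writing out a table: .AsTable.ToExcel, or, if you have already declared the table as Selenium.Table, just .ToExcel. Then specify the range to write to.
0 votes TCritical 8/6/2023
@igittr, apologies for the late reply. The code works well. But when I run it for multiple URLs, an error message pops up. I have included an image of the error for reference. Could you suggest how I can get past that error?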
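
The two selector suggestions in the comments can be sketched as follows. In the question's selector, `.ISLogIn AvgTotal` contains a space, which CSS treats as a descendant combinator, so it looks for an `<AvgTotal>` tag inside an element with class `ISLogIn` and matches nothing. Scoping by the table's id (or writing `.ISLogIn.AvgTotal` with no space) avoids that. This is only a sketch: it assumes the SeleniumBasic type library is referenced, `"your url"` is a placeholder, the sub name and the target cell `E1` are made up for illustration, and the optional arguments of `ToExcel` may differ between SeleniumBasic versions.

```vba
Option Explicit

' Sketch only: requires a reference to the Selenium Type Library (SeleniumBasic).
Sub ScopeByTableId()
    Dim d As New ChromeDriver
    Dim lineRows As WebElements

    d.Get "your url"                                  ' placeholder
    Application.Wait Now + TimeValue("00:00:03")      ' same fixed pause as in the question

    ' "#id" targets the table directly; ".ISLogIn.AvgTotal" (no space) is the class-based equivalent
    Set lineRows = d.FindElementsByCss( _
        "#invoiceAmountAvgTotalNewDesign div[class^='list invoice-line']")
    Debug.Print lineRows.Count & " line rows found"

    ' QHarr's suggestion: write the whole table out in one call.
    ' The target range E1 is an arbitrary choice for this sketch.
    d.FindElementById("invoiceAmountAvgTotalNewDesign").AsTable.ToExcel ActiveSheet.Range("E1")

    d.Quit
End Sub
```

Because every `<tr>` in this table holds a single `<td colspan="8">`, `ToExcel` will most likely place the name and the amount of a row in the same cell, so the per-`div` approach is still needed to split them into columns.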

A:

1 vote igittr 8/5/2023 #1

This will get you closer to your solution:

Option Explicit

Sub table()
    Dim k As Integer
    Dim d As New ChromeDriver
    Dim iM As WebElement, iMs As Object
    Dim sT As String, iV As Selenium.List
    Dim S2 As Long
    
    d.Get "your url"
    
    ' Fixed pause to give the page time to render its values
    Application.Wait (Now + TimeValue("00:00:03"))
    
    With Sh2
        ' One element per "list invoice-line..." row
        Set iMs = d.FindElementsByCss("div[class^='list invoice-line']")
        .Activate
        
        For Each iM In iMs
            ' Next free row, judged by column 2
            S2 = .Cells(.Rows.Count, 2).End(xlUp).Row + 1
            sT = getTitle(iM)
            Set iV = iM.FindElementsByClass("invoice-line-value")
            
            ' Amount: the first non-empty value cell goes to column 3
            For k = 1 To iV.Count
                If iV(k).Text <> "" Then
                    .Cells(S2, 3) = iV(k).Text
                    Exit For
                End If
            Next
            
            ' Title goes to column 2
            .Cells(S2, 2) = sT
        Next iM
    End With
    
    ' Release object references and close the browser
    Set iM = Nothing: Set iMs = Nothing
    Set iV = Nothing
    d.Quit
End Sub

Function getTitle(iM As WebElement) As String
    Dim iT As WebElement
    
    On Error GoTo notFnd
    
    Set iT = iM.FindElementByClass("col-xs-4")
    getTitle = iT.Text
    
    Exit Function
    
notFnd:
    getTitle = ""
End Function
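
For the multiple-URL case raised in the comments, one possible arrangement is to reuse a single driver and loop over the addresses. The error the asker hit is not shown in the thread, so this is not a confirmed fix, only a sketch of the loop. The sub name `ScrapeSeveralUrls`, the helper `FirstNonEmptyText`, and the URL placeholders are hypothetical; `Sh2` is the sheet code name used in the thread, and the SeleniumBasic type library is assumed to be referenced.

```vba
Option Explicit

' Sketch only: hypothetical names, placeholder URLs, Sh2 taken from the thread.
Sub ScrapeSeveralUrls()
    Dim d As New ChromeDriver
    Dim urls As Variant, u As Variant
    Dim lineItems As WebElements, lineItem As WebElement
    Dim nextRow As Long

    urls = Array("your url 1", "your url 2")          ' placeholders

    For Each u In urls
        d.Get CStr(u)
        Application.Wait Now + TimeValue("00:00:03")  ' same fixed pause as the answer

        ' Same selector as the answer above
        Set lineItems = d.FindElementsByCss("div[class^='list invoice-line']")

        For Each lineItem In lineItems
            ' Next free row, judged by column 2 as in the answer
            nextRow = Sh2.Cells(Sh2.Rows.Count, 2).End(xlUp).Row + 1
            Sh2.Cells(nextRow, 2).Value = FirstNonEmptyText(lineItem, "invoice-line-name")
            Sh2.Cells(nextRow, 3).Value = FirstNonEmptyText(lineItem, "invoice-line-value")
        Next lineItem
    Next u

    d.Quit   ' close the browser once, after the last URL
End Sub

' Returns the trimmed text of the first child with the given class that has
' visible text, or "" when there is none (e.g. a hidden row).
Function FirstNonEmptyText(parent As WebElement, className As String) As String
    Dim parts As WebElements, part As WebElement

    Set parts = parent.FindElementsByClass(className)
    For Each part In parts
        If Len(Trim$(part.Text)) > 0 Then
            FirstNonEmptyText = Trim$(part.Text)
            Exit Function
        End If
    Next part
End Function
```

Looking the title up by `invoice-line-name` instead of `col-xs-4` is a design choice of this sketch: the `data-bind` attributes in the HTML suggest the `col-xs-*` classes can change at run time, while the `invoice-line-name` and `invoice-line-value` classes appear on every row.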

Comments

0 votes igittr 8/7/2023
@TCritical The answer has been updated for the "error".
0 votes TCritical 8/7/2023
Thank you. You have helped me a lot.