JSoup parse web page to read thead and tbody of Table [duplicate]

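Q:

This question already has answers here:

Page content is loaded with JavaScript and Jsoup doesn't see it (8 answers)

Jsoup Java HTML parser: executing Javascript events (2 answers)

Closed 11 months ago.

Need: parse a web page and read the details shown in a table (web page to be parsed - link).

Problem: I am unable to get the details under the tbody section. Currently, I can only get the thead details.

Research: I checked this link on Stack Overflow, but could not work it out for my case.

The HTML table I need to extract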

<table id="industryInfo" class="eq-series tbl-securityinfo cap-hide">
           <caption></caption>
           <thead>
                    <tr>
                          <th>Macro-Economic Sector</th>
                          <th>Sector</th>
                          <th>Industry</th>
                          <th>Basic Industry</th>
                    </tr>
           </thead>
           <tbody class="">
                    <tr>
                         <td>Commodities</td>
                         <td>Metals & Mining</td>
                         <td>Ferrous Metals</td>
                         <td>Pig Iron</td>
                    </tr>
           </tbody>
</table>

Code:

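    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String url = "https://www.nseindia.com/get-quotes/equity?symbol=ADANIENT";
    // Empty placeholder document; replaced if the fetch succeeds
    Document document = new Document(url);
    try {
        document = Jsoup.connect(url).userAgent("Mozilla/5.0").get();
    } catch (IOException e) {
        e.printStackTrace();
    }
    // System.out.println(document);
    // Select the table by id and print it as Jsoup received it
    Elements elements = document.select("#industryInfo");
    for (Element element : elements) {
        System.out.println(element);
    }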

I hope the problem is clear; any pointers on what I am missing would be helpful.

Java jsoup html parsing

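It looks like the content of that page is added dynamically by a script (usually JavaScript) after the page loads. Jsoup is not a browser emulator and does not execute JavaScript. Either switch to a tool that supports JavaScript, such as Selenium WebDriver, or use your browser's developer tools to see which request the page makes to load that information. Once you know it, you can try reading the data from the same place.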
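The second option, reading from the same place the page's script reads from, could be sketched roughly as follows with the JDK's own `java.net.http.HttpClient` (Java 11+). This is only a sketch under assumptions: the class name `DataEndpointFetch`, the helper names, and the header values are made up for illustration, and the endpoint is whatever URL you find in the browser's Network tab. A site may also expect cookies or extra headers that are not handled here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class DataEndpointFetch {

    // Builds a GET request resembling the call the page's script makes.
    // The header values are illustrative assumptions, not known requirements.
    static HttpRequest buildRequest(String endpoint) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(10))
                .header("User-Agent", "Mozilla/5.0")
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body as text.
    static String fetchBody(String endpoint) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(endpoint), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Pass the endpoint URL copied from the browser's Network tab.
        if (args.length == 0) {
            System.err.println("usage: DataEndpointFetch <endpoint-url>");
            return;
        }
        System.out.println(fetchBody(args[0]));
    }
}
```

If the body comes back empty or with an error status, compare the request headers shown in the developer tools with the ones sent here.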

1 upvote Pshemo 12/29/2022

From what I can see, https://www.nseindia.com/api/quote-equity?symbol=ADANIENT returns JSON that (also) contains the data for that table. Parse it and search for what you want.
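As a rough illustration of "parse it and search for what you want", the sketch below pulls a few string fields out of a JSON text. Everything specific here is an assumption: the sample JSON is hand-written to mirror the table in the question, and the key names (`industryInfo`, `macro`, `sector`, `industry`, `basicIndustry`) are hypothetical, so check them against the actual response. To stay free of third-party dependencies it uses a small regex that only handles flat `"key":"value"` pairs; a real JSON library would be the sturdier choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndustryInfoExtract {

    // Returns the first string value stored under the given key, if any.
    // Deliberately simple: no support for escaped quotes or non-string values.
    static Optional<String> stringField(String json, String key) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real response shape and key names may differ.
        String sample = "{\"industryInfo\":{\"macro\":\"Commodities\","
                + "\"sector\":\"Metals & Mining\",\"industry\":\"Ferrous Metals\","
                + "\"basicIndustry\":\"Pig Iron\"}}";
        for (String key : new String[] {"macro", "sector", "industry", "basicIndustry"}) {
            System.out.println(key + " = " + stringField(sample, key).orElse("(missing)"));
        }
    }
}
```

The lookup returns an `Optional` so that a renamed or absent key shows up as a missing value rather than a `NullPointerException`.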

A: No answers yet
