Asked by: defalt1996 Asked: 4/23/2023 Updated: 4/30/2023 Views: 98
How to use HtmlUnit to record all requests fired while rendering a page?
Q:
I am using HtmlUnit to try to record all requests fired when loading a local HTML file. Here is the test file:
<script type="text/javascript">
!(function () {
var adc = function (str) {
return decodeURIComponent(escape(window.atob(str)));
};
document.write(
adc(
"PGEgaHJlZj0iaHR0cHM6Ly9kc3A4dTRqaGE0NHJyLmNsb3VkZnJvbnQubmV0L2RpcmVjdC8xOTgyMzQxMDE/YWR4PUFsZ29yaXgoUHJvKSZhcHA9MjgxOTQwMjkyJnByaWNlPTAuOTEwMSZyZD1MVldrTUJRTUxGQzhrIiB0YXJnZXQ9Il9ibGFuayI+PGltZyBzcmM9Imh0dHBzOi8vZHNwOHU0amhhNDRyci5jbG91ZGZyb250Lm5ldC9pbXAvMTk4MjM0MTAxP2FkeD1BbGdvcml4KFBybykmYXBwPTI4MTk0MDI5MiZwcmljZT0wLjkxMDEmcmQ9TFZXa01CUU1MRkM4ayIgd2lkdGg9IjMyMCIgaGVpZ2h0PSI1MCI+PC9hPjxpbWcgc3JjPSJodHRwczovL2QybWsybmg4dmZmNzY4LmNsb3VkZnJvbnQubmV0L3YxL3BpeGVsP2E9MTAyOSZiPTEwNDYmYz0xJmQ9ZmZkNDkyNTFjYjkwOGI5NSZlPTU4Njg0YWYzN2Q0MTFhNGImZj0wLjkxMDEmZz0wLjkxMDE0Jmg9MTA0NSZpPWZiMTJmZjY5ZjJkZmFiZjAmaz04MzM1NTEzMTAxMDI5ODI2NjAmcmQ9TFZXa01CUU1MRkM4ayIgYm9yZGVyPSIwIiB3aWR0aD0iMSIgaGVpZ2h0PSIxIi8+"
)
);
document.write(
adc(
"PGltZyBzcmM9Imh0dHBzOi8vdXNlLnRyay5zdnItYWxnb3JpeC5jb20vaW1wP2NycHY9MyZpbmZvPTlFbVpwWkNNdWdETnVFak14NHlNeTBEY3BWbkp5a2pNd1FUT3hnak05UVdkaVpDTTlRSGR5Tm5KeDBEZDBsbVltRVRQdEJuWW1Bek45STNjeU5uSngwVGJtQm5KdzBEYzRWbUp3MFRhd0ZtSngwVGUwRm1KelFUTndZVFBrbDJjbWdUTTVJak41RURPMkVUUDBKbkp4a2pMdzBUYmhaU000TWpOdUFUUHRaU013RVRPdUFUUHRKbUp3a3pNOWtuWW1FMFVWMXpZbUV6TTJRek54MERjbVEyTmhKV04xUWpNelUyTTNnell3Z1RZalZHTndVR09rSlRZNVVUTW1oVFo5RW5jJnByaWNlPSR7QVVDVElPTl9QUklDRX0mcz02MDU0MyZyPWU4ZjE1OWEyZDhlMDRlY2E4MGM4NzNlMzI0NTViYTdkIiB3aWR0aD0iMSIgaGVpZ2h0PSIxIiBzdHlsZT0iZGlzcGxheTpub25lOyI+PGRpdiBpZD0iZG9qczIwMTJiMDVhIiBkYXRhLXdpZHRoPSIzMjAiIGRhdGEtaGVpZ2h0PSI1MCIgZGF0YS10cms9J2h0dHBzOi8vdXNlLnRyay5zdnItYWxnb3JpeC5jb20vaW1wP2NycHY9MyZpbmZvPTlFbVpwWkNNdWdETnVFak14NHlNeTBEY3BWbkp5a2pNd1FUT3hnak05UVdkaVpDTTlRSGR5Tm5KeDBEZDBsbVltRVRQdEJuWW1Bek45STNjeU5uSngwVGJtQm5KdzBEYzRWbUp3MFRhd0ZtSngwVGUwRm1KelFUTndZVFBrbDJjbWdUTTVJak41RURPMkVUUDBKbkp4a2pMdzBUYmhaU000TWpOdUFUUHRaU013RVRPdUFUUHRKbUp3a3pNOWtuWW1FMFVWMXpZbUV6TTJRek54MERjbVEyTmhKV04xUWpNelUyTTNnell3Z1RZalZHTndVR09rSlRZNVVUTW1oVFo5RW5jJnByaWNlPSR7QVVDVElPTl9QUklDRX0mcz02MDU0MyZyPWU4ZjE1OWEyZDhlMDRlY2E4MGM4NzNlMzI0NTViYTdkJyBkYXRhLWlkPSdBbGdvcmlYLWU4ZjE1OWEyZDhlMDRlY2E4MGM4NzNlMzI0NTViYTdkJz48c2NyaXB0IHR5cGU9J3RleHQvamF2YXNjcmlwdCcgYXN5bmMgc3JjPSJodHRwczovL3Ryay5zdnItYWxnb3JpeC5jb20vc3RhdGljL200LmpzP3Q9OTM0NDIzIj48L3NjcmlwdD48L2Rpdj4="
).replace(new RegExp(adc("XCR7QVVDVElPTl9QUklDRX0="), "g"), "0.6381")
);
})();
</script>
<img
src="https://use.trk.svr-algorix.com/win?crpv=3&info=9EmZpZCMugDNuEjMx4yMy0DcpVnJykjMwQTOxgjM9QWdiZCM9QHdyNnJx0Dd0lmYmETPtBnYmAzN9I3cyNnJx0TbmBnJw0Dc4VmJw0TawFmJx0Te0FmJzQTNwYTPkl2cmgTM5IjN5EDO2ETP0JnJxkjLw0TbhZSM4MjNuATPtZSMwETOuATPtJmJwkzM9knYmE0UV1zYmEzM2QzNx0DcmQ2NhJWN1QjMzU2M3gzYwgTYjVGNwUGOkJTY5UTMmhTZ9Enc&price=0.6381&s=60543&r=e8f159a2d8e04eca80c873e32455ba7d"
width="1"
height="1"
style="display: none"
/>
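When the file is rendered in Chrome, the Network tab shows the list of tracking URLs.
Including the local file itself, 7 requests are fired in total. This is what I expect to see in the output printed by my code.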
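To cross-check the list from Chrome, the Base64 payloads passed to adc() can be decoded outside the browser. The sketch below is a rough Java counterpart of adc() (Base64 to bytes to UTF-8 text) plus a simple regex for src attributes. The class name PayloadUrls, the example.com payload, and the regex are assumptions made for illustration; none of them come from the test file or from HtmlUnit. Requests that the loaded scripts fire on their own would not show up this way.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HtmlUnit or of the test file.
class PayloadUrls {
    // Matches src="..." or src='...' attribute values (a simplification, not an HTML parser).
    private static final Pattern SRC = Pattern.compile("\\bsrc\\s*=\\s*([\"'])(.*?)\\1");

    // Rough Java counterpart of the page's adc(): Base64 -> bytes -> UTF-8 text.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    // Collects every src attribute value found in the given markup, in document order.
    static List<String> srcUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up payload standing in for the long Base64 strings in the test file.
        String html = "<img src=\"https://example.com/imp?price=${AUCTION_PRICE}\">"
                + "<script src='https://example.com/static/tag.js'></script>";
        String payload = Base64.getEncoder()
                .encodeToString(html.getBytes(StandardCharsets.UTF_8));

        // Same substitution the page applies before document.write().
        String decoded = decode(payload).replace("${AUCTION_PRICE}", "0.6381");
        srcUrls(decoded).forEach(System.out::println);
    }
}
```

Pasting one of the real payload strings into decode() in place of the made-up one should list the img and script URLs that the corresponding document.write() call injects.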
My code is as follows:
public class RenderHTML extends WebConnectionWrapper {
static List<String> list = new ArrayList<String>();
public RenderHTML(WebClient webClient) throws IllegalArgumentException {
super(webClient);
}
@Override
public WebResponse getResponse(WebRequest request) throws IOException {
// Log the URL of the request
System.out.println(request.getUrl().toString());
return super.getResponse(request);
}
public static void main(String[] args) throws IOException {
try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
// Wrap the client with the URLRecorder
webClient.getOptions().setJavaScriptEnabled(true);
webClient.waitForBackgroundJavaScriptStartingBefore(100_000);
webClient.waitForBackgroundJavaScript(100_000);
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setRedirectEnabled(true);
webClient.getOptions().setUseInsecureSSL(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.setAjaxController(new AjaxController());
webClient.getCookieManager().setCookiesEnabled(true);
webClient.setWebConnection(new RenderHTML( webClient));
// Load the local HTML file
HtmlPage page = webClient.getPage("file:///Users/derrickguo/work/project/project_java/analyze_demand_tool_maven/src/main/lib/algorix_us_adm.html");
}
}
}
Only one request is fired, and then the process finishes.
Can anyone help me get all of the fired requests? Thanks!
A:
0 votes
RBRi
4/30/2023
#1
HtmlUnit is a headless browser and does not download images by default. You can enable image downloads with:
webClient.getOptions().setDownloadImages(true);
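I ran some tests with version 3.1.0 and was able to see all the requests.
Keep in mind that the methods waitForBackgroundJavaScriptStartingBefore() and waitForBackgroundJavaScript() are not options. You have to call them after getting the page or after a click (but this is not needed in your case).
My test code: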
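// Imports for HtmlUnit 3.x (org.htmlunit); 2.x releases use com.gargoylesoftware.htmlunit instead.
import java.io.IOException;
import org.htmlunit.BrowserVersion;
import org.htmlunit.WebClient;
import org.htmlunit.WebRequest;
import org.htmlunit.WebResponse;
import org.htmlunit.html.HtmlPage;
import org.htmlunit.util.WebConnectionWrapper;

public class Issue76084456 extends WebConnectionWrapper {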
public Issue76084456(WebClient webClient) throws IllegalArgumentException {
super(webClient);
}
@Override
public WebResponse getResponse(WebRequest request) throws IOException {
// Log the URL of the request
System.out.println("#######" + request.getUrl().toString());
return super.getResponse(request);
}
public static void main(String[] args) throws IOException {
try (WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
// Wrap the client with the URLRecorder
webClient.setWebConnection(new Issue76084456(webClient));
webClient.getOptions().setDownloadImages(true);
// Load the local HTML file
HtmlPage page = webClient.getPage("file:///C:/RBRi/htmlunit/algorix_us_adm.html");
}
}
}
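If the goal is to collect the URLs rather than print them (the question declares a list but never fills it), getResponse() could hand each URL to a small recorder before delegating to super.getResponse(request). A minimal sketch, assuming a hypothetical RequestLog class that is not part of HtmlUnit. It uses a synchronized list on the assumption that some requests may be issued from background JavaScript threads rather than the thread that called getPage().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical recorder, not part of HtmlUnit.
class RequestLog {
    // Synchronized so that record() is safe to call from several threads.
    private final List<String> urls = Collections.synchronizedList(new ArrayList<>());

    // Stores one request URL in arrival order.
    void record(String url) {
        urls.add(url);
    }

    // Returns a copy, so callers can iterate while recording continues.
    List<String> snapshot() {
        synchronized (urls) {
            return new ArrayList<>(urls);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RequestLog log = new RequestLog();
        log.record("file:///tmp/page.html");

        // Stand-in for a request fired from a background thread.
        Thread background = new Thread(() -> log.record("https://example.com/pixel.gif"));
        background.start();
        background.join();

        log.snapshot().forEach(System.out::println);
    }
}
```

Inside the wrapper, the call would be something like log.record(request.getUrl().toString()), with the RequestLog instance passed in through the constructor. After getPage() returns, snapshot() gives the URLs seen so far.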