Asked by: Tiffany | Asked: 3/14/2023 | Last edited by: Tiffany | Updated: 3/14/2023 | Views: 17
Using ParserDelegator to find the first occurrence of /wiki/Geographic_coordinate_system in input or its children
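Q:
For example, when the input is Linux, the program looks for the first instance of /wiki/Geographic_coordinate_system on that page. If it is not found there, it looks at the page's direct children to find it.
My expected output is:
Searching: Linux - Wikipedia, the free encyclopedia
Checking children:
Found in: Bell Labs - Wikipedia, the free encyclopedia
The second line is not being printed. I think the program is recursing through the children without ever returning to main. How can I get back to main before checking the children, so that execution goes through my if statement?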
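// Imports required by this snippet.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.net.URL;
import java.util.HashSet;
import java.util.Set;
import javax.swing.text.html.parser.ParserDelegator;

// The enclosing class and MyParserCallback (with hasFound() and visitedLinks)
// are not shown in the question.
public static void main(String[] args) throws Exception {
    String subject = args[0].replace(" ", "_");
    System.out.println("Searching: " + subject + " - Wikipedia");
    URL url = new URL("https://en.wikipedia.org/wiki/" + subject);
    try {
        // Read the whole start page into memory before parsing it.
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        StringBuilder sb = new StringBuilder();
        String input;
        while ((input = in.readLine()) != null) {
            sb.append(input);
        }
        in.close();
        ParserDelegator parser = new ParserDelegator();
        MyParserCallback callback = new MyParserCallback();
        parser.parse(new StringReader(sb.toString()), callback, true);
        // Only look at the children when the start page itself has no match.
        if (!callback.hasFound()) {
            System.out.println("Checking children:");
            Set<String> visitedLinks = new HashSet<>();
            visitedLinks.add("/wiki/" + subject);
            for (String href : callback.visitedLinks) {
                if (!visitedLinks.contains(href)) {
                    visitedLinks.add(href);
                    String childUrl = "https://en.wikipedia.org" + href;
                    BufferedReader childIn = new BufferedReader(new InputStreamReader(new URL(childUrl).openStream()));
                    StringBuilder childSb = new StringBuilder();
                    while ((input = childIn.readLine()) != null) {
                        childSb.append(input);
                    }
                    childIn.close();
                    ParserDelegator childParser = new ParserDelegator();
                    MyParserCallback childCallback = new MyParserCallback();
                    childParser.parse(new StringReader(childSb.toString()), childCallback, true);
                    if (childCallback.hasFound()) {
                        return; // stop at the first child that contains the target
                    }
                }
            }
        }
    } catch (IOException e) {
        // The question's code is cut off before this point; this handler is an assumption.
        System.err.println("Failed to read page: " + e.getMessage());
    }
}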
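Since MyParserCallback is not shown, the following is only a sketch of one way the two-level search could be structured so that control always returns to the caller before any child page is read. The callback here records links and sets a flag, and never fetches pages itself. All names (LinkCollector, LinkSearchSketch, scan, findPage, fetch) are hypothetical, and an in-memory map stands in for the Wikipedia requests so the example runs offline. It assumes the flag is per-instance state; if the real callback keeps it in a static field or fetches child pages inside handleStartTag, that could explain the missing output, but this cannot be confirmed from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical stand-in for the question's MyParserCallback, which is not shown.
class LinkCollector extends HTMLEditorKit.ParserCallback {
    static final String TARGET = "/wiki/Geographic_coordinate_system";
    private boolean found = false;                    // per-instance state, not static
    final Set<String> links = new LinkedHashSet<>();  // /wiki/ links in page order

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag != HTML.Tag.A) return;
        Object href = attrs.getAttribute(HTML.Attribute.HREF);
        if (!(href instanceof String)) return;
        String link = (String) href;
        if (link.equals(TARGET)) {
            found = true;                             // only record the hit; never fetch here
        } else if (link.startsWith("/wiki/")) {
            links.add(link);
        }
    }

    boolean hasFound() { return found; }
}

class LinkSearchSketch {
    // Parses one page's HTML and returns the filled callback; no recursion.
    static LinkCollector scan(String html) {
        LinkCollector collector = new LinkCollector();
        try {
            new ParserDelegator().parse(new StringReader(html == null ? "" : html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector;
    }

    // Returns the path of the page that links to the target: the start page,
    // one of its direct children, or null if neither level contains it.
    // fetch maps a path such as "/wiki/Linux" to that page's HTML (assumed helper).
    static String findPage(String start, Function<String, String> fetch) {
        LinkCollector parent = scan(fetch.apply(start));
        if (parent.hasFound()) return start;
        for (String child : parent.links) {
            if (child.equals(start)) continue;        // skip self-links
            if (scan(fetch.apply(child)).hasFound()) return child;
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up pages for illustration only; not real Wikipedia content.
        Map<String, String> pages = new HashMap<>();
        pages.put("/wiki/Linux", "<html><body><a href=\"/wiki/Unix\">Unix</a> "
                + "<a href=\"/wiki/Bell_Labs\">Bell Labs</a></body></html>");
        pages.put("/wiki/Unix", "<html><body><p>No coordinates here.</p></body></html>");
        pages.put("/wiki/Bell_Labs", "<html><body>"
                + "<a href=\"/wiki/Geographic_coordinate_system\">Coordinates</a></body></html>");
        System.out.println("Found in: " + findPage("/wiki/Linux", pages::get));
    }
}
```

Because findPage finishes scanning the parent before it touches any child, the caller can print "Checking children:" between the two phases. Replacing pages::get with a function that downloads the page would bring the sketch closer to the code in the question.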
A: No answers yet