Asked by: Pubg Mobile  Asked: 11/11/2023  Updated: 11/14/2023  Views: 74
Keep only outside of HTML tags by regex [duplicate]
Q:
I have a list like the following:
<td class="News"><a href="ubuntu">Ubuntu</a></td>
<td class="News" style="text-align: right" title="Yesterday: 2578">2571<img src="/web/20061130064026im_/http://distrowatch.com/images/other/adown.png" alt="<" title="Yesterday: 2578"></td>
<td class="News"><a href="suse">openSUSE</a></td>
<td class="News" style="text-align: right" title="Yesterday: 1943">1943<img src="/web/20061130064026im_/http://distrowatch.com/images/other/alevel.png" alt="=" title="Yesterday: 1943"></td>
<td class="News"><a href="fedora">Fedora</a></td>
<td class="News" style="text-align: right" title="Yesterday: 1420">1422<img src="/web/20061130064026im_/http://distrowatch.com/images/other/aup.png" alt=">" title="Yesterday: 1420"></td>
<td class="News"><a href="mepis">MEPIS</a></td>
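Now I want to keep only the text that sits outside the HTML tags (the >*****< parts), using a regex in Notepad++.
For example, from the list above only the following must remain, and everything else must be removed: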
Ubuntu
2571
openSUSE
1943
Fedora
1422
MEPIS
I tried the following regex, but it is not accurate and also keeps extra code:
>([^<>]+)<
Where is the problem with my regex?
A:
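Some of the sample rows carry an angle bracket inside an attribute value (alt="<", alt=">"), so any pattern that treats every < or > as a tag boundary can be misled. The sketch below is Python rather than a Notepad++ recipe, and it is only an illustration under assumptions: SAMPLE is two rows copied from the question, and TextCollector and visible_text are hypothetical helper names. It shows what the pattern from the question picks up when run over multi-line text, what a naive tag-stripping pattern leaves behind, and how the standard-library HTML parser (which understands quoted attribute values) extracts just the text.

```python
import re
from html.parser import HTMLParser

# Two rows copied from the question; the second one contains alt=">".
SAMPLE = (
    '<td class="News"><a href="fedora">Fedora</a></td>\n'
    '<td class="News" style="text-align: right" title="Yesterday: 1420">1422'
    '<img src="/web/20061130064026im_/http://distrowatch.com/images/other/aup.png"'
    ' alt=">" title="Yesterday: 1420"></td>\n'
)

# The pattern from the question, applied to the whole text at once.
# [^<>]+ also accepts the newline between "</td>" and the next "<td".
regex_hits = re.findall(r'>([^<>]+)<', SAMPLE)

# A naive "strip every tag" pattern ends a tag at the first ">", which
# here is the one inside alt=">", so attribute debris is left behind.
stripped = re.sub(r'<[^>]*>', '', SAMPLE)


class TextCollector(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping whitespace-only ones."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(text)


def visible_text(html):
    """Return the non-empty text nodes of an HTML fragment, in order."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


if __name__ == "__main__":
    print(regex_hits)            # ['Fedora', '\n', '1422']
    print(repr(stripped))        # 'Fedora\n1422" title="Yesterday: 1420">\n'
    print(visible_text(SAMPLE))  # ['Fedora', '1422']
```

On these two rows the question's pattern finds the wanted text plus the line break between rows, while the tag-stripping pattern is the one that trips over alt=">". Whether either effect is the "extra code" seen in Notepad++ depends on how the search and replace was set up, which the question does not show.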
Comments
alt="<"