Replace portion of url filepath found in HTML page text

Q:

Given some urls in an HTML page, I want to replace part of some of them, as follows:

Example url: https://example.com/cost-center/sub-one/article1. In that url, I want to replace the text between /cost-center/ and the last part of the url (article1) with another text (test).

This means the url above would be converted to: https://example.com/cost-center/test/article1

In my case there can be more parts after /cost-center/, and the url can end with a slash or be wrapped in quotes, as in the examples below:

https://example.com/cost-center/sub-one/sub-two/article-3/
https://example.com/cost-center/sub-one/sub-three/article-4
https://example.com/cost-center/sub-1/sub-two/sub-three/article-5/
'https://example.com/cost-center/sub-one/sub-two/article-3/'
'https://example.com/cost-center/sub-1/sub-two/sub-three/article-5'
"https://example.com/cost-center/sub-one/sub-three/article-4"
"https://example.com/cost-center/sub-1/sub-two/sub-three/article-5/"

These would be replaced as follows:

https://example.com/cost-center/test/article-3/
https://example.com/cost-center/test/article-4
https://example.com/cost-center/test/article-5/
'https://example.com/cost-center/test/article-3/'
'https://example.com/cost-center/test/article-5'
"https://example.com/cost-center/test/article-4"
"https://example.com/cost-center/test/article-5/"

For now, let's assume the url has additional parts after /cost-center/;

for example https://example.com/cost-center/sub-1/sub-two/sub-three/article-5/

So basically I want to replace part of it while keeping the last part.

I tried a number of regular expressions, for example:

preg_replace('~https://example.com/cost-center/[^/]+/([^/]+)~', 'https://example.com/cost-center/test/$1', $url);

preg_replace('/(["\']?)(https:\/\/[^\/]+\/)([^\/]+)(\/[^"\s]*)?/', '$1$2test$4$1', $url);

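I also tried splitting the url with explode and parsing it manually piece by piece, but the result was very complicated and ugly.

ChatGPT gave no good results either.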

php regex url path preg-replace

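I tried the version with href. It works for urls that end with a slash, but if the url ends with a quote instead of a slash it still does not match. Another example: 'href="https://example.com/cost-center/sub-one/sub-two/article-3" data-id="5"'. This matches all of article-3" data-id="5"', but it should match only up to the end of the url. Is there a way to handle both cases?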
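One possible way to cover both endings described in this comment, sketched under the assumption that no part of the url ever contains a slash, whitespace, or a quote character. The variable names and the sample HTML fragment are hypothetical; the pattern is an illustration and not taken from the thread. Excluding quotes from the character classes stops the match at the closing quote of the attribute, and a trailing slash, if present, is simply left outside the match.

```php
<?php
// Hypothetical HTML fragment modelled on the example in the comment.
$html = '<a href="https://example.com/cost-center/sub-one/sub-two/article-3" data-id="5">x</a>'
      . "\n"
      . "<a href='https://example.com/cost-center/sub-1/sub-two/sub-three/article-5/'>y</a>";

// Group 1: fixed prefix. Non-capturing group: one or more intermediate parts.
// Group 2: the last part, which may not contain slashes, whitespace, or quotes.
$pattern = '~(https://example\.com/cost-center/)(?:[^/\s"\']+/)+([^/\s"\']+)~';

$result = preg_replace($pattern, '$1test/$2', $html);
echo $result, PHP_EOL;
```

A url with only a single part after /cost-center/ is left unchanged, because the non-capturing group needs at least one intermediate part before the last one.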

0 votes mickmackusa 9/9/2023 #2

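Judging from your description of the task and the sample data, it doesn't actually matter whether or how the URL is wrapped in quotes. You only need to match the leading portion of the URL to validate that it is a URL, then isolate the unwanted substring and replace it.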

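Note that my replacement value is just the literal string test/, with no references to capture groups. This is because \K forgets/releases all characters matched up to that point, and (?= ... ) is a lookahead, which means it does not consume any of the characters it matches.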

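As for isolating the portion to be replaced, I use a negated character class containing a forward slash and whitespace, followed by a literal forward slash. That subpattern may repeat one or more times (because of the + quantifier).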
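To see what \K and the lookahead leave in the match, here is a minimal sketch that runs the answer's pattern through preg_match on one sample url from the question. The variable names are only for illustration.

```php
<?php
// One sample url from the question; variable names are illustrative.
$url = 'https://example.com/cost-center/sub-one/sub-two/article-3/';
$pattern = '#https://[^/]+/cost-center/\K([^/\s]+/)+(?=article)#';

// $m[0] holds only the text matched after \K and before the lookahead,
// which is exactly the substring that preg_replace would overwrite.
preg_match($pattern, $url, $m);
$matched = $m[0];
echo $matched, PHP_EOL; // sub-one/sub-two/
```

Everything before \K must still match for the pattern to succeed, but it is not part of the reported match, so it does not need to be repeated in the replacement.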

Code: (Demo)

echo preg_replace('#https://[^/]+/cost-center/\K([^/\s]+/)+(?=article)#', 'test/', $text);
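As a self-contained sketch, the same call can be run against a few of the sample urls from the question. The $text value below is assembled here for illustration; note that the pattern relies on the last part starting with "article", as it does in all of the sample data.

```php
<?php
// Sample input assembled from the urls listed in the question.
$text = <<<'TXT'
https://example.com/cost-center/sub-one/sub-two/article-3/
https://example.com/cost-center/sub-one/sub-three/article-4
'https://example.com/cost-center/sub-1/sub-two/sub-three/article-5'
"https://example.com/cost-center/sub-1/sub-two/sub-three/article-5/"
TXT;

// Replace every intermediate part between /cost-center/ and the last part.
$result = preg_replace('#https://[^/]+/cost-center/\K([^/\s]+/)+(?=article)#', 'test/', $text);
echo $result, PHP_EOL;
```

Each line comes back with the intermediate parts collapsed to test/, while the quotes and any trailing slash stay as they were.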
