Replace multiple patterns one at a time with Python

Asked by: FozenOption  Asked: 10/1/2022  Updated: 10/2/2022  Views: 109

Q:

What I'm trying to do is basically this: I have a list of URLs with multiple parameters, for example:

https://www.somesite.com/path/path2/path3?param1=value1&param2=value2

What I want to get is something like this:

https://www.somesite.com/path/path2/path3?param1=PAYLOAD&param2=value2
https://www.somesite.com/path/path2/path3?param1=value1&param2=PAYLOAD

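In other words, I want to loop over every parameter (basically every value between an "=" and the next "&") and replace one value at a time. Thanks in advance.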

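python loops replace split match

Comments

2 upvotes Barmar 10/1/2022
Start by parsing the URL with urllib and getting the list of parameters. Then you can loop over the list, replacing one value each time, and combine the parts back into a full URL.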
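
The steps in that comment can be sketched roughly as follows. This is only one possible reading of the suggestion, not code from the thread: the function name `inject_payload` and its `payload` argument are made up for illustration, and the sketch assumes it is acceptable for `urlencode` to percent-encode names and values when the query is rebuilt.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def inject_payload(url, payload="PAYLOAD"):
    """Yield one URL per query parameter, with only that parameter's value replaced.

    Hypothetical helper for illustration; not part of urllib.
    """
    parts = urlsplit(url)
    # keep_blank_values=True keeps parameters such as "debug=" in the list.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    for i, (name, _old_value) in enumerate(pairs):
        changed = pairs.copy()          # leave the original list untouched
        changed[i] = (name, payload)    # replace exactly one value
        yield parts._replace(query=urlencode(changed)).geturl()


url = "https://www.somesite.com/path/path2/path3?param1=value1&param2=value2"
for variant in inject_payload(url):
    print(variant)
```

Because the replacement is done by position in the parsed list, repeated parameter names or identical values should not interfere with each other. If the payload has to reach the server byte for byte, the re-encoding done by `urlencode` may not be what you want.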

A:

0 upvotes Eftal Gezer 10/1/2022 #1
from urllib.parse import urlparse

urls = ["https://www.somesite.com/path/path2/path3?param1=value1&param2=value2&param3=value3",
        "https://www.anothersite.com/path/path2/path3?param1=value1&param2=value2&param3=value3"]
newurls = []
for url in urls:
    parsed = urlparse(url)
    params = parsed.query.split("&")
    for i, param in enumerate(params):
        # Replace only the i-th parameter. Calling str.replace on the whole
        # query string could also change other parameters that happen to
        # contain the same text.
        name = param.split("=", 1)[0]
        changed = params[:i] + [name + "=PAYLOAD"] + params[i + 1:]
        # Rebuild the URL with the modified query string.
        newurls.append(parsed._replace(query="&".join(changed)).geturl())

newurls

['https://www.somesite.com/path/path2/path3?param1=PAYLOAD&param2=value2&param3=value3',
 'https://www.somesite.com/path/path2/path3?param1=value1&param2=PAYLOAD&param3=value3',
 'https://www.somesite.com/path/path2/path3?param1=value1&param2=value2&param3=PAYLOAD',
 'https://www.anothersite.com/path/path2/path3?param1=PAYLOAD&param2=value2&param3=value3',
 'https://www.anothersite.com/path/path2/path3?param1=value1&param2=PAYLOAD&param3=value3',
 'https://www.anothersite.com/path/path2/path3?param1=value1&param2=value2&param3=PAYLOAD']

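Comments

0 upvotes Eftal Gezer 10/1/2022
@FozenOption If the order matters, we can sort the parameters with a regular expression.
0 upvotes FozenOption 10/1/2022
For some reason this only handles two parameters at a time: if there is a param3, it gives me param1 and param2, or param1 and param3, or param2 and param3, and it leaves out the third parameter. What changes would you suggest?
0 upvotes FozenOption 10/2/2022 #2

I've solved it: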

from urllib.parse import urlparse

url = "https://github.com/search?p=2&q=user&type=Code&name=djalel"

parsed = urlparse(url)
params = parsed.query.split("&")
new_queries = []
# enumerate gives the position directly; list.index would return the first
# match and pick the wrong entry when two parameters are identical.
for i, param in enumerate(params):
    changed = params.copy()  # start from the original parameters each time
    changed[i] = param.split("=", 1)[0] + "=PAYLOAD"
    new_queries.append("&".join(changed))

for query in new_queries:
    print(f"{parsed.scheme}://{parsed.netloc}{parsed.path}?{query}")

Output:

https://github.com/search?p=PAYLOAD&q=user&type=Code&name=djalel
https://github.com/search?p=2&q=PAYLOAD&type=Code&name=djalel
https://github.com/search?p=2&q=user&type=PAYLOAD&name=djalel
https://github.com/search?p=2&q=user&type=Code&name=PAYLOAD
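
Since the question is phrased in terms of matches between "=" and "&", the same result can also be sketched with a regular expression that replaces one match at a time. This is an alternative illustration rather than anything posted in the thread: the names `VALUE` and `payload_variants` are invented here, and the sketch assumes the first "?" in the URL starts the query string.

```python
import re

# Matches the value that follows an "=" and runs up to the next "&".
VALUE = re.compile(r"(?<==)[^&]*")


def payload_variants(url, payload="PAYLOAD"):
    """Return one URL per query value, with only that value replaced.

    Hypothetical helper for illustration.
    """
    head, question_mark, rest = url.partition("?")
    # Set the fragment aside so that "#a=b" is never treated as a parameter.
    query, hash_mark, fragment = rest.partition("#")
    variants = []
    for match in VALUE.finditer(query):
        # Splice the payload into the span of this single match only.
        new_query = query[:match.start()] + payload + query[match.end():]
        variants.append(head + question_mark + new_query + hash_mark + fragment)
    return variants


for variant in payload_variants("https://github.com/search?p=2&q=user&type=Code&name=djalel"):
    print(variant)
```

Replacing by match position, rather than with `str.replace` or `re.sub` on the whole string, is what keeps each output URL down to a single changed value. Unlike the urllib-based parsing, this leaves the rest of the query string exactly as written.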