Asked by: Ashar  Asked: 9/6/2023  Last edited by: VLAZAshar  Updated: 9/6/2023  Views: 53
Find URL in web page content using PowerShell
Q:
Using PowerShell, I need to find the URL https://cdn.windwardstudios.com/Archive/23.X/23.3.0/JavaRESTfulEngine-23.3.0.32.zip in the content of https://www.windwardstudios.com/version/version-downloads.
In other words, I need https://<anything>/JavaRESTfulEngine<anything>.zip
First I tried the following pattern, which worked and gave me the required URL:
$regexPattern = 'https://cdn\.windwardstudios\.com/Archive/\d{2}\.X/\d+\.\d+\.\d+/JavaRESTfulEngine-.*?\.zip'
To generalize it further I tried the following pattern, but now it does not work:
$regexPattern = 'https://cdn\.windwardstudios\.com/Archive/([^/]+)/JavaRESTfulEngine-.*?\.zip'
Below is my PowerShell script.
# URL of the website to scrape
$websiteUrl = 'https://www.windwardstudios.com/version/version-downloads'
# Use Invoke-WebRequest to fetch the web page content
$response = Invoke-WebRequest -Uri $websiteUrl
# Check if the request was successful
if ($response.StatusCode -eq 200) {
    # Parse the HTML content to find the zip file URL using a regular expression
    $htmlContent = $response.Content
    $regexPattern = 'https://cdn\.windwardstudios\.com/Archive/([^/]+)/JavaRESTfulEngine-.*?\.zip'
    $zipFileUrls = [regex]::Matches($htmlContent, $regexPattern) | ForEach-Object { $_.Value }
    if ($zipFileUrls.Count -gt 0) {
        Write-Host "Found zip file URLs:"
        $zipFileUrls | ForEach-Object { Write-Host $_ }
    } else {
        Write-Host "Zip file URLs not found on the page."
    }
} else {
    Write-Host "Failed to fetch the web page. Status code: $($response.StatusCode)"
}
Output:
Zip file URLs not found on the page.
Expected output:
https://cdn.windwardstudios.com/Archive/23.X/23.3.0/JavaRESTfulEngine-23.3.0.32.zip
Can you suggest a fix?
A:
1 upvote
Wiktor Stribiżew
9/6/2023
#1
You can use
https://cdn\.windwardstudios\.com/Archive/(\S+?)/JavaRESTfulEngine-.*?\.zip
See the regex demo.
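As a minimal sketch of how the pattern above could be dropped into the asker's script, the snippet below runs it against a small HTML fragment instead of the live page. The fragment, including the `Other-23.3.0.32.zip` link, is a hypothetical stand-in for `$response.Content`; the real page markup may differ.

```powershell
# Hypothetical HTML fragment standing in for $response.Content.
# The second link (Other-...) is invented to show that it is not matched.
$htmlContent = @'
<a href="https://cdn.windwardstudios.com/Archive/23.X/23.3.0/JavaRESTfulEngine-23.3.0.32.zip">Java RESTful Engine</a>
<a href="https://cdn.windwardstudios.com/Archive/23.X/23.3.0/Other-23.3.0.32.zip">Other</a>
'@

$regexPattern = 'https://cdn\.windwardstudios\.com/Archive/(\S+?)/JavaRESTfulEngine-.*?\.zip'

# @(...) forces an array so .Count behaves the same for one or many matches
$zipFileUrls = @([regex]::Matches($htmlContent, $regexPattern) | ForEach-Object { $_.Value })

$zipFileUrls
```

Because `\S+?` stops at whitespace, a match cannot run past the end of a line. If several links could share one line with no whitespace between them, a stricter class such as `[^\s"]+?` may be a safer choice.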
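Details:
https://cdn\.windwardstudios\.com/Archive/ - the literal string https://cdn.windwardstudios.com/Archive/
(\S+?) - Group 1: one or more non-whitespace characters, as few as possible
/JavaRESTfulEngine- - the literal string /JavaRESTfulEngine-
.*? - zero or more characters other than line break characters, as few as possible
\.zip - the literal string .zip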
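The difference between the two patterns can be checked directly against the expected URL. The sketch below assumes nothing beyond that URL; the variable names `$narrow` and `$wide` are illustrative. `[^/]+` can only cover a single path segment, while the part between `Archive/` and `/JavaRESTfulEngine-` is `23.X/23.3.0`, which contains a slash.

```powershell
$url = 'https://cdn.windwardstudios.com/Archive/23.X/23.3.0/JavaRESTfulEngine-23.3.0.32.zip'

# Pattern from the question: [^/]+ cannot cross the slash inside 23.X/23.3.0
$narrow = 'https://cdn\.windwardstudios\.com/Archive/([^/]+)/JavaRESTfulEngine-.*?\.zip'
# Pattern from the answer: \S+? may span several path segments
$wide   = 'https://cdn\.windwardstudios\.com/Archive/(\S+?)/JavaRESTfulEngine-.*?\.zip'

$narrowMatches = $url -match $narrow
$narrowMatches          # False

$m = [regex]::Match($url, $wide)
$m.Success              # True
$m.Groups[1].Value      # 23.X/23.3.0
```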