Asked by: Muldoon | Asked: 11/4/2023 | Last edited by: litMuldoon | Updated: 11/4/2023 | Views: 28
PowerShell - Trim File Names Before and After Underscores Works For Small Sets of Files
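Q:
This is my first attempt at using PowerShell. I have a folder containing hundreds of files like the ones below.
- Timestamp#1_File_Name_of_the_Item_a650-d322205a2071.xls
- Timestamp#1_File_Name_of_the_Item_a650-d3b580442072.txt
- Timestamp#1_File_Name_of_the_Item_a650-d3bhnf5a2073.xlsx
- Timestamp#1_File_Name_of_the_Item_a650-d3b523da2074.csv
- Timestamp#2_File_Name_of_the_Item_a650-d3bbvx5a2075.xls
- File_Name_of_the_Item_a650-D3DDFE5A2075.pdf
My end goals are:
1. Trim everything before and including the first underscore
   a. Unless the name does not start with a datetime stamp
2. Trim everything after and including the last underscore, but keep the extension
3. Replace the remaining underscores with spaces
4. Understand why my existing code does not work for the following files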
20231025041306_LLLLL_Aaaaaaaaaa_7777d4cb-6666631F-FA38-473E-A650-D3564505A2075.xls
20231025041406_LLLLL_Aaaaaaaaaa_8777befd-87765C3E-3164-4800-B102-A82D48AAAA52.xlsx
20231025041436_LLLLL_Aaaaaaaaaa_73d2bbbc.PDF
20231025041518_LLLLL_Aaaaaaaaaa_210zzz2c.csv
20231025041613_LLLLL_Aaaaaaaaaa_aqqqq1ad.txt
20231025041906_cccc_dddddd_rrrrrr_a6fff0d3.xls
20231025041935_cccc_dddddd_rrrrrr_f37ggg89.pdf
20231025042000_cccc_dddddd_rrrrrr_9e812343.csv
20231025042026_cccc_dddddd_rrrrrr_d7522280.txt
20231025042229_LllllAaaaaaaa_OO_OoooTttt_37gggd7-5e81ffhgedc77-4c8e-9fbc-d2996ggg0df1.xls
20231025042254_LllllAaaaaaaa_OO_OoooTttt_4fjjjfrgb-e3ec7993-92d7-4ab8-ad9e-83ejjjjjj929b.xlsx
20231025042329_LllllAaaaaaaa_OO_OoooTttt_c0fkkkkf2.pdf
20231025042410_LllllAaaaaaaa_OO_OoooTttt_b555tefd7f.csv
20231025042505_LllllAaaaaaaa_OO_OoooTttt_9784g07e.txt
20231025042747_Ppppp_Rrrrr_Rrrrrr_2902e487-cc3c6chhhh074-4a2e-a97f-bfa0000a062e.xls
20231025042813_Ppppp_Rrrrr_Rrrrrr_aab84122-2fzzzz68-a706-49a5-a3ef-40030ffff0a3.xlsx
20231025042842_Ppppp_Rrrrr_Rrrrrr_79cdgggd2.PDF
20231025042923_Ppppp_Rrrrr_Rrrrrr_f07yyya8f.csv
20231025043220_Tttt_Dddddd_Rrrrrr_2444gr18d-13b4fb14-8fc2-45e0-b18b-59jkh6353d78.xlsx
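For some reason, if I run the code against all of the files, I get the results in the left column. However, if I remove any one file, I get the correct results in the right column.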
All at once | One file removed |
---|---|
[garbled].xls | LLLLL Aaaaaaaaaa.xls |
[garbled].xlsx | LLLLL Aaaaaaaaaa.xlsx |
[garbled].pdf | LLLLL Aaaaaaaaaa.pdf |
[garbled].csv | LLLLL Aaaaaaaaaa.csv |
[garbled].txt | LLLLL Aaaaaaaaaa.txt |
dddddd.xls | cccc dddddd rrrrrr.xls |
dddddd.pdf | cccc dddddd rrrrrr.pdf |
dddddd.csv | cccc dddddd rrrrrr.csv |
dddddd.txt | cccc dddddd rrrrrr.txt |
OO.csv | LllllAaaaaaaa OO OoooTttt.csv |
OO.pdf | LllllAaaaaaaa OO OoooTttt.pdf |
OO.txt | LllllAaaaaaaa OO OoooTttt.txt |
OO.xls | LllllAaaaaaaa OO OoooTttt.xls |
OO.xlsx | LllllAaaaaaaa OO OoooTttt.xlsx |
Rrrrr.csv | Ppppp Rrrrr Rrrrrr.csv |
Rrrrr.pdf | Ppppp Rrrrr Rrrrrr.pdf |
Rrrrr.xls | Ppppp Rrrrr Rrrrrr.xls |
Rrrrr.xlsx | Ppppp Rrrrr Rrrrrr.xlsx |
Tttt Dddddd Rrrrrr.xlsx | Tttt_Dddddd_Rrrrrr.xlsx |
#My Code that I hobbled together from other posts
$path = "c:\folder 1"
Get-ChildItem -Path $Path -File | ForEach-Object {
    $items = $_.BaseName -split "_"
    $newFileName = ($items[1..($items.Length - 2)] -join "_") + $_.Extension
    Rename-Item -Path $_.FullName -NewName $newFileName
}
Get-ChildItem -File $Path | Rename-Item -NewName { $_.Name -replace "_", " " }
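So far I can get most things working, except for the following:
- For some reason, for the files listed in this question, the code seems to remove more characters than it should. If any one file is removed, all of the files are renamed correctly.
- If there is no datetime stamp, do not remove any characters
- If the new name would duplicate an existing file, skip or rename the file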
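A:
This approach might work for you, though it is not clear what the valid format of your timestamp is. The regex in this example treats a name as timestamped when it starts with 14 digits followed by an underscore. The same filter excludes every file that does not match, so those files are not renamed.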
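To make the pattern concrete, here is a small sketch of how the lookbehind and lookahead behave on individual names. The two sample names are shortened stand-ins chosen for illustration, and the `else` branch (returning the name unchanged) is only there to show what "not renamed" means for a name without a timestamp.

```powershell
# Lookbehind: 14 digits plus an underscore at the start of the name.
# Greedy body: everything up to the LAST underscore (asserted by the lookahead).
$pattern = '(?<=^[0-9]{14}_).+(?=_)'

# Illustrative base names (no extension); not taken verbatim from a real folder.
$samples = @(
    '20231025041906_cccc_dddddd_rrrrrr_a6fff0d3'   # starts with a timestamp
    'File_Name_of_the_Item_a650-d3ddfe5a2075'      # no timestamp
)

$results = foreach ($name in $samples) {
    if ($name -match $pattern) {
        # $Matches[0] is the text between the first and the last underscore.
        $Matches[0].Replace('_', ' ')
    }
    else {
        # No 14-digit prefix, so the name is left as it is.
        $name
    }
}
$results
```

The first name yields `cccc dddddd rrrrrr`; the second has no 14-digit prefix, so the pattern does not match and the name comes back unchanged.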
A demo using the file names from the question:
$test = [System.IO.FileInfo[]] @(
'20231025041306_LLLLL_Aaaaaaaaaa_7777d4cb-6666631f-fa38-473e-a650-d3564505a2075.xls'
'20231025041406_LLLLL_Aaaaaaaaaa_8777befd-87765c3e-3164-4800-b102-a82d48aaaa52.xlsx'
'20231025041436_LLLLL_Aaaaaaaaaa_73d2bbbc.PDF'
'20231025041518_LLLLL_Aaaaaaaaaa_210zzz2c.csv'
'20231025041613_LLLLL_Aaaaaaaaaa_aqqqq1ad.txt'
'20231025041906_cccc_dddddd_rrrrrr_a6fff0d3.xls'
'20231025041935_cccc_dddddd_rrrrrr_f37ggg89.pdf'
'20231025042000_cccc_dddddd_rrrrrr_9e812343.csv'
'20231025042026_cccc_dddddd_rrrrrr_d7522280.txt'
'20231025042229_LllllAaaaaaaa_OO_OoooTttt_37gggd7-5e81ffhgedc77-4c8e-9fbc-d2996ggg0df1.xls'
'20231025042254_LllllAaaaaaaa_OO_OoooTttt_4fjjjfrgb-e3ec7993-92d7-4ab8-ad9e-83ejjjjj929b.xlsx'
'20231025042329_LllllAaaaaaaa_OO_OoooTttt_c0fkkkkf2.pdf'
'20231025042410_LllllAaaaaaaa_OO_OoooTttt_b555tefd7f.csv'
'20231025042505_LllllAaaaaaaa_OO_OoooTttt_9784g07e.txt'
'20231025042747_Ppppp_Rrrrr_Rrrrrr_2902e487-cc3c6chhhh074-4a2e-a97f-bfa0000a062e.xls'
'20231025042813_Ppppp_Rrrrr_Rrrrrr_aab84122-2fzzzz68-a706-49a5-a3ef-40030ffff0a3.xlsx'
'20231025042842_Ppppp_Rrrrr_Rrrrrr_79cdgggd2.PDF'
'20231025042923_Ppppp_Rrrrr_Rrrrrr_f07yyya8f.csv'
'20231025043220_Tttt_Dddddd_Rrrrrr_2444gr18d-13b4fb14-8fc2-45e0-b18b-59jkh6353d78.xlsx'
)
$test |
Where-Object BaseName -Match '(?<=^[0-9]{14}_).+(?=_)' |
ForEach-Object { $Matches[0].Replace('_', ' ') + $_.Extension }
This outputs:
LLLLL Aaaaaaaaaa.xls
LLLLL Aaaaaaaaaa.xlsx
LLLLL Aaaaaaaaaa.PDF
LLLLL Aaaaaaaaaa.csv
LLLLL Aaaaaaaaaa.txt
cccc dddddd rrrrrr.xls
cccc dddddd rrrrrr.pdf
cccc dddddd rrrrrr.csv
cccc dddddd rrrrrr.txt
LllllAaaaaaaa OO OoooTttt.xls
LllllAaaaaaaa OO OoooTttt.xlsx
LllllAaaaaaaa OO OoooTttt.pdf
LllllAaaaaaaa OO OoooTttt.csv
LllllAaaaaaaa OO OoooTttt.txt
Ppppp Rrrrr Rrrrrr.xls
Ppppp Rrrrr Rrrrrr.xlsx
Ppppp Rrrrr Rrrrrr.PDF
Ppppp Rrrrr Rrrrrr.csv
Tttt Dddddd Rrrrrr.xlsx
If this is what you are looking for, then the final code would be:
Get-ChildItem path\to\theFiles -File |
Where-Object BaseName -Match '(?<=^[0-9]{14}_).+(?=_)' |
Rename-Item -NewName { $Matches[0].Replace('_', ' ') + $_.Extension }
For details on the regex, see https://regex101.com/r/YMd0IS/1.
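The question also asks what to do when two files would end up with the same name, which the pipeline above does not handle: `Rename-Item` reports an error when the target name already exists. The following is a minimal sketch and not part of the original answer. `Get-UniqueName` is a hypothetical helper that appends ` (1)`, ` (2)`, and so on until the name is free, and the demo runs in a throwaway temp folder with made-up file names. Wrapping `Get-ChildItem` in parentheses collects the whole listing before the first rename, so the renames cannot feed back into the enumeration.

```powershell
$pattern = '(?<=^[0-9]{14}_).+(?=_)'

# Hypothetical helper: returns $Name if it is free in $Directory,
# otherwise "<base> (n)<ext>" with the first free n.
function Get-UniqueName {
    param(
        [string] $Directory,
        [string] $Name
    )
    $base = [System.IO.Path]::GetFileNameWithoutExtension($Name)
    $ext  = [System.IO.Path]::GetExtension($Name)
    $candidate = $Name
    $n = 1
    while (Test-Path -LiteralPath (Join-Path $Directory $candidate)) {
        $candidate = '{0} ({1}){2}' -f $base, $n, $ext
        $n++
    }
    $candidate
}

# Demo setup: a temp folder with two names that collide after trimming
# and one name without a timestamp. All three names are made up.
$folder = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $folder | Out-Null
'20231025041906_cccc_dddddd_rrrrrr_a6fff0d3.xls',
'20231025041907_cccc_dddddd_rrrrrr_b7aaa1e4.xls',
'File_Name_of_the_Item_a650-d3ddfe5a2075.pdf' |
    ForEach-Object { New-Item -ItemType File -Path (Join-Path $folder $_) | Out-Null }

# Parentheses: finish listing the folder before renaming anything.
(Get-ChildItem -LiteralPath $folder -File) | ForEach-Object {
    if ($_.BaseName -match $pattern) {
        $target = $Matches[0].Replace('_', ' ') + $_.Extension
        $_ | Rename-Item -NewName (Get-UniqueName -Directory $folder -Name $target)
    }
}

# Collect the resulting names, then remove the demo folder.
$renamed = @((Get-ChildItem -LiteralPath $folder -File).Name)
Remove-Item -LiteralPath $folder -Recurse
$renamed
```

To try this on real files, point `$folder` at the actual directory and drop the demo setup and the `Remove-Item` line. Adding `-WhatIf` to `Rename-Item` first previews the renames without changing anything on disk.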
Comments
yyyyMMddHHmmss