PowerShell - Trim File Names Before and After Underscores Works For Small Sets of Files

Asked by: Muldoon  Asked: 11/4/2023  Last edited by: litMuldoon  Updated: 11/4/2023  Views: 28

Q:

This is my first time trying PowerShell. I have a folder containing hundreds of files named like the ones below.

  • Timestamp#1_File_Name_of_the_Item_a650-d322205a2071.xls
  • Timestamp#1_File_Name_of_the_Item_a650-d3b580442072.txt
  • Timestamp#1_File_Name_of_the_Item_a650-d3bhnf5a2073.xlsx
  • Timestamp#1_File_Name_of_the_Item_a650-d3b523da2074.csv
  • Timestamp#2_File_Name_of_the_Item_a650-d3bbvx5a2075.xls
  • File_Name_of_the_Item_a650-D3DDFE5A2075.pdf

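My end goals are to:

  1. Trim everything before and including the first underscore

    a. unless the name does not start with a date-time stamp

  2. Trim everything after and including the last underscore, but keep the extension

  3. Replace the remaining underscores with spaces

  4. Understand why my existing code does not work for the following files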

20231025041306_LLLLL_Aaaaaaaaaa_7777d4cb-6666631F-FA38-473E-A650-D3564505A2075.xls
20231025041406_LLLLL_Aaaaaaaaaa_8777befd-87765C3E-3164-4800-B102-A82D48AAAA52.xlsx
20231025041436_LLLLL_Aaaaaaaaaa_73d2bbbc.PDF
20231025041518_LLLLL_Aaaaaaaaaa_210zzz2c.csv
20231025041613_LLLLL_Aaaaaaaaaa_aqqqq1ad.txt
20231025041906_cccc_dddddd_rrrrrr_a6fff0d3.xls
20231025041935_cccc_dddddd_rrrrrr_f37ggg89.pdf
20231025042000_cccc_dddddd_rrrrrr_9e812343.csv
20231025042026_cccc_dddddd_rrrrrr_d7522280.txt
20231025042229_LllllAaaaaaaa_OO_OoooTttt_37gggd7-5e81ffhgedc77-4c8e-9fbc-d2996ggg0df1.xls
20231025042254_LllllAaaaaaaa_OO_OoooTttt_4fjjjfrgb-e3ec7993-92d7-4ab8-ad9e-83ejjjjjj929b.xlsx
20231025042329_LllllAaaaaaaa_OO_OoooTttt_c0fkkkkf2.pdf
20231025042410_LllllAaaaaaaa_OO_OoooTttt_b555tefd7f.csv
20231025042505_LllllAaaaaaaa_OO_OoooTttt_9784g07e.txt
20231025042747_Ppppp_Rrrrr_Rrrrrr_2902e487-cc3c6chhhh074-4a2e-a97f-bfa0000a062e.xls
20231025042813_Ppppp_Rrrrr_Rrrrrr_aab84122-2fzzzz68-a706-49a5-a3ef-40030ffff0a3.xlsx
20231025042842_Ppppp_Rrrrr_Rrrrrr_79cdgggd2.PDF
20231025042923_Ppppp_Rrrrr_Rrrrrr_f07yyya8f.csv
20231025043220_Tttt_Dddddd_Rrrrrr_2444gr18d-13b4fb14-8fc2-45e0-b18b-59jkh6353d78.xlsx

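For some reason, if I run the code against all of the files, I get the results on the left. However, if I remove any one file, I get the correct results on the right.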

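All at once                 One file removed
LLLLL Aaaaaaaaaa.xls        LLLLL Aaaaaaaaaa.xls
LLLLL Aaaaaaaaaa.xlsx       LLLLL Aaaaaaaaaa.xlsx
LLLLL Aaaaaaaaaa.pdf        LLLLL Aaaaaaaaaa.pdf
LLLLL Aaaaaaaaaa.csv        LLLLL Aaaaaaaaaa.csv
LLLLL Aaaaaaaaaa.txt        LLLLL Aaaaaaaaaa.txt
dddddd.xls                  cccc dddddd rrrrrr.xls
dddddd.pdf                  cccc dddddd rrrrrr.pdf
dddddd.csv                  cccc dddddd rrrrrr.csv
dddddd.txt                  cccc dddddd rrrrrr.txt
OO.csv                      LllllAaaaaaaa OO OoooTttt.csv
OO.pdf                      LllllAaaaaaaa OO OoooTttt.pdf
OO.txt                      LllllAaaaaaaa OO OoooTttt.txt
OO.xls                      LllllAaaaaaaa OO OoooTttt.xls
OO.xlsx                     LllllAaaaaaaa OO OoooTttt.xlsx
Rrrrr.csv                   Ppppp Rrrrr Rrrrrr.csv
Rrrrr.pdf                   Ppppp Rrrrr Rrrrrr.pdf
Rrrrr.xls                   Ppppp Rrrrr Rrrrrr.xls
Rrrrr.xlsx                  Ppppp Rrrrr Rrrrrr.xlsx
Tttt Dddddd Rrrrrr.xlsx     Tttt_Dddddd_Rrrrrr.xlsx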

# My code that I cobbled together from other posts
$path = "c:\folder 1"

Get-ChildItem -Path $path -File | ForEach-Object {
    $items = $_.BaseName -split "_"
    $newFileName = ($items[1..($items.Length - 2)] -join "_") + $_.Extension
    Rename-Item -Path $_.FullName -NewName $newFileName
}
Get-ChildItem -File $path | Rename-Item -NewName { $_.Name -replace "_", " " }

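So far I can get most things working, except for the following:

  1. For some reason, with the files listed above, the code removes more characters than it should. If any one file is removed, all of the files are renamed correctly.
  2. If there is no date-time stamp, don't remove characters
  3. If the file is a duplicate, skip/rename the file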
Tags: PowerShell, special characters, file renaming, trimming

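Comments

0 upvotes Santiago Squarzon 11/4/2023
What is a valid timestamp? Only yyyyMMddHHmmss, or can it have different formats? If so, you need to clarify the valid formats.
0 upvotes Luuk 11/4/2023
"If the file is a duplicate, skip/rename the file": see "Check if a file exists in Windows PowerShell?"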

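A:

1 upvote Santiago Squarzon 11/4/2023 #1

This approach might work for you, although it is unclear what the valid format or formats for your timestamp are. The regex in this example treats any file name that starts with 14 digits followed by an underscore as timestamped. The filter also excludes every file that does not meet this condition, so those files are not renamed.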
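
To make the pattern's behavior concrete, here is a minimal sketch that applies it to two single base names. The variable names are illustrative only, and the two sample strings are modeled on names from the question rather than taken from a real folder. The lookbehind requires the 14 digits and underscore at the start, and the greedy `.+` followed by the lookahead means the match ends just before the last underscore.

```powershell
# Pattern from the answer: "14 digits + underscore" behind, an underscore ahead.
$pattern = '(?<=^[0-9]{14}_).+(?=_)'

# Sample base names (no extension), modeled on the question's files.
$withStamp    = '20231025041906_cccc_dddddd_rrrrrr_a6fff0d3'
$withoutStamp = 'File_Name_of_the_Item_a650-D3DDFE5A2075'

# -match returns $true/$false and, on success, fills the automatic $Matches table.
$stampMatched = $withStamp -match $pattern
$kept = $Matches[0]          # the part between the timestamp and the last underscore

# A failed -match does not reset $Matches, which is why $kept was read first.
$plainMatched = $withoutStamp -match $pattern

"$stampMatched : $kept"      # -> True : cccc_dddddd_rrrrrr
"$plainMatched"              # -> False (no leading timestamp, so the file is filtered out)
```

Because the second name fails the match, a file without a timestamp is left alone entirely, including its trailing identifier.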

Demo using the file names in question:

$test = [System.IO.FileInfo[]] @(
    '20231025041306_LLLLL_Aaaaaaaaaa_7777d4cb-6666631f-fa38-473e-a650-d3564505a2075.xls'
    '20231025041406_LLLLL_Aaaaaaaaaa_8777befd-87765c3e-3164-4800-b102-a82d48aaaa52.xlsx'
    '20231025041436_LLLLL_Aaaaaaaaaa_73d2bbbc.PDF'
    '20231025041518_LLLLL_Aaaaaaaaaa_210zzz2c.csv'
    '20231025041613_LLLLL_Aaaaaaaaaa_aqqqq1ad.txt'
    '20231025041906_cccc_dddddd_rrrrrr_a6fff0d3.xls'
    '20231025041935_cccc_dddddd_rrrrrr_f37ggg89.pdf'
    '20231025042000_cccc_dddddd_rrrrrr_9e812343.csv'
    '20231025042026_cccc_dddddd_rrrrrr_d7522280.txt'
    '20231025042229_LllllAaaaaaaa_OO_OoooTttt_37gggd7-5e81ffhgedc77-4c8e-9fbc-d2996ggg0df1.xls'
    '20231025042254_LllllAaaaaaaa_OO_OoooTttt_4fjjjfrgb-e3ec7993-92d7-4ab8-ad9e-83ejjjjj929b.xlsx'
    '20231025042329_LllllAaaaaaaa_OO_OoooTttt_c0fkkkkf2.pdf'
    '20231025042410_LllllAaaaaaaa_OO_OoooTttt_b555tefd7f.csv'
    '20231025042505_LllllAaaaaaaa_OO_OoooTttt_9784g07e.txt'
    '20231025042747_Ppppp_Rrrrr_Rrrrrr_2902e487-cc3c6chhhh074-4a2e-a97f-bfa0000a062e.xls'
    '20231025042813_Ppppp_Rrrrr_Rrrrrr_aab84122-2fzzzz68-a706-49a5-a3ef-40030ffff0a3.xlsx'
    '20231025042842_Ppppp_Rrrrr_Rrrrrr_79cdgggd2.PDF'
    '20231025042923_Ppppp_Rrrrr_Rrrrrr_f07yyya8f.csv'
    '20231025043220_Tttt_Dddddd_Rrrrrr_2444gr18d-13b4fb14-8fc2-45e0-b18b-59jkh6353d78.xlsx'
)

$test |
    Where-Object BaseName -Match '(?<=^[0-9]{14}_).+(?=_)' |
    ForEach-Object { $Matches[0].Replace('_', ' ') + $_.Extension }

This outputs:

LLLLL Aaaaaaaaaa.xls
LLLLL Aaaaaaaaaa.xlsx
LLLLL Aaaaaaaaaa.PDF
LLLLL Aaaaaaaaaa.csv
LLLLL Aaaaaaaaaa.txt
cccc dddddd rrrrrr.xls
cccc dddddd rrrrrr.pdf
cccc dddddd rrrrrr.csv
cccc dddddd rrrrrr.txt
LllllAaaaaaaa OO OoooTttt.xls
LllllAaaaaaaa OO OoooTttt.xlsx
LllllAaaaaaaa OO OoooTttt.pdf
LllllAaaaaaaa OO OoooTttt.csv
LllllAaaaaaaa OO OoooTttt.txt
Ppppp Rrrrr Rrrrrr.xls
Ppppp Rrrrr Rrrrrr.xlsx
Ppppp Rrrrr Rrrrrr.PDF
Ppppp Rrrrr Rrrrrr.csv
Tttt Dddddd Rrrrrr.xlsx

If this is what you were looking for, then the final code would be:

Get-ChildItem path\to\theFiles -File |
    Where-Object BaseName -Match '(?<=^[0-9]{14}_).+(?=_)' |
    Rename-Item -NewName { $Matches[0].Replace('_', ' ') + $_.Extension }

For details on the regex, see also https://regex101.com/r/YMd0IS/1
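
Two points from the question are not covered by the code above: files without a timestamp should still lose their trailing identifier, and a rename should be skipped when the target name already exists (the `Test-Path` idea from the comments). The following is only a sketch under stated assumptions, not part of the original answer. `Get-CleanName` and `Rename-CleanedFile` are hypothetical names, a timestamp is assumed to be exactly 14 digits, and everything after the last underscore is assumed to be the identifier to drop.

The left-hand results in the question also look like what the question's split logic produces when run twice on the same file (for example, `cccc_dddddd_rrrrrr` trimmed again gives `dddddd`). That would fit renamed files being picked up again while `Get-ChildItem` is still enumerating, which is an inference and is not confirmed in the thread. The sketch therefore collects the full listing before renaming anything.

```powershell
# Hypothetical helper: computes the cleaned name without touching the disk.
function Get-CleanName {
    param(
        [Parameter(Mandatory)] [string] $BaseName,
        [string] $Extension = ''
    )
    # Drop a leading 14-digit timestamp plus underscore, if there is one.
    $core = $BaseName -replace '^[0-9]{14}_', ''
    # Drop the last underscore and everything after it (assumed to be the ID).
    $core = $core -replace '_[^_]*$', ''
    # Remaining underscores become spaces; the extension is kept unchanged.
    $core.Replace('_', ' ') + $Extension
}

# Hypothetical wrapper: renames files in one folder, skipping name collisions.
function Rename-CleanedFile {
    param([Parameter(Mandatory)] [string] $Path)

    # @(...) finishes the directory listing before the first rename happens.
    foreach ($file in @(Get-ChildItem -LiteralPath $Path -File)) {
        $newName = Get-CleanName -BaseName $file.BaseName -Extension $file.Extension
        if ($newName -eq $file.Name) { continue }   # nothing to change

        $target = Join-Path $file.DirectoryName $newName
        if (Test-Path -LiteralPath $target) {
            Write-Warning "Skipping '$($file.Name)': '$newName' already exists."
            continue
        }
        Rename-Item -LiteralPath $file.FullName -NewName $newName
    }
}

# Usage, with the folder from the question:
# Rename-CleanedFile -Path 'c:\folder 1'
```

One limitation of this sketch: a name with no timestamp and no trailing identifier, such as `File_Name.pdf`, would lose its last segment, so the folder contents should be checked against the assumptions first. Running the function a second time is harmless, because cleaned names no longer contain underscores.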

Comments

1 upvote Muldoon 11/6/2023
Thank you! This definitely took me down a different path from the one I was originally on! :)