Asked by: Ne Mo  Asked: 11/8/2023  Last edited by: Ne Mo  Updated: 11/9/2023  Views: 60
Using PowerShell, how do I filter a JSON to exclude certain key names?
Q:
I'm trying to reduce the size of a JSON file that is 700MB. Here is a slightly smaller version: https://kaikki.org/dictionary/All%20languages%20combined/by-pos-name/kaikki_dot_org-dictionary-all-by-pos-name.json
I'm doing this by removing unnecessary information. The keys I don't need are: hypernyms, pos, categories, alt_of, inflection_templates, hyponyms, meronyms, source, wikipedia, holonyms, proverbs, head_templates, etymology_text, lang_code, hyphenation, forms, synonyms, antonyms.
I tried
$Obj =
[System.IO.File]::ReadLines((Convert-Path -LiteralPath namesonly.json)) |
ConvertFrom-Json
$foo = Select-Object $Obj -ExcludeProperty hypernyms,pos,categories,alt_of,inflection_templates,hyponyms,meronyms,source,wikipedia,holonyms,proverbs,head_templates,etymology_text,lang_code,hyphenation,forms,synonyms,antonyms
$foo | ConvertTo-Json -Depth 100 > namesonlycleaned.json
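But this results in an empty file. How can I fix it so that I get a new JSON without those unnecessary fields?
Edit: a comment suggested adding an asterisk. If I did that right,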
$Obj =
[System.IO.File]::ReadLines((Convert-Path -LiteralPath namesonly.json)) |
ConvertFrom-Json
$foo = Select-Object $Obj * -ExcludeProperty hypernyms,pos,categories,alt_of,inflection_templates,hyponyms,meronyms,source,wikipedia,holonyms,proverbs,head_templates,etymology_text,lang_code,hyphenation,forms,synonyms,antonyms
$foo | ConvertTo-Json -Depth 100 > namesonlycleaned.json
returns the error
A positional parameter cannot be found that accepts argument '*'.
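A:
Your immediate problem is the one pointed out by Mathias R. Jessen:
- Unfortunately, in Windows PowerShell, using -ExcludeProperty on its own with Select-Object doesn't work as expected (it outputs empty objects); it must be combined with -Property *. This problem has been fixed in PowerShell (Core) 7+.
- The input objects must be provided to Select-Object via the pipeline: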
$Obj | Select-Object -Property * -ExcludeProperty hypernyms,pos,categories,alt_of,inflection_templates,hyponyms,meronyms,source,wikipedia,holonyms,proverbs,head_templates,etymology_text,lang_code,hyphenation,forms,synonyms,antonyms
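As a quick check of the corrected call, here is a minimal sketch on a tiny in-memory object. The sample JSON and the variable names $sample and $trimmed are made up for illustration and are not taken from the linked data set.

```powershell
# Hypothetical one-entry sample standing in for a parsed JSON object.
$sample = '{"word":"Alice","pos":"name","lang_code":"en"}' | ConvertFrom-Json

# Input arrives via the pipeline; -Property * is combined with -ExcludeProperty.
$trimmed = $sample | Select-Object -Property * -ExcludeProperty pos, lang_code

# Serialize the result; only the 'word' property should remain.
$trimmed | ConvertTo-Json -Compress
```

Passing $Obj as a positional argument instead would bind it to -Property rather than treating it as input, which is why the pipeline form is needed.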
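However, this alone will not solve your problem:
Judging by the linked data source and the array of properties you're trying to exclude, some of these properties belong to nested objects, i.e. you want to remove the properties from the entire object graph of each input object.
Select-Object does not support this, but the custom Remove-Property function (source code at the bottom) does.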
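The limitation can be seen in a small sketch. The sample entry below, with a nested 'sense' object, is hypothetical and only loosely modeled on the kind of data described in the question.

```powershell
# Hypothetical entry: 'categories' appears at the top level AND inside a nested object.
$entry = '{"word":"Alice","categories":["top"],"sense":{"gloss":"A name","categories":["nested"]}}' |
  ConvertFrom-Json

# Select-Object only looks at the top-level properties of each input object.
$result = $entry | Select-Object -Property * -ExcludeProperty categories

# The top-level 'categories' is gone, but the nested one is still present.
$result | ConvertTo-Json -Compress -Depth 5
```

The nested 'categories' survives because Select-Object never descends into property values.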
Use the following command (make sure you have first defined the Remove-Property function from the bottom):
[System.IO.File]::ReadLines((Convert-Path -LiteralPath large.json)) |
ConvertFrom-Json |
Remove-Property -Recurse -Property hypernyms,pos,categories,alt_of,inflection_templates,hyponyms,meronyms,source,wikipedia,holonyms,proverbs,head_templates,etymology_text,lang_code,hyphenation,forms,synonyms,antonyms |
ConvertTo-Json -Compress -Depth 100 > namesonlycleaned.json
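Note:
This will run for quite a while, but by using a single pipeline it avoids unnecessary memory use from intermediate storage of results.
- That said, ConvertFrom-Json (as of at least PowerShell 7.4) reads all input up front before producing output; however, in terms of runtime performance, this part completes fairly quickly.
For troubleshooting (e.g. to limit the output to the first 10 objects), you can insert Select-Object -First 10 as a pipeline segment before the ConvertTo-Json segment.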
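To illustrate where that troubleshooting segment goes, here is a minimal sketch that uses generated in-memory lines instead of a real file. The $lines and $preview names and the sample content are assumptions for illustration; in the actual pipeline the segment would sit between Remove-Property and ConvertTo-Json.

```powershell
# Hypothetical stand-in for the lines of a large file: one JSON object per line.
$lines = 1..1000 | ForEach-Object { '{"word":"w' + $_ + '","pos":"name"}' }

# Select-Object -First 10 is placed directly before ConvertTo-Json,
# so only the first 10 parsed objects are serialized.
$preview = $lines |
  ConvertFrom-Json |
  Select-Object -First 10 |
  ConvertTo-Json -Compress -Depth 100

$preview
```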
Remove-Property source code:
function Remove-Property {
  <#
  .SYNOPSIS
  Removes properties from [pscustomobject] or dictionary objects (hashtables)
  and outputs the resulting objects.
  .DESCRIPTION
  Use -Recurse to remove the specified properties / entries from
  the entire object *graph* of each input object, i.e. also from any *nested*
  [pscustomobject]s or dictionaries.
  Useful for removing unwanted properties / entries from object graphs parsed
  from JSON via ConvertFrom-Json.
  Attempts to remove non-existent properties / entries are quietly ignored.
  .EXAMPLE
  [pscustomobject] @{ foo=1; bar=2 } | Remove-Property foo
  Removes the 'foo' property from the given custom object and outputs the result.
  .EXAMPLE
  @{ foo=1; bar=@{foo=10; baz=2} } | Remove-Property foo -Recurse
  Removes 'foo' properties (entries) from the entire object graph, i.e. from
  the top-level hashtable as well as from any nested hashtables.
  #>
  param(
    [Parameter(Mandatory, Position = 0)] [string[]] $Property,
    [switch] $Recurse,
    [Parameter(Mandatory, ValueFromPipeline)] [object] $InputObject
  )
  process {
    if (-not (($isPsCustObj = $InputObject -is [System.Management.Automation.PSCustomObject]) -or $InputObject -is [System.Collections.IDictionary])) { Write-Error "Neither a [pscustomobject] nor an [IDictionary] instance: $InputObject"; return }
    # Remove the requested properties from the input object itself.
    foreach ($propName in $Property) {
      # Note: In both cases, if a property / entry by a given name doesn't exist, the .Remove() call is a quiet no-op.
      if ($isPsCustObj) {
        $InputObject.psobject.Properties.Remove($propName)
      }
      else {
        # IDictionary
        $InputObject.Remove($propName)
      }
    }
    # Recurse, if requested.
    if ($Recurse) {
      if ($isPsCustObj) {
        foreach ($prop in $InputObject.psobject.Properties) {
          if ($prop.Value -is [System.Management.Automation.PSCustomObject] -or $prop.Value -is [System.Collections.IDictionary]) {
            $prop.Value = Remove-Property -InputObject $prop.Value -Recurse -Property $Property
          }
        }
      }
      else {
        # IDictionary
        foreach ($entry in $InputObject.GetEnumerator()) {
          if ($entry.Value -is [System.Management.Automation.PSCustomObject] -or $entry.Value -is [System.Collections.IDictionary]) {
            $entry.Value = Remove-Property -InputObject $entry.Value -Recurse -Property $Property
          }
        }
      }
    }
    $InputObject # Output the potentially modified input object.
  }
}
Comments
Change $foo = Select-Object $Obj -ExcludeProperty ... to $foo = $Obj |Select-Object * -ExcludeProperty ...
Not Select-Object $Obj * -ExcludeProperty, but $Obj |Select-Object * -ExcludeProperty