Asked by: sirducas Asked: 6/13/2022 Updated: 6/15/2022 Views: 498
How to count the number of fields for a single column in a row separated by 2 field separators (":" and ",")?
Q:
Given the text file:
(structured as: "group_name:PW:group_id:User1<,User2>...")
adm:x:4:syslog,adm1
admins:x:1006:adm2,adm12,manuel
ssl-cert:x:122:postgres
ala2:x:1009:aceto,salvemini
conda:x:1011:giovannelli,galise,aceto,caputo,haymele,salvemini,scala,adm2,adm12
adm1Group:x:1022:adm2,adm1,adm3
docker:x:998:manuel
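How can I count the number of users on each line? Or on a single line?
For example, if I want to know how many users "adm1Group" contains, the output should be 3, because adm1Group has three users (adm2, adm1 and adm3). Another example is the first line (group name "adm"), which contains two users: syslog and adm1.
The main problem is that there are two field separators here, so how can I split column $4 within the same AWK command? I have this solution, but it uses two different awk commands chained with a pipe, as follows (I don't know whether this is correct or "legal" for the kernel):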
awk -F: '/adm1Group/ {print $4}' file.txt | awk -F, 'BEGIN {printf "N. of users in adm1Group = "} {print NF}'
Can I implement such a solution in a single AWK command? If not, can I use this one? Or is this solution "bad practice"?
A:
You can use split for this:
awk -F: '$1 == "adm1Group" {print split($NF, a, /,/)}' file
3
awk -F: '$1 == "conda" {print split($NF, a, /,/)}' file
9
Or print them all together:
awk -F: '{print split($NF, a, /,/), "users in group:", $1}' file
2 users in group: adm
3 users in group: admins
1 users in group: ssl-cert
2 users in group: ala2
9 users in group: conda
3 users in group: adm1Group
1 users in group: docker
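As a small self-contained sketch of the split approach: split returns the number of elements it stored in the array, and 0 when the string is empty, so a group without members is reported as 0. The sample lines and the empty group "nogroup" are made up for illustration and are not part of the question's file.

```shell
# Hypothetical sample data; "nogroup" has an empty member list.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
adm:x:4:syslog,adm1
ssl-cert:x:122:postgres
nogroup:x:65534:
EOF

# split() returns the number of pieces stored in array a (0 for an empty string).
counts=$(awk -F: '{ print $1, split($4, a, ",") }' "$tmp")
printf '%s\n' "$counts"
rm -f "$tmp"
```

Using $4 instead of $NF here assumes every line has exactly four colon-separated fields, as in the question.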
With your shown samples and attempts, please try the following awk code. It prints the total number of users present in each group name of Input_file.
awk -F':' '
{
num=0
arr1[$1]=num=split($NF,arr2,",")
}
END{
for(i in arr1){
print "Group " i " has " arr1[i] " users."
}
}
' Input_file
Explanation: a detailed explanation of the above code.
awk -F':' ' ##Starting awk program where setting field separator as : here.
{
num=0 ##Setting num as 0 here.
arr1[$1]=num=split($NF,arr2,",") ##Creating arr1 array indexed by $1 whose value is num, the number of elements split() stores in arr2 using , as the delimiter.
}
END{ ##Starting END block of this program from here.
for(i in arr1){ ##Traversing through arr1 here.
print "Group " i " has " arr1[i] " users." ##printing group name and its value(how many times users came for that group).
}
}
' Input_file ##Mentioning Input_file name here.
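One caveat with the END loop above: the order in which `for (i in arr1)` visits the indices is unspecified in awk, so the groups may not come out in file order. A minimal sketch that keeps the input order, assuming that matters to you; the array names count, order and members and the three-line sample are illustrative, not from the answer.

```shell
# Hypothetical three-line sample taken from the question's data.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
adm:x:4:syslog,adm1
admins:x:1006:adm2,adm12,manuel
docker:x:998:manuel
EOF

report=$(awk -F: '
{
    if (!($1 in count)) order[++n] = $1   # remember first-seen order of group names
    count[$1] = split($NF, members, ",")  # number of users in the last field
}
END {
    for (i = 1; i <= n; i++)              # walk groups in input order
        print "Group " order[i] " has " count[order[i]] " users."
}' "$tmp")
printf '%s\n' "$report"
rm -f "$tmp"
```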
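"How can I count the number of users on each line? Or on a single line?"
I would use GNU AWK to count the number of , characters inside the 4th field and then increase that count by 1. Let the content of file.txt be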
adm:x:4:syslog,adm1
admins:x:1006:adm2,adm12,manuel
ssl-cert:x:122:postgres
ala2:x:1009:aceto,salvemini
conda:x:1011:giovannelli,galise,aceto,caputo,haymele,salvemini,scala,adm2,adm12
adm1Group:x:1022:adm2,adm1,adm3
docker:x:998:manuel
then
awk 'BEGIN{FS=":"}{printf "N of users in %s is %s\n", $1, gsub(/,/,"",$4)+1}' file.txt
gives the output
N of users in adm is 2
N of users in admins is 3
N of users in ssl-cert is 1
N of users in ala2 is 2
N of users in conda is 9
N of users in adm1Group is 3
N of users in docker is 1
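Explanation: I tell GNU AWK that the field separator (FS) is :. For each line I use printf, which works like filling in a template and printing it. To fill it I use the 1st field ($1) and the number of replacements made when the gsub function replaces , with the empty string ("") in the 4th field ($4), increased by 1 (because the last name has no trailing ,). Note that this does modify $4 (the , characters are deleted), but for this task that side effect does not matter. Note also that with printf you must supply the newline (\n) explicitly, unlike with print.
(tested in gawk 4.2.1)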
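If the side effect on $4 does matter, for example because the original line is printed afterwards, one option is to run gsub on a copy. This is only a sketch: the variable name members and the single sample line are chosen for illustration.

```shell
line='admins:x:1006:adm2,adm12,manuel'
result=$(printf '%s\n' "$line" | awk -F: '{
    members = $4                       # work on a copy so $4 and $0 stay unchanged
    n = gsub(/,/, "", members) + 1     # commas removed from the copy, plus one
    print n, $0
}')
printf '%s\n' "$result"
```

Because $4 is never assigned, $0 is not rebuilt and the line prints with its original separators.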
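From the comments: the gsub-based command reports 1 even when $4 is empty (a group with no users). A variant that prints 0 in that case:
{n = ($4 == "" ? 0 : gsub(/,/, "", $4) + 1); printf "N of users in %s is %s\n", $1, n}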
Use either : or , as the field separator, then print the number of fields minus the 3 leading fields:
awk -F'[:,]' '{print $1, NF - 3}' file
awk -F'[:,]' -v group=conda '$1 == group {print NF - 3}' file
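One edge case worth checking with this approach: a line whose member list is empty still has four fields, so NF - 3 yields 1. A possible guard, sketched on two sample lines; "nogroup" is a made-up entry, not part of the question's file.

```shell
# The second line is a hypothetical group with no members.
counts=$(printf '%s\n' 'adm:x:4:syslog,adm1' 'nogroup:x:65534:' |
    awk -F'[:,]' '{ print $1, ($4 == "" ? 0 : NF - 3) }')
printf '%s\n' "$counts"
```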
{m,g}awk '$!NF=sprintf("%20s\47s user(s) count = %\0478.f",$!_,NF-!_)' FS=':.+:|,'
adm's user(s) count = 2
admins's user(s) count = 3
ssl-cert's user(s) count = 1
ala2's user(s) count = 2
conda's user(s) count = 9
adm1Group's user(s) count = 3
docker's user(s) count = 1
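With a slight modification, the full user list is now appended at the tail. Specifically, the assignment target changes: it now overwrites $1 instead of $0: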
{m,g}awk '$!_ = sprintf("%15s\47s user(s) count = %\0476.f",$!_,NF-!_)' FS=':.+:|,'
adm's user(s) count = 2 syslog adm1
admins's user(s) count = 3 adm2 adm12 manuel
ssl-cert's user(s) count = 1 postgres
ala2's user(s) count = 2 aceto salvemini
conda's user(s) count = 9 giovannelli galise aceto caputo haymele salvemini scala adm2 adm12
adm1Group's user(s) count = 3 adm2 adm1 adm3
docker's user(s) count = 1 manuel
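For readers unfamiliar with the idioms used here: `_` is an unset variable, so `!_` evaluates to 1 and `$!_` is `$1`, while `!NF` is 0 on any non-empty line, so `$!NF` is `$0`; `\47` is the octal escape for a single quote. The FS regex `:.+:|,` treats everything from the first to the last colon as one separator, leaving the group name in `$1` and one user per remaining field. A plainer sketch of the second variant, with the padding and the `'` number flag left out for simplicity:

```shell
out=$(printf '%s\n' 'adm:x:4:syslog,adm1' 'docker:x:998:manuel' |
    awk -F':.+:|,' '{
        # $1 is the group name; every remaining field is one user.
        $1 = sprintf("%s user(s) count = %d", $1, NF - 1)
        print    # assigning $1 rebuilds $0 with fields joined by OFS (a space)
    }')
printf '%s\n' "$out"
```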