bash + how to capture a word from a long output

Question:

I get the following output from the command below

zookeeper-shell.sh 19.2.6.4  get /brokers/ids/1010

The output is

Connecting to 19.2.6.4

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://kafka1.lulu.com:6667"],"rack":"/default-rack","jmx_port":9997,"port":6667,"host":"kafka1.lulu.com","version":4,"timestamp":"1630507307906"}

The main goal is to capture the machine name kafka1 from the output above.

I managed to do it with the following long command:

zookeeper-shell.sh 19.2.6.4  get /brokers/ids/1010 | sed s'/\/\// /g' | sed s'/:/ /g' | sed s'/,/ /g' | sed s'/"/ /g' | sed s'/\./ /g'| awk '{for (i=1;i<=NF;i++) print $i}' | grep -i kafka | sort | uniq

The result is (as expected):

kafka1

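The problem is that I am unhappy with my approach: it is too long and not very elegant.

Can we get a better suggestion than my syntax (using an awk/sed/perl one-liner)?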

The solution does not need to be tied to a particular regular-expression flavor.

awk '                                         ##Starting awk program from here.
/WatchedEvent state/{                         ##Checking condition if line contains WatchedEvent state
  found=1                                     ##Then set found to 1 here.
  next                                        ##next will skip all further statements from here.
}
found && match($0,/"PLAINTEXT:\/\/[^.]*/){    ##If found is set, match "PLAINTEXT:// followed by everything up to the first dot.
  print substr($0,RSTART+13,RLENGTH-13)       ##Print the match without its 13-character "PLAINTEXT:// prefix, i.e. the short host name kafka1.
}
'
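
As a rough alternative to the awk program above, the short host name can also be pulled out with a single sed call that keys on the "host" field rather than on the endpoint URL. This is only a sketch: the function name short_host and the variable sample_output are hypothetical, and the sample text is copied from the question because no ZooKeeper server is assumed to be available.

```shell
# Sample text copied from the question; it stands in for the real
# zookeeper-shell.sh output.
sample_output='Connecting to 19.2.6.4

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},"endpoints":["PLAINTEXT://kafka1.lulu.com:6667"],"rack":"/default-rack","jmx_port":9997,"port":6667,"host":"kafka1.lulu.com","version":4,"timestamp":"1630507307906"}'

# short_host (hypothetical helper): read text on stdin and print the part of
# the "host" value that precedes the first dot. Lines without a "host" field
# are dropped because -n suppresses default output and p prints only on a match.
short_host() {
  sed -n 's/.*"host":"\([^".]*\).*/\1/p'
}

printf '%s\n' "$sample_output" | short_host
```

With the sample above this should print kafka1. Against a live server the call would presumably look like zookeeper-shell.sh 19.2.6.4 get /brokers/ids/1010 | short_host.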

4 upvotes Ed Morton 9/3/2021 #2

The text you want to parse is JSON, so use a JSON-aware tool such as jq for most of the work. For example, since I don't have the command you use to produce the output, I am using cat file instead:

$ cat file | jq -Rr 'fromjson? | .endpoints[]'
PLAINTEXT://kafka1.lulu.com:6667

$ cat file | jq -Rr 'fromjson? | .endpoints[]' | awk -F'[/.]' '{print $3}'
kafka1

2 upvotes Ronaldo Ferreira de Lima 9/3/2021 #3

With perl, you can do the following:

$zookeeper_command | perl -MJSON::PP=decode_json -wnE'/^\{"/ or next; $j = decode_json($_); ($s) = (split /\./, $j->{host})[0]; say $s'

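Explaining the command in detail:

-MJSON::PP=decode_json => imports decode_json from the JSON::PP module (it is a core module).
/^\{"/ or next; => skips lines that do not look like a JSON string.
$j = decode_json($_); => stores the data structure decoded from the JSON string in $j.
($s) = (split /\./, $j->{host})[0]; => splits the string kafka1.lulu.com on dots and stores only the first part in $s.

It can also be written in a shorter (and less readable) form: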
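
The steps listed above can be approximated without Perl as well. The sketch below follows the same sequence with awk: skip lines that do not start like a JSON object, locate the host value, and keep only the part before the first dot. The function name host_prefix is hypothetical. Unlike JSON::PP, this does not parse JSON; it assumes the "host" value is a plain string on a single line, as in the question's output.

```shell
# host_prefix (hypothetical helper): mirrors the Perl steps with awk.
#   /^[{]/ && NF > 1   keep only lines that start with { and contain "host":"
#   -F'"host":"'       split the line so that $2 begins with the host value
#   split(...)         cut $2 at the first dot or quote and print the first piece
host_prefix() {
  awk -F'"host":"' '/^[{]/ && NF > 1 { split($2, parts, /[".]/); print parts[1] }'
}

# Shortened sample lines based on the question's output.
printf '%s\n' 'WATCHER::' '{"port":6667,"host":"kafka1.lulu.com","version":4}' | host_prefix
```

For the two sample lines this should print kafka1. For anything beyond this fixed output format, a real JSON parser such as the Perl version above is the safer choice.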

$zookeeper_command | perl -MJSON::PP=decode_json -wnE'say decode_json($_)->{host}=~s/\..*$//r if/^\{"/'

2 upvotes Polar Bear 9/3/2021 #4

You can use the following script to filter out the data of interest, which saves you from typing a long command line.

#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

use JSON;

while( <> ) {
    next unless /^\{.*?\}$/;   # skip all but JSON string
    
    my $data = from_json($_);  # restore data structure
    my $host = (split('\.',$data->{host}))[0]; # extract info of interest
    
    say $host;                 # output it
}

Run it as zookeeper-shell.sh 19.2.6.4 get /brokers/ids/1010 | script.pl

Note: make the script executable with chmod +x script.pl and store it in your $HOME/bin directory, which should be added to your $PATH variable.
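
The note above can be sketched as a small shell helper. The name install_script and its optional second argument are hypothetical; the default target of $HOME/bin follows the note, and the PATH change only affects the current shell session.

```shell
# install_script (hypothetical helper): copy a script into a bin directory,
# mark it executable, and put that directory on PATH for the current shell.
#   $1 = script file, $2 = target directory (defaults to $HOME/bin)
install_script() {
  script=$1
  bin_dir=${2:-"$HOME/bin"}
  mkdir -p "$bin_dir" || return 1
  cp "$script" "$bin_dir/" || return 1
  chmod +x "$bin_dir/$(basename "$script")" || return 1
  # Prepend the directory only if PATH does not already contain it.
  case ":$PATH:" in
    *":$bin_dir:"*) ;;
    *) PATH="$bin_dir:$PATH"; export PATH ;;
  esac
}
```

A call such as install_script script.pl would then allow the pipeline to end in | script.pl. To keep the PATH change across sessions, the same PATH line would have to go into a shell startup file such as ~/.profile, which is an assumption about the local setup.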
