Asked by: Hack-R Asked: 10/7/2020 Last edited by: Hack-R Updated: 10/7/2020 Views: 101
Conditionally grep value from JSON-like key-value pairs in Bash
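Q:
I'm working with an API that returns JSON data. The data is usually missing a few characters at the end, so technically it is "JSON-like", because it is slightly malformed.
I'm able to extract the field of interest from it in my Bash script using grep, like this: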
grep -Po '"username": *\K"[^"]*"' jsonraw > jsonclean
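This works fine even though the JSON is slightly truncated. The only problem is that it returns every record, whereas I want it to be conditional on another key-value pair.
For example, I want it to return the username value only if the activity_count field is >=1, and otherwise just skip the record. Some pseudocode expressing this might look like: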
if '"activity_count":' >=1 grep -Po '"username": *\K"[^"]*"' jsonraw > jsonclean
I realize that jq might be an easier option, but because of the malformed JSON data and for other reasons, I would prefer to stick with grep.
Sample data:
[
{"id":"37da1db11b6b4977902baa286f88bf05","activity_count":0,"blocked":false,"coverPhoto":"cb861013bdcc4e5f9e2a93394a7b4309","followed":true,"human":true,"integration":false,"joined":"20190602125229","muted":false,"name":"AV8R","rss":false,"private":false,"profilePhoto":"511d4625df2442fc9b02ab4279c28f09","subscribed":false,"username":"APALMER66","verified":false,"verifiedComments":false,"badges":[0],"score":"1.4k","interactions":259},{"id":"525f9e87bb2d4f4184d12037050afc8d","activity_count":2,"blocked":false,"coverPhoto":"b0bbb4dec22f40d6a347dfb666ff0158","followed":true,"human":true,"integration":false,"joined":"20200627154134","muted":false,"name":"DeziRay","rss":false,"private":false,"profilePhoto":"86627047425844fcbf921e53fc71d106","subscribed":false,"username":"Deziray","verified":false,"verifiedComments":false,"badges":[0],"score":"4.7k","interactions":259},
Expected output:
Deziray
A:
1 upvote
Charles Duffy
10/7/2020
#1
First (because it's easier), a jq answer:
jq -nr --stream '
fromstream(1|truncate_stream(inputs))
| select(.activity_count >= 1)
| .username
' <test.json
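Because it runs in streaming mode, it can handle even truncated documents: each array element is emitted as soon as it is complete, before the parser reaches the broken tail.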
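The same idea (emit each complete array element as it is parsed, and stop quietly at the truncated tail) can be sketched in Python with `json.JSONDecoder.raw_decode`. This is only an illustration of the streaming principle under stated assumptions, not a description of how jq works internally; the helper name `iter_array_items` and the shortened sample string are hypothetical.

```python
import json

def iter_array_items(text):
    """Yield each complete item of a top-level JSON array, stopping at truncation."""
    decoder = json.JSONDecoder()
    pos = text.index("[") + 1  # assumption: the document is a single top-level array
    while True:
        # raw_decode does not skip leading whitespace, so skip it (and commas) here.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text) or text[pos] == "]":
            return
        try:
            item, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            return  # the rest of the text is an incomplete item
        yield item

# Shortened, hypothetical sample: the third record is cut off mid-key.
truncated = (
    '[\n{"activity_count":0,"username":"APALMER66"},'
    '{"activity_count":2,"username":"Deziray"},'
    '{"activity_count":5,"user'
)

names = [
    obj["username"]
    for obj in iter_array_items(truncated)
    if isinstance(obj, dict) and obj.get("activity_count", 0) >= 1 and "username" in obj
]
print("\n".join(names))
```

Unlike jq's `--stream` mode, this sketch holds the whole input in memory; it only demonstrates that complete records can be recovered before the parser hits the damaged end.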
Comments
0 upvotes
Hack-R
10/7/2020
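This is great, thank you. If anyone has a grep answer, that would still be useful (I might have to move the green checkmark - but this works for now, thanks again).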
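For readers who want the pattern-matching style of the original grep command, the condition can be folded into a single regular expression. The following is a minimal sketch in Python rather than grep, and it rests on two assumptions that hold for the sample data but are not guaranteed in general: "activity_count" appears before "username" within each record, and records contain no nested braces. The names `RECORD` and `active_usernames` are hypothetical.

```python
import re

# Assumptions: "activity_count" precedes "username" inside each object,
# and objects contain no nested { } (true for the sample data only).
RECORD = re.compile(r'"activity_count":\s*(\d+)[^{}]*?"username":\s*"([^"]*)"')

def active_usernames(text, minimum=1):
    """Return usernames whose record has activity_count >= minimum."""
    return [name for count, name in RECORD.findall(text) if int(count) >= minimum]

# Shortened, hypothetical sample in the same shape as the question's data.
sample = (
    '[\n{"id":"a","activity_count":0,"username":"APALMER66","badges":[0]},'
    '{"id":"b","activity_count":2,"username":"Deziray","badges":[0]},'
)
print("\n".join(active_usernames(sample)))
```

Because `[^{}]*?` cannot cross a brace, a count from one record is never paired with a username from the next. A regex like this is still more fragile than a real JSON parser if the field order changes.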
1 upvote
Charles Duffy
10/7/2020
#2
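As a native Python implementation, relying on a modern Python 3.x runtime:
#!/usr/bin/env python3
import json, sys

def found_obj_cb(item):
    # Called by the decoder for every JSON object as soon as it is fully parsed.
    if item.get('activity_count', 0) >= 1 and 'username' in item:
        print(item['username'])
    return item

try:
    json.load(sys.stdin, object_hook=found_obj_cb)
except json.JSONDecodeError:
    pass  # truncated input: the complete objects were already handled by the hook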
...used from a shell as:
#!/usr/bin/env bash
json_parse_py=$(cat <<'EOF'
import json, sys

def found_obj_cb(item):
    if item.get('activity_count', 0) >= 1 and 'username' in item:
        print(item['username'])
    return item

try:
    json.load(sys.stdin, object_hook=found_obj_cb)
except json.JSONDecodeError:
    pass
EOF
)

# define a shell function to wrap the Python code
json_parse() { python3 -c "$json_parse_py" "$@"; }

# actually call it, with test.json on stdin
json_parse <test.json