Asked by: Deamoon Asked: 9/2/2022 Last edited by: Deamoon Updated: 9/5/2022 Views: 60
Restructure JSON data
Q:
I have a JSON with the following structure:
{
"id": 2,
"image_id": 2,
"segmentation": [
[
913.0,
659.5,
895.0,
],
[
658.5,
875.0,
652.5,
659.5
],
],
"iscrowd": 0,
"bbox": [
4.5,
406.5,
1098.0,
1096.0
],
"area": 579348.0,
"category_id": 0
},
Now I need to split each entry into two separate entries, as follows:
{
"id": 2,
"image_id": 2,
"segmentation": [
[
658.5,
875.0,
652.5,
659.5
],
],
"iscrowd": 0,
"bbox": [
4.5,
406.5,
1098.0,
1096.0
],
"area": 579348.0,
"category_id": 0
},
{
"id": 3,
"image_id": 2,
"segmentation": [
[
913.0,
659.5,
895.0,
],
],
"iscrowd": 0,
"bbox": [
4.5,
406.5,
1098.0,
1096.0
],
"area": 579348.0,
"category_id": 0
},
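So every new entry has the same image_id, iscrowd, bbox, area and category_id as the original entry, but gets a new (incremental) id and contains only one segment: []. If the original entry has 15 segmentations, the code should split it into 15 entries with unique IDs.
Any hints? I found some posts about how to merge based on key values, but none about how to split.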
Answers:
0 votes
iamtrappedman
9/5/2022
#1
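import json

# original_json is assumed to be the already-loaded list of annotation dicts,
# e.g. the result of json.load() on the source file.
new_json = []
ids = 0  # running counter so every split entry gets a unique id
for i in original_json:
    segms = i["segmentation"]
    for j in segms:
        dummy = dict(i)  # shallow copy of all keys of the original entry
        dummy["id"] = ids
        dummy["segmentation"] = [j]  # one segment, still wrapped in a list as in the question
        ids += 1
        new_json.append(dummy)

with open("new_json_file.json", "w") as f:
    json.dump(new_json, f)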
Hope this helps.
Comments
0 votes
Deamoon
9/5/2022
Thanks. I tried it, but I get a "list indices must be integers or slices, not str" error on the line segms = original_json["segmentation"]
0 votes
iamtrappedman
9/5/2022
How are you trying to read the original JSON file? Try segms = original_json[0]["segmentation"]
0 votes
Deamoon
9/5/2022
Reading the original JSON with segms = original_json[0]["segmentation"] works, but the output JSON has the same structure, with the segments not split.
0 votes
Deamoon
9/5/2022
So it seems that segms = original_json[0]["segmentation"] makes segms just the list of floats of the first segmentation (from my example):
segms = [[658.5, 875.0, 652.5, 659.5]]
0 votes
iamtrappedman
9/5/2022
Can you share your new output? I also didn't do anything to increment the ID; you need to implement that as well.
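The error reported in the comments above is consistent with the file's top level being a JSON array: json.load then returns a Python list, which cannot be indexed with a string key. A small sketch of this, using a made-up inline document (an assumption, not the asker's actual file):

```python
import json

# Hypothetical stand-in for the asker's file: a top-level array of annotations.
raw = '[{"id": 0, "segmentation": [[1.0, 2.0], [3.0, 4.0]]}]'
original_json = json.loads(raw)

# The top level is a list, so indexing it with a string key raises TypeError.
try:
    original_json["segmentation"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Indexing with [0] reaches only the first annotation ...
first_segms = original_json[0]["segmentation"]

# ... whereas iterating visits every annotation in the file.
all_segms = [entry["segmentation"] for entry in original_json]
```

Using original_json[0] silences the error but only ever looks at one annotation, which is why looping over the list (as in the answer's for i in original_json) is probably the safer reading of the data.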
0 votes
Deamoon
9/5/2022
#2
So, the code provided by @iamtrappedman works:
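import json

test_loc = "/content/TEST.json"
with open(test_loc) as j_f:
    original_json = json.load(j_f)

# Only the first annotation's segments are read here.
segms = original_json[0]["segmentation"]
new_json = []
for i in segms:
    # This overwrites the same object on every pass and appends the whole
    # original_json list again, not a copy of one entry.
    original_json[0]["segmentation"] = i
    new_json.append(original_json)

with open("new_json_file.json", "w") as f:
    json.dump(new_json, f, indent=4)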
If I input the following JSON:
[
{
"id": 0,
"image_id": 0,
"segmentation": [
[
465.0,
1198.5,
432.0,
1190.5
],
[
525.0,
2424.5,
1257.0,
2578.5
]
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
}
]
I get a split JSON, but the two entries are identical:
[
[
{
"area": 425968.25,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"category_id": 0,
"id": 0,
"image_id": 0,
"iscrowd": 0,
"segmentation": [
525.0,
2424.5,
1257.0,
2578.5
]
}
],
[
{
"area": 425968.25,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"category_id": 0,
"id": 0,
"image_id": 0,
"iscrowd": 0,
"segmentation": [
525.0,
2424.5,
1257.0,
2578.5
]
}
]
]
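Identical entries are what one would expect when the same object is appended repeatedly: list.append stores a reference, not a snapshot, so a later assignment to the "segmentation" key also shows up in the items appended earlier. A minimal sketch of the effect, and of one possible fix with copy.deepcopy, using a made-up two-segment entry (the values are illustrative only):

```python
import copy

entry = {"id": 0, "segmentation": [[1.0, 2.0], [3.0, 4.0]]}
segms = entry["segmentation"]

# Appending the same dict on every pass: both list items are one object,
# so both end up showing the last segment that was assigned.
aliased = []
for seg in segms:
    entry["segmentation"] = seg
    aliased.append(entry)

# Appending an independent copy per segment keeps each value separate.
entry = {"id": 0, "segmentation": [[1.0, 2.0], [3.0, 4.0]]}
copied = []
for seg in entry["segmentation"]:
    snapshot = copy.deepcopy(entry)
    snapshot["segmentation"] = [seg]  # keep the nested-list shape
    copied.append(snapshot)
```

Here aliased[0] and aliased[1] are the very same dict, while copied holds two distinct dicts, one per segment.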
Edit: Now, for a JSON with two annotations:
[
{
"id": 0,
"image_id": 0,
"segmentation": [
[
465.0,
1198.5,
432.0,
1190.5
],
[
525.0,
2424.5,
1257.0,
2578.5
]
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
},
{
"id": 1,
"image_id": 2,
"segmentation": [
[
4241.0,
14.5,
141.0,
7557.5
],
[
578.0,
2424.5,
141.0,
965.5
]
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
}
]
it does not split the annotations, but duplicates them:
[
[
{
"id": 0,
"image_id": 0,
"segmentation": [
525.0,
2424.5,
1257.0,
2578.5
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
},
{
"id": 1,
"image_id": 2,
"segmentation": [
[
4241.0,
14.5,
141.0,
7557.5
],
[
578.0,
2424.5,
141.0,
965.5
]
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
}
],
[
{
"id": 0,
"image_id": 0,
"segmentation": [
525.0,
2424.5,
1257.0,
2578.5
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
},
{
"id": 1,
"image_id": 2,
"segmentation": [
[
4241.0,
14.5,
141.0,
7557.5
],
[
578.0,
2424.5,
141.0,
965.5
]
],
"iscrowd": 0,
"bbox": [
0.5,
407.5,
869.0,
791.0
],
"area": 425968.25,
"category_id": 0
}
]
]
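For a file with several annotations, one possible approach is to loop over every annotation and every segment, copying the entry each time. The sketch below is an illustration under stated assumptions, not a solution confirmed in this thread: the function name split_annotations and the start_id parameter are hypothetical, the sample coordinates are shortened, and ids are simply renumbered from start_id.

```python
import copy


def split_annotations(annotations, start_id=0):
    """Return one entry per segment, renumbering ids from start_id."""
    result = []
    next_id = start_id
    for annotation in annotations:  # every annotation, not just index 0
        for segment in annotation["segmentation"]:
            entry = copy.deepcopy(annotation)  # independent copy per segment
            entry["id"] = next_id
            entry["segmentation"] = [segment]  # keep the nested-list shape
            result.append(entry)
            next_id += 1
    return result


# Shortened, made-up sample with two annotations of two segments each.
annotations = [
    {"id": 0, "image_id": 0, "segmentation": [[465.0, 1198.5], [525.0, 2424.5]]},
    {"id": 1, "image_id": 2, "segmentation": [[4241.0, 14.5], [578.0, 2424.5]]},
]
split = split_annotations(annotations)
```

With this input, split holds four entries with ids 0 to 3, each keeping the image_id of the annotation it came from, and the input list is left unmodified. The result could then be written out with json.dump(split, f, indent=4).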
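Comments
0 votes
Deamoon
9/5/2022
Now this works for the example JSON. In this case, segms is a list of 2 values, i.e. 2 sublists. If there are multiple annotations, only the first one is considered.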
0 votes
Deamoon
9/5/2022
The code should be indented one level further.
0 votes
iamtrappedman
9/5/2022
Thanks for providing the output; now please provide your code as well, and if possible an example with multiple annotations.
0 votes
Deamoon
9/5/2022
Added the code and an example, @iamtrappedman