How to optimize pre-processing json to dataframes for this case

Question:

To convert each record from JSON into a DataFrame as described above, I wrote the following code:

import json
import pandas as pd
import time
with open('color.json', 'r') as f:
    json_color = json.load(f)

df=pd.DataFrame(json_color)
start = time.time()
new_df=pd.DataFrame()
index_list=[]
for i in range(0,len(df)):
    for j,(key,value) in enumerate(df['result'][i]['RGB'].items()):
        key=key.split(',')
        df2 = pd.DataFrame(data=[[key[0],key[1],key[2],value]])
        index_list.append(df['result'][i]['name'][:-4]+'_'+str(j))
        new_df=pd.concat([new_df,df2])
new_df.index=index_list
new_df.columns=['R','G','B','percentile']
print(new_df)
print(time.time()-start)

It works, but I would like to make this code more efficient. Any suggestions?

Tags: python json pandas dataframe

>>> df
                                R    G    B  percentile
vogue_S_21_fashion-east_56_0   23   23   23       81.73
vogue_S_21_fashion-east_56_1  104  108  109       17.11
vogue_S_21_fashion-east_56_2  223  228  233        1.01
vogue_S_21_fashion-east_56_3  142  134   82        0.15

>>> json_color
{'name': 'vogue_S_21_fashion-east_56.jpg',
 'RGB': {'23,23,23': 81.73,
  '104,108,109': 17.11,
  '223,228,233': 1.01,
  '142,134,82': 0.15}}

However, if your json_color looks like this:

>>> json_color
{'result': [{'name': 'vogue_S_21_fashion-east_56.jpg',
   'RGB': {'23,23,23': 81.73,
    '104,108,109': 17.11,
    '223,228,233': 1.01,
    '142,134,82': 0.15}}]}

You can use:

data = {}
for record in json_color['result']:
    name = record['name'].rsplit('.', maxsplit=1)[0]
    for j, (vals, P) in enumerate(record['RGB'].items()):
        R, G, B = vals.split(',')
        data[f'{name}_{j}'] = [int(R), int(G), int(B), P]

df = pd.DataFrame.from_dict(data, columns=['R', 'G', 'B', 'percentile'], orient='index')
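The speed-up here most likely comes from building the DataFrame once at the end instead of calling pd.concat for every row, which copies the growing frame on each iteration. As a minimal, self-contained sketch of the same idea, the version below collects plain lists and makes a single pd.DataFrame call. The helper name colors_to_frame is hypothetical, and the sample data is copied from the example above rather than loaded from color.json.

```python
import pandas as pd

# Sample input copied from the example above; real data would come from json.load().
json_color = {
    'result': [
        {
            'name': 'vogue_S_21_fashion-east_56.jpg',
            'RGB': {
                '23,23,23': 81.73,
                '104,108,109': 17.11,
                '223,228,233': 1.01,
                '142,134,82': 0.15,
            },
        }
    ]
}


def colors_to_frame(payload):
    """Hypothetical helper: flatten the 'result' records into one DataFrame."""
    index, rows = [], []
    for record in payload['result']:
        # Drop the file extension, e.g. 'x.jpg' -> 'x'.
        name = record['name'].rsplit('.', maxsplit=1)[0]
        for j, (rgb, pct) in enumerate(record['RGB'].items()):
            r, g, b = (int(part) for part in rgb.split(','))
            index.append(f'{name}_{j}')
            rows.append((r, g, b, pct))
    # Build the frame once at the end instead of calling pd.concat per row.
    return pd.DataFrame(rows, index=index, columns=['R', 'G', 'B', 'percentile'])


df = colors_to_frame(json_color)
print(df)
```

Collecting rows in lists rather than in a dict keyed by label also keeps every row when two records happen to share the same name, since a dict would silently overwrite the earlier entry.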
