Asked by: Hemant Kumar  Asked: 5/26/2019  Updated: 5/26/2019  Views: 2920
Add custom title to a data-frame in Pandas and convert it to HTML
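Q:
I am reading certain CSV files from two directories, actual_results and expected_results. For each CSV in actual_results, I compare it with the corresponding CSV in expected_results. I then want to display the whole data set as HTML, as shown below.
I have written some code that cleans the data and then compares the DataFrames of the actual and expected CSVs.
Here is the whole code: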
import pandas as pd
import sys
from glob import glob
import os
import itertools

# compare takes two lists of CSV file names (expected, actual) and compares them pairwise
def compare(expectedList,actualList):
    ctr=0
    dfList = list()
    for (csv1,csv2) in itertools.zip_longest(expectedList,actualList):
        df1_ctr=pd.read_csv(csv1,sep=',')
        df1_ctr[df1_ctr.columns[1:]] = [x.split('\t') for x in df1_ctr['mean(ms)']]
        df1=df1_ctr.apply(pd.to_numeric,errors='coerce')
        df2_ctr=pd.read_csv(csv2,sep=',')
        df2_ctr[df2_ctr.columns[1:]] = [x.split('\t') for x in df2_ctr['mean(ms)']]
        df2=df2_ctr.apply(pd.to_numeric,errors='coerce')
        print("Dataframe for Expected List for file : {} is \n {}".format(csv1,df1))
        print("Dataframe for Actual List for file: {} is \n {}".format(csv2,df2))
        d3=df1.loc[:,:] # Dataframe 1
        d4=df2.loc[:,:] # Dataframe 2
        d5=abs(((d3.subtract(d4))/d3)*100)
        print("Deviation between file {} and {} is :\n {}".format(csv1,csv2,d5))
        ctr=ctr+1
        # Final data frame: expected, actual and deviation stacked on top of each other
        df=pd.concat([df1,df2,d5])
        #print("{}".format(df))
        dfList.append(df)
    #print("Final Data frame: \n{}".format(dfList))
    # for data in dfList:
    #     print("data at index: \n{}".format(data))

if __name__ == "__main__":
    #file1=sys.argv[1] # FileName1
    #file2=sys.argv[2] # FileName2
    #compareCSV(file1,file2) # Compare CSV files passed in as parameters
    os.chdir("expected_results")
    expectedCSVs=glob("*.csv")
    #print(expectedCSVs)
    os.chdir("../actual_results")
    actualCSVs=glob("*.csv")
    #print(actualCSVs)
    compare(expectedCSVs,actualCSVs)
I currently have some redundant print statements. The output of the above code is as follows:
Dataframe for Expected List for file : CT_QRW_25.csv is
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN 0.038973 0.044939 0.091076 0.363859 1760108
1 NaN 0.050652 0.044963 0.094738 0.402525 1354233
2 NaN 0.046500 0.045020 0.108138 0.320636 123448
3 NaN 1.872630 0.599966 33.313200 172.040000 21954617
4 NaN 37.752900 0.600484 603.063000 805.340000 2708258
Dataframe for Actual List for file: CT_QRW_25.csv is
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN 0.038973 0.044939 0.091076 0.363859 1760108
1 NaN 0.050652 0.044963 0.094738 0.402525 1354233
2 NaN 0.046500 0.045020 0.108138 0.320636 123448
3 NaN 1.872630 0.599966 33.313200 172.040000 21954617
4 NaN 37.752900 0.600484 603.063000 805.340000 2708258
Deviation between file CT_QRW_25.csv and CT_QRW_25.csv is :
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN 0.0 0.0 0.0 0.0 0.0
1 NaN 0.0 0.0 0.0 0.0 0.0
2 NaN 0.0 0.0 0.0 0.0 0.0
3 NaN 0.0 0.0 0.0 0.0 0.0
4 NaN 0.0 0.0 0.0 0.0 0.0
Dataframe for Expected List for file : CT_W_14.csv is
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
4 NaN 97.8025 17.8492 725.619 891.455 5304765.0
Dataframe for Actual List for file: CT_W_14.csv is
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
4 NaN 97.8025 17.8492 725.619 891.455 5304765.0
Deviation between file CT_W_14.csv and CT_W_14.csv is :
100%Q mean(ms) P50(ms) P99(ms) p99.9(ms) #Samples
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
4 NaN 0.0 0.0 0.0 0.0 0.0
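Goal: Since everything I have now is print statements, I cannot make the output dynamic if I want to convert it to HTML. My goal is to write this output to an HTML file. Alternatively, a custom way to add a row to the DataFrame as a title would also work. In addition, if a deviation is greater than 10%, I want to show that cell in red. If anyone has dealt with this before, please help; any help would be greatly appreciated.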
A:
3 votes
pnovotnyq
5/26/2019
#1
Pandas has a special Styler object that can be exported to HTML with its .render method, or to Excel with .to_excel. You can use CSS to format the table and add a caption, like this:
def highlight_high(series, threshold, colour):
    return ['background-color:'+ colour.lower() if threshold <= i else 'background-color: white' for i in series]

# df.style.apply creates a pandas.io.formats.style.Styler object from a DataFrame
highlighted = df.style.apply(highlight_high, axis=0, subset=pd.IndexSlice[:,'P50(ms)'], colour = 'red', threshold = 0.5)
# adding a caption
highlighted = highlighted.set_caption('Highlighted P50')
# render() generates the HTML for the Styler object
with open('table.html', 'w') as f:
    f.write(highlighted.render())
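I wasn't sure what to colour, so I picked your Dataframe for Actual List. Styler.apply is used for Series/DataFrame-wise styling and Styler.applymap for element-wise styling. The input and output shapes of the style function must match.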
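- Select the cells to style with subset and pd.IndexSlice
- Set the threshold with threshold
- Choose the HTML colour with colour
- Add a caption with .set_caption
- Export to HTML with .render, or to Excel with .to_excel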
My result: