Asked by: darklord84 Asked: 4/14/2021 Updated: 4/14/2021 Views: 586
How to see the full output of the correlation method in Python
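Q:
I am trying to find the correlation between all columns of the mushroom dataset. When I run the correlation method on the columns, I get some correlation values, but for many columns the values are hidden by "...". How can I see these values?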
import pandas as pd
import numpy as np
from sklearn import preprocessing
df = pd.read_csv("mushrooms.csv")
print(df.head())
# Encode every categorical column as integers so that corr() can be computed
le = preprocessing.LabelEncoder()
for col in df.columns:
    df[col] = le.fit_transform(df[col])
df.head()
correlation_df = df.corr()
print(correlation_df)
Output:
As you can see, the column data after cap-color is shown as "...". There are about 23 columns, but I can only see correlation data for about 8 of them.
class 1.000000 0.052951 0.178446 -0.031384 ... -0.411771 0.171961 0.298686 0.217179
cap-shape 0.052951 1.000000 -0.050454 -0.048203 ... -0.025457 -0.073416 0.063413 -0.042221
cap-surface 0.178446 -0.050454 1.000000 -0.019402 ... -0.106407 0.230364 0.021555 0.163887
cap-color -0.031384 -0.048203 -0.019402 1.000000 ... 0.162513 -0.293523 -0.144770 0.033925
bruises -0.501530 -0.035374 0.070228 -0.000764 ... 0.692973 -0.285008 0.088137 -0.075095
odor -0.093552 -0.021935 0.045233 -0.387121 ... -0.281387 0.469055 -0.043623 -0.026610
gill-attachment 0.129200 0.078865 -0.034180 0.041436 ... -0.146689 -0.029524 0.165575 -0.030304
gill-spacing -0.348387 0.013196 -0.282306 0.144259 ... -0.195897 0.047323 -0.529253 -0.154680
gill-size 0.540024 0.054050 0.208100 -0.169464 ... -0.460872 0.622991 0.147682 0.161418
gill-color -0.530566 -0.006039 -0.161017 0.084659 ... 0.629398 -0.416135 -0.034090 -0.202972
stalk-shape -0.102019 0.063794 -0.014123 -0.456496 ... -0.291444 0.258831 0.087383 -0.269216
stalk-root -0.379361 0.030191 -0.126245 0.321274 ... 0.210155 -0.536996 -0.306747 -0.007668
stalk-surface-above-ring -0.334593 -0.030417 0.089090 -0.060837 ... 0.390091 0.100764 0.079604 -0.058076
stalk-surface-below-ring -0.298801 -0.032591 0.107965 -0.047710 ... 0.394644 0.130974 0.046797 -0.039628
stalk-color-above-ring -0.154003 -0.031659 0.066050 0.002364 ... -0.048878 0.271533 -0.240261 0.042561
stalk-color-below-ring -0.146730 -0.030390 0.068885 0.008057 ... -0.034284 0.254518 -0.242792 0.041594
veil-type NaN NaN NaN NaN ... NaN NaN NaN NaN
veil-color 0.145142 0.072560 -0.016603 0.036130 ... -0.143673 -0.003600 0.124924 -0.040581
ring-number -0.214366 -0.106534 -0.026147 -0.005822 ... 0.058312 0.338417 -0.242020 0.235835
ring-type -0.411771 -0.025457 -0.106407 0.162513 ... 1.000000 -0.487048 0.211763 -0.212080
spore-print-color 0.171961 -0.073416 0.230364 -0.293523 ... -0.487048 1.000000 -0.126859 0.185954
population 0.298686 0.063413 0.021555 -0.144770 ... 0.211763 -0.126859 1.000000 -0.174529
habitat 0.217179 -0.042221 0.163887 0.033925 ... -0.212080 0.185954 -0.174529 1.000000
A:
0 votes
Sulphur
4/14/2021
#1
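Regarding the part "for many columns the values are hidden by '...'. How can I see these values":
This happens because, by default, pandas hides columns when there are too many to display. I'm not sure which of print(df.head()), df.head(), or print(correlation_df) your output corresponds to, but you need to look at using .iloc to view the columns in smaller groups.
Example: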
# df is the dataframe with all columns
df_1 = df.iloc[:,0:11] # all rows of column 0-10
df_2 = df.iloc[:,11:21] # all rows for columns 11-20
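Besides slicing, pandas display settings can lift the truncation itself. The following is a minimal sketch, not part of the original answer: it uses a synthetic 23-column frame as a stand-in for the encoded mushroom data (the names demo_df, corr, and c0..c22 are placeholders), and it shows three ways to see every column of the correlation matrix.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the encoded mushroom data: 23 integer columns.
# The names c0..c22 are placeholders, not the real dataset's column names.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(
    rng.integers(0, 5, size=(50, 23)),
    columns=[f"c{i}" for i in range(23)],
)
corr = demo_df.corr()

# Option 1: lift the column limit, but only inside this block.
# A large display.width keeps pandas from wrapping the wide table.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    print(corr)

# Option 2: render the whole frame to a string; to_string() does not
# apply the display.max_columns limit.
full_text = corr.to_string()
print(full_text)

# Option 3: print the matrix a few columns at a time with .iloc.
chunk = 8  # arbitrary chunk size chosen for this sketch
for start in range(0, corr.shape[1], chunk):
    print(corr.iloc[:, start:start + chunk])
```

To make the first option permanent for a session, pd.set_option("display.max_columns", None) can be called once near the top of the script. With the real data, correlation_df would take the place of corr.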
0 votes
Scrapper
4/14/2021
#2
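Alternatively, you can use a heatmap to get a clear view of the correlations. You get a color-coded correlation plot, which makes the relationships much easier to read.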
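import seaborn as sns
import matplotlib.pyplot as plt
# Large figure so that the annotated cells of the 23 x 23 matrix stay readable
f, ax = plt.subplots(figsize=(20, 20))
sns.heatmap(df.corr(), annot=True, ax=ax)
plt.show()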