How to calculate a t-test for the difference of means to assess which algorithm achieves a higher F1 score?

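Question:

I am working on a project where I need to find out which classifier performs better in terms of F1 score. I am running a t-test for the difference of means to assess which algorithm achieves the higher F1 score.

I have the F1 scores of the two classifiers: algo_A: 0.589744 and algo_B: 0.641026.

Below is the code I am using for this, but all it gives me is NaN. How can I fix this?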

from scipy import stats
t_value,p_value=stats.ttest_ind(f1_score_Algo_A,f1_score_Algo_B)
print('Test statistic is %f'%float("{:.6f}".format(t_value)))
print('p-value for two tailed test is %f'%p_value)

I get the following output:

Test statistic is nan
p-value for two tailed test is nan

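My expected result is to find out which algorithm performs better, based on the t-test statistic and the p-value.

Python, machine learning, scipy, statistics, hypothesis testing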

Answer:

A t-test compares the means of two samples and needs more than one observation per sample. With a single F1 score per classifier the sample variance is undefined, which is why ttest_ind returns NaN. Collect many scores per model instead, for example over repeated random train-test splits:

from sklearn.metrics import f1_score
from scipy import stats

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from mlxtend.data import iris_data
from sklearn.model_selection import train_test_split

algo_A = LogisticRegression(random_state=1, max_iter=1000)  # try your algos / models
algo_B = DecisionTreeClassifier(random_state=1, max_depth=3)

X, y = iris_data() # try your dataset

f1_scores_Algo_A, f1_scores_Algo_B = [], []

for i in range(100):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

    y_pred = algo_A.fit(X_train, y_train).predict(X_test)
    f1_scores_Algo_A.append(f1_score(y_test, y_pred, average='micro'))

    y_pred = algo_B.fit(X_train, y_train).predict(X_test)
    f1_scores_Algo_B.append(f1_score(y_test, y_pred, average='micro'))

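The figure (not reproduced here) shows the distribution of the F1 scores the two models obtained over the different train-test splits.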
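
To see why one score per model is not enough while a list of scores is, the statistic can be worked out by hand. The sketch below is not part of the original answer. It assumes the pooled-variance (Student) form that stats.ttest_ind uses by default, and pooled_t_statistic and the multi-split score lists are made-up names and values for illustration.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    # statistics.variance is the sample variance (divides by n - 1), so it
    # raises StatisticsError when a sample has fewer than two values.
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# One score per model, as in the question: the variance is undefined.
try:
    pooled_t_statistic([0.589744], [0.641026])
    single_score_ok = True
except statistics.StatisticsError:
    single_score_ok = False

# Hypothetical scores from several splits: the statistic is well defined.
scores_a = [0.95, 0.97, 0.92, 0.97, 0.95]
scores_b = [0.92, 0.95, 0.89, 0.95, 0.92]
t_stat = pooled_t_statistic(scores_a, scores_b)

print("single score per model works:", single_score_ok)
print("t statistic from several scores: %f" % t_stat)
```

The sign of the statistic follows mean(a) - mean(b), so a positive value means the first list has the higher average.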

Now we can run the t-test. Note that stats.ttest_ind is the independent two-sample test, not a paired one:

t_value, p_value = stats.ttest_ind(f1_scores_Algo_A, f1_scores_Algo_B)
print('Test statistic is %f' % t_value)
# Test statistic is 2.457321
print('p-value for two tailed test is %f' % p_value)
# p-value for two tailed test is 0.014858
# (exact values vary from run to run because the splits are random)

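So we can reject the null hypothesis (that the 2 independent samples have identical average scores) at the 5% significance level. The positive test statistic means that Algo A had the higher mean F1 score in this run.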
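
Because both models are scored on the same train-test split in every iteration, the scores come in pairs, so a paired test is a natural alternative to ttest_ind. Below is a minimal sketch using scipy.stats.ttest_rel. The score lists are hypothetical stand-ins for the ones built in the loop, and the 0.05 threshold and verdict strings are assumptions for illustration. Either test treats the splits as independent draws, which overlapping random splits only approximate, so the p-value is best read as a rough guide.

```python
from scipy import stats

# Hypothetical F1 scores; index i in both lists comes from the same split.
scores_a = [0.95, 0.97, 0.92, 0.97, 0.95, 0.94, 0.97, 0.92]
scores_b = [0.92, 0.95, 0.89, 0.97, 0.92, 0.95, 0.95, 0.89]

# Paired t-test: is the mean per-split difference (A - B) zero?
t_value, p_value = stats.ttest_rel(scores_a, scores_b)

# The direction comes from the sign of the mean difference.
mean_diff = sum(a - b for a, b in zip(scores_a, scores_b)) / len(scores_a)

if p_value >= 0.05:
    verdict = "no significant difference"
elif mean_diff > 0:
    verdict = "Algo A scores higher"
else:
    verdict = "Algo B scores higher"

print(f"t = {t_value:.3f}, p = {p_value:.4f}, {verdict}")
```

With the real lists, pass f1_scores_Algo_A and f1_scores_Algo_B in place of the hypothetical ones; they must have the same length and the same split order.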
