提问人:Chris 提问时间:11/1/2023 最后编辑:mkrieger1Chris 更新时间:11/2/2023 访问量:65
LogisticRegression 完全准分离
LogisticRegression complete quasi-separation
问:
为了计算倾向分数,我想估计横截面二元响应回归模型 使用 statsmodel LogisticRegression 逐年计算。作为解释变量,我考虑了公司特征,治疗组说明了是否在样本中。估计结果令人困惑,表明可能完全准分离。
如何解决模型评估指标不佳和完全准分离的问题?
df = pd.read_excel("Posthoc/PSM_firms_combined.xlsx")
X_year = df[df['Year'] == 2015][['Total Assets', 'Growth', 'Price to Book Value per Share', 'Total Debt to Common Equity']]
y_year = df[df['Year'] == 2015]['Treatment']
logit_model = sm.Logit(y_year, X_year)
results = logit_model.fit()
print(results.summary())
# Obtain the chi-squared statistic
chi_squared = results.llr
print("Chi-Squared Statistic:", chi_squared)
# Calculate McFadden's R-squared
log_likelihood_model = results.llf # Log-likelihood of the model
log_likelihood_null = results.llnull # Log-likelihood of a null model
mcfadden_r2 = 1 - (log_likelihood_model / log_likelihood_null)
print("McFadden's R-squared:", mcfadden_r2)
# Obtain AUC-ROC
y_pred_prob = results.predict(X_year)
auc_roc = roc_auc_score(y_year, y_pred_prob)
print("AUC-ROC:", auc_roc)
我尝试了没有成功和 FirthLogisticRegression。logit_model.fit(method = 'bfgs')
答: 暂无答案
评论