Asked by: Philby_Walsh Asked: 10/14/2023 Updated: 10/31/2023 Views: 44
Iteration through matrix of sports players + probabilities
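Q:
Goal
I want to populate a grid that holds the probability of a player receiving a yellow card in a football ("soccer") match, depending on 3 factors:
player_rate: the average rate at which the player picks up yellow cards (e.g. a value of 0.5 means he picks up half a yellow card per match on average or, in practice, one every 2 matches)
team_rate: the amount above or below the average rate at which a team causes opposing players to be booked (e.g. a value of 0.10 means you are 10% more likely than average to be booked when playing against them, while -0.10 means you are 10% less likely than average)
opponent_player_rate: the amount above or below the average rate at which players facing this direct opponent pick up yellow cards (e.g. a value of 0.05 means you are 5% more likely to be booked against this particular opponent than against an "average" opponent)
I am trying to use these as the basis for the formula below, which calculates each player's likelihood of being booked once the 3 variables above are taken into account: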
adjusted_likelihood = player_rate * (1 + team_rate + opponent_player_rate)
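As a quick sanity check on the formula, here is a minimal sketch that plugs in the example values from the definitions above. The function name adjusted_likelihood is only for illustration and does not appear in the code further down.

```python
import math


def adjusted_likelihood(player_rate, team_rate, opponent_player_rate):
    """Scale a player's base yellow-card rate by the two relative adjustments."""
    return player_rate * (1 + team_rate + opponent_player_rate)


# Example values taken from the three definitions above.
example = adjusted_likelihood(player_rate=0.5, team_rate=0.10, opponent_player_rate=0.05)
print(round(example, 3))  # 0.575

# A negative team adjustment lowers the base rate instead.
lower = adjusted_likelihood(player_rate=0.5, team_rate=-0.10, opponent_player_rate=0.0)
assert math.isclose(lower, 0.45)
```

So a player with a base rate of 0.5, facing a team that is 10% above average and a direct opponent who is 5% above average, ends up at roughly 0.575 per match.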
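Using the code below, I approached this by building a dataframe that matches each of the 449 players against every other player (e.g. take the row for a "player" and read across: each column gives the probability of that player being booked when playing against the player named in that column).
Problem
When I run this code, it only calculates and fills in the values for the first row (i.e. index = 0, "player" = "Max Aarons"). I want it to compute the full matrix of values, so that I can tell how likely any player is to be booked when up against any particular opposing player.
Note: some NaN values in row 1 are expected (e.g. for column = "Bénie Adama Traore", because I have not yet created "team_rates" for his subset of teams).
Code: to reproduce the output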
import pandas as pd
from itertools import product
def calculate_adjusted_likelihood(player_rates, team_rates, opponent_player_rates):
    adjusted_likelihoods = []
    for player_rate, team_rate, opponent_player_rate in zip(player_rates, team_rates, opponent_player_rates):
        print(f"Debug: Player Rate ({player_name}): {player_rate}, Team Rate: {team_rate}, Opponent Player Rate: {opponent_player_rate}")
        adjusted_likelihood = player_rate * (1 + team_rate + opponent_player_rate)
        adjusted_likelihoods.append(adjusted_likelihood)
        print(f"Debug: Adjusted Likelihood ({player_name}): {adjusted_likelihood}")
    return adjusted_likelihoods
player_names = clubs_slim['player'].tolist()
player_rates = clubs_slim['Yellows per 90'].tolist()
# Load - AGAINST EACH PLAYER ROW - team names & average booking rates (for teams against them)
team_names = clubs_slim['squad'].tolist()
team_rates = clubs_slim['CrdY Against Avg relative'].tolist()
# construct the dataframe
df_likelihoods = pd.DataFrame(index=player_names, columns=player_names)
# Create list of opponent players % likelihood to cause a yellow
opponent_player_rates = clubs_slim['Yellow Threat Scaled'].tolist()
print("Length of player_names:", len(player_names))
print("Length of player_rates:", len(player_rates))
print("Length of team_names:", len(team_names))
print("Length of team_rates:", len(team_rates))
print("Length of opponent_player_rates:", len(opponent_player_rates))
# Call method to calculate adjusted likelihoods of being booked
adjusted_likelihoods = calculate_adjusted_likelihood(player_rates, team_rates, opponent_player_rates)
# Debugging print
print("Number of player names:", len(player_names))
print("Number of adjusted likelihoods:", len(adjusted_likelihoods))
print("Number of opponent_player_rates:", len(opponent_player_rates))
print("Adjusted Likelihoods:", adjusted_likelihoods)
# Populate DataFrame with adjusted likelihoods
for (player_name, opponent_name), likelihood in zip(product(player_names, repeat=2), adjusted_likelihoods):
    print(f"Player Name: {player_name}, Opponent Name: {opponent_name}")
    df_likelihoods.at[player_name, opponent_name] = likelihood
# Debugging print
print("Number of columns in df_likelihoods:", len(df_likelihoods.columns))
# Merge the columns into df_likelihoods
merged_df = clubs_slim[['player', 'squad', 'position', '90s', 'Yellows per 90', 'Yellow Threat']].merge(
    df_likelihoods.reset_index(),
    left_on='player',
    right_on='index'
)
merged_df.drop(['index'], axis=1, inplace=True)
# Debugging print
print("Number of columns in merged_df:", len(merged_df.columns))
merged_df.to_csv('booking_likelihood_dump.csv')
# Display the result
merged_df
Images of the dataframes
Dataframe: merged_df
Dataframe: clubs_slim
Dataframe: PL_2223_misc_against
A:
This is how I eventually got it working, using enumerate and nested for loops:
def calculate_adjusted_likelihood(player_rates, team_rates, opponent_player_rates):
    adjusted_likelihoods = []
    for i, player_rate in enumerate(player_rates):
        for j, team_rate in enumerate(team_rates):
            opponent_player_rate = opponent_player_rates[i % len(opponent_player_rates)]  # Use modulo for opponent player rate
            adjusted_likelihood = player_rate * (1 + team_rate + opponent_player_rate)
            adjusted_likelihoods.append(adjusted_likelihood)
    return adjusted_likelihoods
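For anyone wondering why the loop in the question stopped after the first row: zip() ends as soon as its shortest input runs out, and the original function returned one value per player rather than one per (player, opponent) pair. Here is a small sketch with made-up toy data (three hypothetical players in place of the 449) that reproduces the symptom and shows why a flat list of n * n values fixes it.

```python
from itertools import product

# Hypothetical toy data standing in for the real 449-player lists.
player_names = ["A", "B", "C"]
per_player_values = [0.1, 0.2, 0.3]  # one value per player: n items

# product(..., repeat=2) yields n * n (player, opponent) pairs in row-major order.
pairs = list(product(player_names, repeat=2))

# zip() stops at the shorter input, so only the first n pairs receive a value,
# and they all belong to the first player: the "only row 0 is filled" symptom.
filled = [pair for pair, _ in zip(pairs, per_player_values)]
print(len(pairs), len(filled))  # 9 3
print(filled)  # [('A', 'A'), ('A', 'B'), ('A', 'C')]

# With one value per pair (n * n items), every cell of the grid is covered.
per_pair_values = list(range(len(pairs)))  # placeholder values 0..8
grid = {name: {} for name in player_names}
for (player, opponent), value in zip(pairs, per_pair_values):
    grid[player][opponent] = value
```

The nested loops above produce exactly that kind of n * n list, in the same row-major order that product() uses, which is why the existing population loop then fills the whole dataframe.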
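Comments
j is never used. With itertools.cycle() and zip() you could also avoid indexing with i and % (this needs from itertools import cycle):
return [player_rate * (1 + team_rate + opponent_player_rate) for player_rate, opponent_player_rate in zip(player_rates, cycle(opponent_player_rates)) for team_rate in team_rates]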
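To check that the one-liner in the comment really matches the nested loops, here is a self-contained sketch that runs both on small made-up inputs. The function names nested_loops and with_cycle and the toy numbers are assumptions for illustration only. The opponent list is deliberately shorter than the player list so the wrap-around path is exercised.

```python
from itertools import cycle


def nested_loops(player_rates, team_rates, opponent_player_rates):
    """Nested-loop version from the answer, with the unused index j dropped."""
    out = []
    for i, player_rate in enumerate(player_rates):
        opponent_player_rate = opponent_player_rates[i % len(opponent_player_rates)]
        for team_rate in team_rates:
            out.append(player_rate * (1 + team_rate + opponent_player_rate))
    return out


def with_cycle(player_rates, team_rates, opponent_player_rates):
    """Comprehension from the comment: cycle() replaces the i % len(...) lookup."""
    return [
        player_rate * (1 + team_rate + opponent_player_rate)
        for player_rate, opponent_player_rate in zip(player_rates, cycle(opponent_player_rates))
        for team_rate in team_rates
    ]


# Hypothetical toy inputs, not taken from the real data.
player_rates = [0.5, 0.2, 0.1]
team_rates = [0.10, -0.10, 0.0]
opponent_player_rates = [0.05, 0.0]  # shorter on purpose, so index 2 wraps to 0.05

flat_loops = nested_loops(player_rates, team_rates, opponent_player_rates)
flat_cycle = with_cycle(player_rates, team_rates, opponent_player_rates)
assert flat_loops == flat_cycle
assert len(flat_cycle) == len(player_rates) * len(team_rates)
```

One thing to be aware of: both versions pick opponent_player_rate by the position of the row player, not the column opponent. If the column opponent's value is what is wanted, it would have to be looked up inside the inner loop alongside team_rate. That is a guess about the intent, not something the thread confirms.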