Asked by: kheder47 Asked: 2/3/2023 Last edited by: kheder47 Updated: 2/3/2023 Views: 44
Why are -math.inf and math.inf used in this minimax algorithm if these starting values are later replaced with the highest and lowest scores?
Q:
As you can see, this is the code for an unbeatable AI TicTacToe game (game.py is the main file):
player.py:
import math
import random


class Player():
    def __init__(self, letter):
        self.letter = letter

    def get_move(self, game):
        pass


class HumanPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        valid_square = False
        val = None
        while not valid_square:
            square = input(self.letter + '\'s turn. Input move (0-9): ')
            try:
                val = int(square)
                if val not in game.available_moves():
                    raise ValueError
                valid_square = True
            except ValueError:
                print('Invalid square. Try again.')
        return val


class RandomComputerPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        square = random.choice(game.available_moves())
        return square


class SmartComputerPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        if len(game.available_moves()) == 9:
            square = random.choice(game.available_moves())
        else:
            square = self.minimax(game, self.letter)['position']
        return square

    def minimax(self, state, player):
        max_player = self.letter  # yourself
        other_player = 'O' if player == 'X' else 'X'

        # first we want to check if the previous move is a winner
        if state.current_winner == other_player:
            return {'position': None, 'score': 1 * (state.num_empty_squares() + 1) if other_player == max_player else -1 * (
                        state.num_empty_squares() + 1)}
        elif not state.empty_squares():
            return {'position': None, 'score': 0}

        if player == max_player:
            best = {'position': None, 'score': -math.inf}  # each score should maximize
        else:
            best = {'position': None, 'score': math.inf}  # each score should minimize
        for possible_move in state.available_moves():
            state.make_move(possible_move, player)
            sim_score = self.minimax(state, other_player)  # simulate a game after making that move

            # undo move
            state.board[possible_move] = ' '
            state.current_winner = None
            sim_score['position'] = possible_move  # this represents the optimal next move

            if player == max_player:  # X is max player
                if sim_score['score'] > best['score']:
                    best = sim_score
            else:
                if sim_score['score'] < best['score']:
                    best = sim_score
        return best
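I understand that if the player is the maximizing player, you start with a score of negative infinity and look for a better (higher) score. Otherwise, you start with a score of positive infinity and look for a lower score. One player tries to minimize the score and the other tries to maximize it. But after countless hours of research, I still don't know why -math.inf and math.inf are used in this minimax algorithm if these starting values are later replaced with the highest and lowest scores.
If you could explain it for dummies (as simply as possible), you would be doing me a big favor, because I am a beginner :)
PS: I am referring to this code snippet: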
if player == max_player:
    best = {'position': None, 'score': -math.inf}
else:
    best = {'position': None, 'score': math.inf}
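A:
0 votes
tdelaney
2/3/2023
#1
Is there a better starting value? Suppose you started with no score at all, or set it to None. Then you would need a special case in every comparison. Using inf ensures the algorithm always works, even on the first step.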
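To make that special case concrete, here is a minimal sketch. It is not the code from the question: `best_with_inf` and `best_with_none` are hypothetical helper names, and a plain list of numbers stands in for the scores that minimax would return.

```python
import math


def best_with_inf(scores):
    """Track the highest score, starting from -inf."""
    best = -math.inf
    for score in scores:
        if score > best:  # any finite score beats -inf, so no guard is needed
            best = score
    return best


def best_with_none(scores):
    """Track the highest score, starting from None."""
    best = None
    for score in scores:
        if best is None or score > best:  # extra guard on every comparison
            best = score
    return best


scores = [-3, 0, 2]
print(best_with_inf(scores))   # 2
print(best_with_none(scores))  # 2
```

Both versions give the same answer, but the second needs the `best is None` check; without it, Python 3 raises a TypeError when comparing a number with None.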
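0 votes
user1196549
2/3/2023
#2
There can be two reasons:
1. You must initialize with a value that is not larger than the maximum, which you do not know yet;
2. Afterwards, you can detect the case where no element was processed (the value is still -inf).
This construct is usually preferable to the alternative of setting the maximum to the value of the first element, because that lengthens the code (and, to a lesser extent, makes point 2 impossible).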
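The two points and the alternative can be sketched as follows. This is only an illustration under assumptions: `max_with_sentinel` and `max_from_first` are hypothetical names, and the values are plain numbers rather than the score dictionaries from the question.

```python
import math


def max_with_sentinel(values):
    """Start from -inf, which can never exceed the true maximum."""
    best = -math.inf
    for value in values:
        if value > best:
            best = value
    return best


def max_from_first(values):
    """Alternative: start from the first element, which takes more code."""
    iterator = iter(values)
    try:
        best = next(iterator)  # the first element needs separate handling
    except StopIteration:
        raise ValueError("no values to compare") from None
    for value in iterator:
        if value > best:
            best = value
    return best


print(max_with_sentinel([4, 7, 1]))  # 7
print(max_from_first([4, 7, 1]))     # 7

# Point 2: a result that is still -inf means no element was processed.
if max_with_sentinel([]) == -math.inf:
    print("no element was processed")
```

In the question's minimax, the same role is played by `best['score']` still being infinite (and `best['position']` still being None) if the loop over available moves never ran.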