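Why are -math.inf and math.inf used in this minimax algorithm if these starting values are later replaced by the highest and lowest scores?

Q:

As you can see, this is the code for an unbeatable AI TicTacToe game (game.py is the main file):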

player.py:

import math
import random


class Player():
    def __init__(self, letter):
        self.letter = letter

    def get_move(self, game):
        pass


class HumanPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        valid_square = False
        val = None
        while not valid_square:
            square = input(self.letter + '\'s turn. Input move (0-8): ')
            try:
                val = int(square)
                if val not in game.available_moves():
                    raise ValueError
                valid_square = True
            except ValueError:
                print('Invalid square. Try again.')
        return val


class RandomComputerPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        square = random.choice(game.available_moves())
        return square


class SmartComputerPlayer(Player):
    def __init__(self, letter):
        super().__init__(letter)

    def get_move(self, game):
        if len(game.available_moves()) == 9:
            square = random.choice(game.available_moves())
        else:
            square = self.minimax(game, self.letter)['position']
        return square

    def minimax(self, state, player):
        max_player = self.letter  # yourself
        other_player = 'O' if player == 'X' else 'X'

        # first we want to check if the previous move is a winner
        if state.current_winner == other_player:
            # a win is worth more the earlier it happens (more empty squares left)
            sign = 1 if other_player == max_player else -1
            return {'position': None, 'score': sign * (state.num_empty_squares() + 1)}
        elif not state.empty_squares():
            return {'position': None, 'score': 0}

        if player == max_player:
            best = {'position': None, 'score': -math.inf}  # each score should maximize
        else:
            best = {'position': None, 'score': math.inf}  # each score should minimize
        for possible_move in state.available_moves():
            state.make_move(possible_move, player)
            sim_score = self.minimax(state, other_player)  # simulate a game after making that move

            # undo move
            state.board[possible_move] = ' '
            state.current_winner = None
            sim_score['position'] = possible_move  # this represents the optimal next move

            if player == max_player:  # the AI (self.letter) is the max player
                if sim_score['score'] > best['score']:
                    best = sim_score
            else:
                if sim_score['score'] < best['score']:
                    best = sim_score
        return best

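I understand that if the player is the maximizing player, you start with a score of negative infinity and look for a better score. Otherwise, you start with a score of positive infinity and look for the worst score. One player tries to minimize the score and the other tries to maximize it. But after countless hours of research, I still don't understand why -math.inf and math.inf are used in this minimax algorithm if these starting values are later replaced by the highest and lowest scores.

If you could explain it for dummies (as simply as possible), you would be doing me a big favor, since I am a beginner :)

PS: I am referring to this code snippet: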

if player == max_player:
    best = {'position': None, 'score': -math.inf}
else:
    best = {'position': None, 'score': math.inf}

python algorithm minimax infinity

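You need to initialize with a default value. At this point you don't know what the highest or the lowest score is, because no calculation has been done yet. So we just use +inf or -inf as a stand-in. The only thing that matters is that, for the player, any actual value from a real game sequence is better than the default (because the default does not correspond to an actual game sequence).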

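2 upvotes Karl Knechtel 2/3/2023

Welcome to Stack Overflow. Hint: where the code says if sim_score['score'] > best['score']:, what do you think would happen if best['score'] had not been defined yet? Now consider: the first time the comparison is made, the result should always be to update best['score'] with sim_score['score'], right? So, can you think of a value for best['score'] that would cause the comparison to always result in True? Do you see how this relates to the question you are asking?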
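
The hint can be checked with a minimal sketch. It is not part of the question's code: the list candidate_scores is made up for illustration and stands in for the scores that the recursive minimax calls might return.

```python
import math

# Hypothetical scores for three candidate moves (illustration only).
candidate_scores = [-3, 0, 2]

best_for_max = -math.inf  # placeholder used by the maximizing player
best_for_min = math.inf   # placeholder used by the minimizing player

for score in candidate_scores:
    # Any finite score is greater than -inf, so the first pass always updates.
    if score > best_for_max:
        best_for_max = score
    # Any finite score is less than inf, so the first pass always updates.
    if score < best_for_min:
        best_for_min = score

print(best_for_max, best_for_min)
```

After the loop neither variable still holds an infinite value, which is why the placeholders never leak into the result as long as at least one move is examined.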

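0 upvotes Karl Knechtel 2/3/2023

Hint: if you want to know why something is the way it is, try changing it and see what happens.

0 upvotes Karl Knechtel 2/3/2023

Hint: first of all, how do you know there is such a thing as the minimax algorithm? I assume it is because you read about it on a web page, in a textbook, in a teacher's notes, and so on, right? So, did you try reading further in that source to see whether it explains the question you are asking?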

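0 upvotes Stef 2/3/2023

Note that it is possible to write this code without a default value and without +inf and -inf. But it requires a bit of care; for instance, instead of if sim_score['score'] > best['score']: you would have to write something like if there is no value for best['score'] yet, or if sim_score['score'] > best['score']: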
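
One way this comment could be read is sketched below. The helper pick_best is hypothetical and is not from the question; it uses None to mean "no score seen yet" and checks for it before every comparison.

```python
def pick_best(scores, maximizing):
    """Return the best score without an infinite starting value.

    Hypothetical helper for illustration; returns None if scores is empty.
    """
    best = None  # None means "no score seen yet"
    for score in scores:
        if best is None:
            best = score  # the first score is accepted unconditionally
        elif maximizing and score > best:
            best = score
        elif not maximizing and score < best:
            best = score
    return best


print(pick_best([-3, 0, 2], maximizing=True))
print(pick_best([-3, 0, 2], maximizing=False))
```

The extra is None branch is the "bit of care" the comment mentions; starting from -math.inf or math.inf makes that branch unnecessary.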

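A:

0 upvotes tdelaney 2/3/2023 #1

Is there a better starting value? Suppose you started without any score, or set it to None. Then you would need a special case in every comparison. Using inf lets the algorithm always work, even on the first step.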
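
A small sketch of why None needs special handling, assuming Python 3, where ordering comparisons between an int and None raise TypeError. The function first_comparison is a hypothetical name introduced only for this illustration.

```python
import math


def first_comparison(start_value, score=5):
    """Run the maximizer's first comparison against a given starting value."""
    try:
        return score > start_value
    except TypeError:
        # Python 3 does not define ">" between int and None.
        return "TypeError"


print(first_comparison(None))       # the comparison itself fails
print(first_comparison(-math.inf))  # the real score wins the comparison
```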

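0 upvotes user1196549 2/3/2023 #2

There are probably two reasons:

1. You must initialize with a value that is not larger than the maximum, which you do not know yet;
2. Afterwards, you can detect the case where no element was processed (the value is still -inf).

This construction is usually preferable to the alternative of initializing the maximum with the value of the first element, since that alternative lengthens the code (and, to a lesser extent, makes point 2 impossible).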
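
The two approaches compared in this answer might look like the following sketch. Both function names are hypothetical and the score lists are made up; neither function comes from the question's code.

```python
import math


def max_score(scores):
    """Running maximum seeded with -inf; works for an empty list too."""
    best = -math.inf
    for score in scores:
        if score > best:
            best = score
    return best


def max_score_first_element(scores):
    """Alternative seeded with the first element; needs an explicit empty check."""
    if not scores:
        raise ValueError("no scores to compare")
    best = scores[0]
    for score in scores[1:]:
        if score > best:
            best = score
    return best


result = max_score([])
if result == -math.inf:
    print("no element was processed")  # the empty case is detectable afterwards

print(max_score([1, 4, 2]), max_score_first_element([1, 4, 2]))
```

The -inf version is shorter and lets the caller recognize the empty case from the return value, whereas the first-element version has to handle the empty list before the loop starts.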
