How to Implement a Go Game

2023-01-16

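If you set out to build a Go game, where would you start? This article walks through the Python code from Chapter 3 of Deep Learning and the Game of Go (《深度学习与围棋》) and lays out its design, mainly to deepen my own understanding of the authors' code.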

Grasping the main flow of a Go game

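First you need to understand Go. Ideally you play it yourself, but you do not have to be strong; a beginner's level is enough. Broadly, Go is played by two players who take turns placing stones on the intersections of a 19x19 grid (9x9 and 13x13 boards also work), and the player who surrounds the most territory wins. We can try to express the course of a game in pseudocode.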

# Create a new game object with a board of board_size * board_size
game = new_game(board_size)
# Create the two players (each can be a human or a bot); one plays white, the other black
players = {
  black: new_player(),
  white: new_player(),
}
while not game.is_over():
  # Print the board
  print_board(game.board)
  # The player picks a move; game.next_player is the color to move next, either black or white
  move = players[game.next_player].select_move(game)
  # Print the move: the player's color and where the stone was played
  print_move(game.next_player, move)
  # Play the move on the board
  game = game.apply_move(move)

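This pseudocode is very rough: it only restates the course of a game in code form and implements nothing. Still, it is enough to identify the key objects and methods. new_game initializes the game object, which clearly has to store the board, the current and past game states, and which side moves next. The game object uses is_over to decide whether the game has ended and apply_move to carry out a move; because a move always changes the game state, apply_move processes it and returns a game object with the updated state. move represents a turn's action, and a turn in Go has three options: play a stone, pass, or resign. print_board and print_move are helpers that refresh the board display and report each move.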

Defining the basic data types

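Computer science is built on abstraction. An operating system abstracts the hardware: it hides the complexity and variety of devices so that users can operate all kinds of hardware in a uniform way. Abstraction reduces complexity, and computing as we know it rests on layer upon layer of it, so we never have to start from the lowest level. Our next step is also to build abstractions. A Go game deals with stones and a board, so we need abstractions for the stone color, the board, the intersections on the board, and the action taken in a turn.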

Player is the abstraction for stone color. It is simply an enum that gives us a readable way to name the two sides: Player.white is white, Player.black is black, and Player.black.other is white.

import enum

class Player(enum.Enum):
    black = 1
    white = 2

    @property
    def other(self):
        return Player.black if self == Player.white else Player.white
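
As a quick check of the enum, the sketch below repeats the Player definition so that it runs on its own and shows how other flips the color. The loop and the order list are only there for illustration; they are not part of the book's code.

```python
import enum

class Player(enum.Enum):
    black = 1
    white = 2

    @property
    def other(self):
        return Player.black if self == Player.white else Player.white

# Black moves first; taking .other after each turn alternates the colors.
turn = Player.black
order = []
for _ in range(4):
    order.append(turn.name)
    turn = turn.other

print(order)  # ['black', 'white', 'black', 'white']
```

This alternation is what the game loop relies on when it hands the turn to the other side after every move.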

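Point is the abstraction for a board position. Any intersection is uniquely identified by its row and column; for example, p = Point(2, 3) is the point C2. Conveniently, the values are available as p.row and p.col, and a neighbors method returns the four points adjacent to it.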

from collections import namedtuple

# A Point identifies where a stone is played, e.g. Point(2, 3) is row 2, column 3
class Point(namedtuple('Point', 'row col')):
    def neighbors(self):
        return [
            Point(self.row - 1, self.col),
            Point(self.row + 1, self.col),
            Point(self.row, self.col - 1),
            Point(self.row, self.col + 1),
        ]
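
One detail worth seeing in action: neighbors does no bounds checking, so a point on the edge still yields four candidates and the caller has to discard the ones that fall off the board. The sketch below repeats the Point definition so it is runnable by itself; BOARD_SIZE and on_board are names made up for this example.

```python
from collections import namedtuple

class Point(namedtuple('Point', 'row col')):
    def neighbors(self):
        return [
            Point(self.row - 1, self.col),
            Point(self.row + 1, self.col),
            Point(self.row, self.col - 1),
            Point(self.row, self.col + 1),
        ]

BOARD_SIZE = 9  # assumed board size for this example

def on_board(p):
    # Rows and columns are numbered from 1
    return 1 <= p.row <= BOARD_SIZE and 1 <= p.col <= BOARD_SIZE

corner = Point(1, 1)
print(len(corner.neighbors()))                        # 4
print([n for n in corner.neighbors() if on_board(n)])
# [Point(row=2, col=1), Point(row=1, col=2)]

# A namedtuple is hashable and compares by value, so points work as dict keys
grid = {Point(2, 3): 'black'}
print(grid[Point(row=2, col=3)])                      # black
```

The last two lines matter later, because the board stores its contents in a dictionary keyed by Point.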

Move is the abstraction for a turn's action. Each turn offers three choices: play a stone, pass, or resign.

# A Move is one turn's action: play a stone, pass, or resign
class Move():
    def __init__(self, point=None, is_pass=False, is_resign=False):
        # ^ is XOR; the assert is meant to ensure that exactly one of the three
        # is set (strictly, it passes whenever an odd number of them are true)
        assert (point is not None) ^ is_pass ^ is_resign
        self.point = point
        self.is_play = (self.point is not None)
        self.is_pass = is_pass
        self.is_resign = is_resign

    @classmethod
    def play(cls, point):
        # Play a stone on the board
        return Move(point=point)

    @classmethod
    def pass_turn(cls):
        # Pass the turn
        return Move(is_pass=True)

    @classmethod
    def resign(cls):
        # Resign the game
        return Move(is_resign=True)
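
A note on the assert in the constructor: chaining ^ over three booleans is true whenever an odd number of them are true, so it accepts the intended cases (exactly one set) but also the case where all three are set. In practice the three factory methods never produce that combination. The sketch below enumerates all eight combinations; xor_check mirrors the assert, while exactly_one is a hypothetical stricter alternative, not something from the book.

```python
from itertools import product

def xor_check(has_point, is_pass, is_resign):
    # Same expression as the assert in Move.__init__
    return has_point ^ is_pass ^ is_resign

def exactly_one(has_point, is_pass, is_resign):
    # Hypothetical stricter check: count how many flags are set
    return sum([has_point, is_pass, is_resign]) == 1

# Collect the flag combinations on which the two checks disagree
mismatches = [flags for flags in product([False, True], repeat=3)
              if xor_check(*flags) != exactly_one(*flags)]
print(mismatches)  # [(True, True, True)]
```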

Implementing the Board and GameState classes

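Next we use these basic types to implement the Board and GameState classes. Board represents the board and handles the logic of placing and capturing stones. GameState represents the state of the game: it stores the board, the player to move next, the previous game state, and the last move. Put another way, a Board object is a snapshot of the current position, while a GameState, through its chain of previous states, reaches the board snapshot of every earlier turn. GameState therefore captures the whole course of the game and could be used to replay it.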

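Let's start with Board. Before writing it, we need to decide how to track the stones on the board. Each point has only three possible states: empty, black, or white. The simplest approach is to store each point's stone independently, but that does not feel right, because what matters most in Go is liberties, and whether stones live or die (for example, a group with two eyes is alive) is decided for connected stones together, not stone by stone. A better approach is to track strings rather than individual stones. A string is a connected group of same-colored stones, so we create a GoString class that tracks the stones in a string and the string's liberties, and provides methods to add and remove liberties. With this approach, looking up a point on the Board yields the string that the point belongs to.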

# A string is a group of connected stones of the same color; it tracks its own liberties
class GoString():
    def __init__(self, color, stones, liberties):
        self.color = color
        self.stones = set(stones)
        self.liberties = set(liberties)

    # Remove a liberty: when the opponent plays at point (adjacent to this string),
    # that point stops being a liberty of the string
    def remove_liberty(self, point):
        self.liberties.remove(point)

    # Add a liberty: when an adjacent opposing stone at point is captured,
    # that point becomes a liberty of the string again
    def add_liberty(self, point):
        self.liberties.add(point)

    # Merge two strings
    def merge_with(self, go_string):
        # Returns a new string containing all the stones of both strings
        # The two strings must be the same color
        assert go_string.color == self.color
        # For sets, | is the union
        combined_stones = self.stones | go_string.stones
        return GoString(
            self.color,
            combined_stones,
            (self.liberties | go_string.liberties) - combined_stones)

    # Number of liberties of this string. With @property, callers write
    # go_string.num_liberties without parentheses, so it reads like an attribute
    @property
    def num_liberties(self):
        return len(self.liberties)

    # Two strings are equal if color, stones and liberties all match
    def __eq__(self, other):
        return isinstance(other, GoString) and \
            self.color == other.color and \
            self.stones == other.stones and \
            self.liberties == other.liberties
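
To make the liberty bookkeeping in merge_with concrete, here is a small worked example. It is a sketch under simplifying assumptions: it uses a trimmed-down copy of GoString, plain (row, col) tuples instead of Point, and the string 'black' instead of Player.black. Two separate black stones sit at (3, 2) and (3, 4) on an otherwise empty board, and black then plays at (3, 3) to connect them.

```python
class GoString:
    # Trimmed-down copy of the class above, enough to demonstrate merging
    def __init__(self, color, stones, liberties):
        self.color = color
        self.stones = set(stones)
        self.liberties = set(liberties)

    def merge_with(self, other):
        assert other.color == self.color
        combined = self.stones | other.stones
        return GoString(self.color, combined,
                        (self.liberties | other.liberties) - combined)

# The two existing stones; each counts the empty point (3, 3) as a liberty
left = GoString('black', [(3, 2)], [(2, 2), (4, 2), (3, 1), (3, 3)])
right = GoString('black', [(3, 4)], [(2, 4), (4, 4), (3, 5), (3, 3)])

# The new stone at (3, 3): only the points above and below it are empty
new = GoString('black', [(3, 3)], [(2, 3), (4, 3)])

merged = new.merge_with(left).merge_with(right)
print(sorted(merged.stones))          # [(3, 2), (3, 3), (3, 4)]
print(len(merged.liberties))          # 8
print((3, 3) in merged.liberties)     # False
```

Subtracting the combined stones from the union of liberties is what removes (3, 3), which used to be a liberty of both old strings but is now occupied. The result, eight liberties, matches what you would count by hand for three stones in a row in open space.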

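With that in place, Board follows naturally. Its most important method is place_stone, which implements the logic of playing a stone; the rest are helpers. Playing a stone can change both sides' liberties or capture opposing stones, so place_stone has to merge adjacent strings of the same color, reduce the liberties of adjacent opposing strings, and remove any opposing string left with zero liberties, all using the methods GoString provides. The private attribute _grid stores the strings for the whole board, keyed by each point's coordinates Point(row, col), which again shows that a Board is a snapshot of the game at the current turn.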

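# The board; its size is given by the number of rows and columns
class Board():
    def __init__(self, num_rows, num_cols):
        self.num_rows = num_rows
        self.num_cols = num_cols
        # _grid is private and holds the board state: a dict mapping points to strings
        self._grid = {}

    # player plays a stone at point
    def place_stone(self, player, point):
        # point must be on the board
        assert self.is_on_grid(point)
        # point must be empty
        assert self._grid.get(point) is None
        adjacent_same_color = []
        adjacent_opposite_color = []
        liberties = []
        # Look at the points adjacent to the new stone
        for neighbor in point.neighbors():
            if not self.is_on_grid(neighbor):
                # Skip neighbors that are off the board
                continue
            # Get the string this neighbor belongs to
            neighbor_string = self._grid.get(neighbor)
            if neighbor_string is None:
                # No string means the neighbor is empty, so it is a liberty
                # of the new stone
                liberties.append(neighbor)
            elif neighbor_string.color == player:
                # A friendly stone: its string will be connected through point
                if neighbor_string not in adjacent_same_color:
                    adjacent_same_color.append(neighbor_string)
            else:
                # Otherwise the neighbor is an opposing stone
                if neighbor_string not in adjacent_opposite_color:
                    adjacent_opposite_color.append(neighbor_string)
        # Create the string for the new stone; its liberties are the empty neighbors
        new_string = GoString(player, [point], liberties)
        # Three steps after placing the stone
        # 1. Merge any adjacent strings of the same color
        for same_color_string in adjacent_same_color:
            new_string = new_string.merge_with(same_color_string)
        # Update the board
        for new_string_point in new_string.stones:
            self._grid[new_string_point] = new_string
        # 2. Reduce the liberties of adjacent opposing strings
        for other_color_string in adjacent_opposite_color:
            other_color_string.remove_liberty(point)
        # 3. Remove opposing strings that have no liberties left
        for other_color_string in adjacent_opposite_color:
            if other_color_string.num_liberties == 0:
                self._remove_string(other_color_string)

    # Check whether a point is on the board
    def is_on_grid(self, point):
        return 1 <= point.row <= self.num_rows and \
            1 <= point.col <= self.num_cols

    # Return the content of a point: the Player if a stone is there, otherwise None
    def get(self, point):
        # Get the string the point belongs to
        string = self._grid.get(point)
        if string is None:
            return None
        return string.color

    # Return the string at a point, or None if the point is empty
    def get_go_string(self, point):
        string = self._grid.get(point)
        if string is None:
            return None
        return string

    # Remove all the stones of a captured string
    def _remove_string(self, string):
        # Go through every stone in the string
        for point in string.stones:
            # Go through every point adjacent to that stone
            for neighbor in point.neighbors():
                neighbor_string = self._grid.get(neighbor)
                # An empty neighbor has no string to update, so move on
                if neighbor_string is None:
                    continue
                # A neighbor in a different string gains this point as a liberty
                if neighbor_string is not string:
                    neighbor_string.add_liberty(point)
            # Mark the point as empty
            self._grid[point] = None

    # Two boards are equal when they have the same size and the same stones.
    # Without this, the ko check in GameState compares boards by identity and
    # never finds a repeated position.
    def __eq__(self, other):
        if not isinstance(other, Board):
            return False
        # Ignore points emptied by a capture, which are stored as None
        mine = {p: s for p, s in self._grid.items() if s is not None}
        theirs = {p: s for p, s in other._grid.items() if s is not None}
        return self.num_rows == other.num_rows and \
            self.num_cols == other.num_cols and \
            mine == theirs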

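Last comes GameState. Its core job is to keep the board snapshot for each turn, the player to move next, and the last move. GameState also enforces the no-self-capture rule and the ko rule. These rules are not implemented in Board because a Board is a snapshot of a single turn, while both rules need more than one snapshot: checking self-capture means trying the move on a copy of the current board and inspecting the result, and the ko rule means comparing the resulting board with boards from earlier in the game. A single board object cannot do either.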

import copy

# Stores the state of a Go game
# The state consists of: the stones on the board, the player to move next,
# the previous game state, and the last move
class GameState():
    def __init__(self, board, next_player, previous, move) -> None:
        self.board = board
        self.next_player = next_player
        self.previous_state = previous
        self.last_move = move

    # Apply a move and return the resulting new GameState
    def apply_move(self, move):
        if move.is_play:
            next_board = copy.deepcopy(self.board)
            next_board.place_stone(self.next_player, move.point)
        else:
            next_board = self.board
        return GameState(next_board, self.next_player.other, self, move)

    @classmethod
    def new_game(cls, board_size):
        if isinstance(board_size, int):
            board_size = (board_size, board_size)
        board = Board(*board_size)
        return GameState(board, Player.black, None, None)

    # The game is over when:
    # a player resigns, or
    # both players pass in consecutive turns
    def is_over(self):
        if self.last_move is None:
            return False
        if self.last_move.is_resign:
            return True
        second_last_move = self.previous_state.last_move
        if second_last_move is None:
            return False
        return self.last_move.is_pass and second_last_move.is_pass

    # Self-capture rule: a move may not leave the player's own string without liberties
    def is_move_self_capture(self, player, move):
        if not move.is_play:
            return False
        next_board = copy.deepcopy(self.board)
        next_board.place_stone(player, move.point)
        new_string = next_board.get_go_string(move.point)
        return new_string.num_liberties == 0

    @property
    def situation(self):
        return (self.next_player, self.board)

    # Ko rule: a move may not recreate a situation (board plus player to move)
    # that already occurred earlier in the game
    def does_move_violate_ko(self, player, move):
        # Passing or resigning can never violate ko
        if not move.is_play:
            return False
        next_board = copy.deepcopy(self.board)
        next_board.place_stone(player, move.point)
        next_situation = (player.other, next_board)
        past_state = self.previous_state
        # Why walk the whole history instead of looking back only one state?
        # Comparing against every earlier situation forbids any repetition
        # (a superko-style rule), which also covers cycles longer than a
        # simple ko; the price is a slower check.
        while past_state is not None:
            if past_state.situation == next_situation:
                return True
            past_state = past_state.previous_state
        return False

    # Check whether a move is legal
    # 1. The point must be empty
    # 2. The move must not be a self-capture
    # 3. The move must not violate the ko rule
    def is_valid_move(self, move):
        if self.is_over():
            return False
        if move.is_pass or move.is_resign:
            return True
        return self.board.get(move.point) is None and \
            not self.is_move_self_capture(self.next_player, move) and \
            not self.does_move_violate_ko(self.next_player, move)
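
The history walk in does_move_violate_ko can be isolated in a few lines. The sketch below is an illustration under assumptions, not the book's code: State is a hypothetical stand-in for GameState that keeps only a situation and a link to the previous state, boards are reduced to frozensets of (point, color) pairs, and only_last is a made-up comparison that looks back a single state, the way a simple ko check would.

```python
class State:
    # Hypothetical stand-in for GameState: a situation plus a back-link
    def __init__(self, situation, previous=None):
        self.situation = situation
        self.previous_state = previous

def seen_before(state, next_situation):
    # Walk the linked list of earlier states, as does_move_violate_ko does
    past = state.previous_state
    while past is not None:
        if past.situation == next_situation:
            return True
        past = past.previous_state
    return False

def only_last(state, next_situation):
    # Simple-ko style check: compare with the immediately preceding state only
    past = state.previous_state
    return past is not None and past.situation == next_situation

# Four distinct board snapshots, each a frozenset of (point, color) pairs
board_a = frozenset({((1, 1), 'black')})
board_b = frozenset({((1, 1), 'black'), ((2, 2), 'white')})
board_c = frozenset({((3, 3), 'black'), ((2, 2), 'white')})
board_d = frozenset({((3, 3), 'black')})

s0 = State(('black', board_a))
s1 = State(('white', board_b), s0)
s2 = State(('black', board_c), s1)
s3 = State(('white', board_d), s2)

# Suppose white's move from s3 would recreate board_a with black to move
repeat = ('black', board_a)
print(seen_before(s3, repeat))  # True
print(only_last(s3, repeat))    # False
```

The repeated situation lies three states back, so the one-step comparison misses it while the full walk catches it. The positions here are arbitrary sets rather than a real sequence of legal moves; they only demonstrate how the lookup behaves.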

Start playing

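At this point the basic logic of a Go game is in place (helpers such as displaying the board and the moves are simple enough that they are not covered here). What remains is the program that actually plays. Human-versus-bot and bot-versus-bot games are both easy to set up, but the bot here simply picks a legal move at random, so it is very weak. A smarter bot takes far more work, although the basic game logic stays the same. Below is the sample code for bot self-play; human-versus-bot play is similar and can be found in the source code.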

Random-move bot code

import random
from dlgo.agent.base import Agent
from dlgo.agent.helpers import is_point_an_eye
from dlgo.goboard_slow import Move
from dlgo.gotypes import Point

class RandomBot(Agent):
    def select_move(self, game_state):
        candidates = []
        # Scan every point on the board and collect the legal moves in candidates
        for r in range(1, game_state.board.num_rows + 1):
            for c in range(1, game_state.board.num_cols + 1):
                candidate = Point(row=r, col=c)
                if game_state.is_valid_move(Move.play(candidate)) and \
                    not is_point_an_eye(game_state.board, 
                                        candidate, 
                                        game_state.next_player):
                    candidates.append(candidate)
        # Pass if there is no candidate point
        if not candidates:
            return Move.pass_turn()
        # Play at a randomly chosen candidate point
        return Move.play(random.choice(candidates))
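
The bot imports is_point_an_eye but its body is not shown in this article. The reason for the check is that a bot which fills its own eyes will eventually kill its own groups. As a rough idea of what such a test can look like, here is a sketch of a common heuristic. Everything in it is an assumption made for illustration: looks_like_eye is a made-up name, the board is a plain dict mapping (row, col) to a color string, and the code is not claimed to match the dlgo helper.

```python
def looks_like_eye(stones, size, point, color):
    """Heuristic eye test on a dict board {(row, col): color}."""
    def on_board(p):
        return 1 <= p[0] <= size and 1 <= p[1] <= size

    # An eye must be an empty point
    if stones.get(point) is not None:
        return False
    row, col = point
    # Every on-board orthogonal neighbor must be a friendly stone
    for p in [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]:
        if on_board(p) and stones.get(p) != color:
            return False
    # Count the diagonal points: friendly stones and off-board points
    friendly = off_board = 0
    for p in [(row - 1, col - 1), (row - 1, col + 1),
              (row + 1, col - 1), (row + 1, col + 1)]:
        if not on_board(p):
            off_board += 1
        elif stones.get(p) == color:
            friendly += 1
    if off_board > 0:
        # On an edge or in a corner, every on-board diagonal must be friendly
        return off_board + friendly == 4
    # In the middle of the board, three of the four diagonals are enough
    return friendly >= 3

# A 3x3 board filled with black except for the center point
ring = {(r, c): 'black' for r in range(1, 4) for c in range(1, 4)
        if (r, c) != (2, 2)}
print(looks_like_eye(ring, 3, (2, 2), 'black'))  # True
print(looks_like_eye(ring, 3, (2, 2), 'white'))  # False
```

With a filter of this kind, a random bot keeps passing over its own eyes, which is why the final position further down still shows scattered single empty points.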

Bot self-play code

from dlgo.agent import naive
from dlgo import goboard_slow as goboard
from dlgo import gotypes
from dlgo.utils import print_board, print_move
import time

def main():
    board_size = 9
    game = goboard.GameState.new_game(board_size)
    bots = {
        gotypes.Player.black: naive.RandomBot(),
        gotypes.Player.white: naive.RandomBot(),
    }
    while not game.is_over():
        time.sleep(0.5)
        # Clear the screen
        print(chr(27) + "[2J")
        # Print the board
        print_board(game.board)
        # Let the bot pick a random move
        bot_move = bots[game.next_player].select_move(game)
        # Print the move
        print_move(game.next_player, bot_move)
        # Apply the move
        game = game.apply_move(bot_move)

if __name__ == "__main__":
    main()

Final position of a self-play game

 9  x  x  x  x  x  x  x  x  x
 8  x  x  x  x  x  x  x  .  x
 7  x  .  x  x  x  x  .  x  x
 6  x  x  x  .  x  x  x  x  x
 5  x  x  x  x  .  x  x  x  x
 4  x  x  x  .  x  x  .  x  x
 3  x  x  x  x  x  .  x  .  x
 2  x  x  x  .  x  x  .  x  x
 1  x  x  x  x  x  x  x  x  x
    A  B  C  D  E  F  G  H  J
Player.black passes
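
The board display helper was left out of this article, so here is a sketch of how a rendering like the one above could be produced. It is an assumption-laden stand-in: render_board, COLS and SYMBOLS are names chosen for this example, and the board is a plain dict mapping (row, col) to 'black' or 'white' rather than a Board object.

```python
COLS = 'ABCDEFGHJKLMNOPQRST'  # the letter I is skipped, as in the output above
SYMBOLS = {None: ' . ', 'black': ' x ', 'white': ' o '}

def render_board(stones, size):
    """Return the board as a list of text lines, top row first."""
    lines = []
    for row in range(size, 0, -1):
        cells = ''.join(SYMBOLS[stones.get((row, col))]
                        for col in range(1, size + 1))
        # Right-align the row number so one- and two-digit rows line up
        lines.append(f'{row:2d} {cells}')
    # Column letters along the bottom edge
    lines.append('    ' + '  '.join(COLS[:size]))
    return lines

lines = render_board({(1, 1): 'black', (3, 3): 'white'}, 3)
for line in lines:
    print(line)
```

Rows are printed from the highest number down so that row 1 ends up at the bottom, matching the usual orientation of a Go diagram.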

References

Deep Learning and the Game of Go (《深度学习与围棋》), Chapter 3, which implements a first Go bot. Source code: maxpumperla/deep_learning_and_the_game_of_go: Code and other material for the book “Deep Learning and the Game of Go” (github.com)