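Stone Games
Game DP: the key is how to represent the DP state. The recurrence usually has the current player leave the opponent with the smallest possible score, which indirectly maximizes the player's own score. A typical framework, using stone game 3 as the example: dp[i] = max(sum[i] - dp[i+k]), where dp[i] is the most stones the player to move can collect when facing piles [i, n-1], sum[i] is the sum over the range [i, n-1], and k is the current choice: take 1, 2, or 3 piles.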
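To make the framework concrete, here is a minimal bottom-up sketch of dp[i] = max(sum[i] - dp[i+k]) under the stone game 3 rules (take 1, 2, or 3 piles from the front). The function name `bestScoreFromFront` is hypothetical and only used for illustration; it returns the first player's best total rather than the winner's name.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical helper (not part of the original notes).
// Returns the most the player to move can collect from piles[0..n-1]
// when each turn takes 1, 2, or 3 piles from the front.
int bestScoreFromFront(const std::vector<int>& piles) {
    const int n = static_cast<int>(piles.size());
    std::vector<int> sum(n + 1, 0);  // sum[i] = piles[i] + ... + piles[n-1], sum[n] = 0
    std::vector<int> dp(n + 1, 0);   // dp[n] = 0: nothing left to take
    for (int i = n - 1; i >= 0; --i) {
        sum[i] = sum[i + 1] + piles[i];
        int best = std::numeric_limits<int>::min();
        // k = number of piles taken this turn; the opponent then faces [i+k, n-1].
        for (int k = 1; k <= 3 && i + k <= n; ++k) {
            best = std::max(best, sum[i] - dp[i + k]);
        }
        dp[i] = best;
    }
    return dp[0];
}
```

For piles {1, 2, 3, 7} the total is 13 and the function returns 6, so the second player ends with 7 and wins.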
Example 1. Take the first or the last pile each turn
[stone game1](https://leetcode-cn.com/problems/stone-game)
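```cpp
#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    // Two strategies: take the first pile or the last one, whichever earns
    // more for yourself (that is, leaves less for the opponent).
    //   dp[i][j] = sum(i..j) - min(dp[i+1][j], dp[i][j-1])
    // An alternative that unrolls one more turn:
    //   dp[i][j] = max(a[i] + min(dp[i+2][j], dp[i+1][j-1]),
    //                  a[j] + min(dp[i+1][j-1], dp[i][j-2]))
    // The key insight is describing the state with dp[i][j].
    bool stoneGame(vector<int>& scores) {
        int N = scores.size();
        // dp[i][j] == 0 means "not computed yet"; valid because every pile is positive.
        vector<vector<int>> dp(N, vector<int>(N));
        vector<int> sum(N + 1);  // sum[i] = scores[i] + ... + scores[N-1], sum[N] = 0
        for (int i = N - 1; i >= 0; --i) {
            sum[i] = sum[i + 1] + scores[i];
            dp[i][i] = scores[i];
        }
        take(0, N - 1, sum, dp);
        return dp[0][N - 1] > sum[0] - dp[0][N - 1];
    }

    int take(int i, int j, const vector<int>& sum, vector<vector<int>>& dp) {
        if (i > j) return 0;  // empty range: nothing to take
        if (!dp[i][j]) {
            // Compare the two strategies.
            int take_first = take(i + 1, j, sum, dp);  // opponent's best on [i+1, j]
            int take_last = take(i, j - 1, sum, dp);   // opponent's best on [i, j-1]
            int total = sum[i] - sum[j + 1];           // stones in [i, j]
            dp[i][j] = max(total - take_first, total - take_last);
        }
        return dp[i][j];
    }
};
```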
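The same recurrence, dp[i][j] = sum(i..j) - min(dp[i+1][j], dp[i][j-1]), can also be filled bottom-up by interval length. The sketch below is one possible way to do that; `bestScoreFromEnds` is a hypothetical name, and it returns the first player's best total instead of a bool.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of the original notes).
// dp[i][j] = most stones the player to move can collect from piles[i..j]
// when each turn removes either the first or the last pile.
int bestScoreFromEnds(const std::vector<int>& piles) {
    const int n = static_cast<int>(piles.size());
    if (n == 0) return 0;
    std::vector<int> prefix(n + 1, 0);  // prefix[i] = piles[0] + ... + piles[i-1]
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + piles[i];

    std::vector<std::vector<int>> dp(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) dp[i][i] = piles[i];  // one pile: take it

    // Shorter intervals are always filled before the longer ones that use them.
    for (int len = 2; len <= n; ++len) {
        for (int i = 0; i + len - 1 < n; ++i) {
            const int j = i + len - 1;
            const int total = prefix[j + 1] - prefix[i];
            dp[i][j] = total - std::min(dp[i + 1][j], dp[i][j - 1]);
        }
    }
    return dp[0][n - 1];
}
```

For piles {5, 3, 4, 5} the total is 17 and the function returns 9, so the first player wins 9 to 8.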
Example 2. Take the first X piles, with 1 <= X <= 2M, then M = max(M, X)
[stone game2](https://leetcode-cn.com/problems/stone-game-ii/)
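M starts at 1. Facing piles [0, n-1], A may take the first X piles, with 1 <= X <= 2M. After A's move, M = max(X, M), and B faces piles [X, n-1]; M has changed because of A's choice.
So it is best to **add one more dimension** for the current player's limit M, which makes the reasoning clearer.
A less attractive approach is to imitate the first example and nest two turns inside one step: the number of options grows, which makes the reasoning and analysis harder. Adding one dimension gives a clearer representation and a more direct recurrence.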
```cpp
#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

class Solution {
    int N = 0;
public:
    // Nested two-turn approach with a 1-D memo. Hard to debug, and not reliable:
    // the memo is keyed on i alone although the result also depends on M.
    // Kept only as the counter-example discussed above.
    int stoneGameII0(vector<int>& piles) {
        N = piles.size();
        vector<int> dp(N);
        return solve(piles, dp, 0, 1);
    }
    // For A to win, B must lose, so A takes the stones that minimize the
    // opponent's gain, which indirectly maximizes A's own gain.
    int solve(vector<int>& a, vector<int>& dp, int i, int M) {
        if (i >= N) return 0;
        // A can take all remaining stones.
        if (i + 2 * M >= N) {
            int sum = 0;
            for (int j = i; j < N; ++j) {
                sum += a[j];
            }
            dp[i] = sum;
            return sum;
        }
        if (dp[i]) return dp[i];
        int ms = 0;
        for (int x = 1; x <= 2 * M; ++x) {
            int s = 0;  // A takes x piles
            int j = i;
            for (; j < x + i && j < N; ++j) {
                s += a[j];
            }
            // B takes 1 .. 2*max(x, M) piles to minimize what A gets next round.
            int b = INT_MAX;
            for (int y = 1; y <= 2 * max(x, M); ++y) {
                b = min(b, solve(a, dp, j + y, max(y, max(x, M))));
            }
            s += b;
            ms = max(ms, s);
        }
        dp[i] = ms;
        return ms;
    }

    // Easier to understand: one turn per step.
    //   dp[i][M] = SUM[i] - min over k of dp[i+k][max(k, M)]
    // Two dimensions are required because the state has two variables, i and M;
    // a 1-D table overwrites entries and produces wrong answers.
    int stoneGameII(vector<int>& piles) {
        N = piles.size();
        vector<int> sum(N);  // sum[i] = piles[i] + ... + piles[N-1]
        sum[N - 1] = piles[N - 1];
        for (int i = N - 2; i >= 0; --i) {
            sum[i] = sum[i + 1] + piles[i];
        }
        // M stays below N whenever the recursion continues, so N + 1 columns suffice.
        vector<vector<int>> dp(N, vector<int>(N + 1));
        return game(sum, dp, 0, 1);
    }

    int game(vector<int>& s, vector<vector<int>>& dp, int i, int M) {
        if (i >= N) return 0;
        if (i + 2 * M >= N) return s[i];  // take everything that is left
        if (dp[i][M]) return dp[i][M];    // 0 means "not computed"; piles are positive
        int ms = 0;
        for (int x = 1; x <= 2 * M; ++x) {
            ms = max(ms, s[i] - game(s, dp, i + x, max(x, M)));
        }
        dp[i][M] = ms;
        return dp[i][M];
    }
};
```
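As a cross-check on the two-dimensional state, the recurrence dp[i][M] = SUM[i] - min(dp[i+k][max(k, M)]) can be tabulated bottom-up as well. This is a sketch under the same rules; `bestScoreWithLimit` is a hypothetical name, and the table is sized by n rather than by the problem's fixed limits.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of the original notes).
// dp[i][m] = most stones the player to move can collect from piles[i..n-1]
// when the current limit is M = m (the player may take 1..2m piles).
int bestScoreWithLimit(const std::vector<int>& piles) {
    const int n = static_cast<int>(piles.size());
    if (n == 0) return 0;
    std::vector<int> suffix(n + 1, 0);  // suffix[i] = piles[i] + ... + piles[n-1]
    for (int i = n - 1; i >= 0; --i) suffix[i] = suffix[i + 1] + piles[i];

    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(n + 1, 0));
    for (int i = n - 1; i >= 0; --i) {
        for (int m = 1; m <= n; ++m) {
            if (i + 2 * m >= n) {  // the mover can take everything that is left
                dp[i][m] = suffix[i];
                continue;
            }
            int best = 0;
            // Here x <= 2m < n - i, so i + x < n and max(x, m) < n stay in range.
            for (int x = 1; x <= 2 * m; ++x) {
                best = std::max(best, suffix[i] - dp[i + x][std::max(x, m)]);
            }
            dp[i][m] = best;
        }
    }
    return dp[0][1];  // the game starts at pile 0 with M = 1
}
```

For piles {2, 7, 9, 4, 4} this returns 10, and for {1, 2, 3, 4, 5, 100} it returns 104.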
Example 3: each turn, take 1, 2, or 3 piles from the front; whoever ends with the higher score wins
[stone game3](https://leetcode-cn.com/problems/stone-game-iii/)
```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> memo;
    vector<int> sum;
    int INF = 1e9;
    // Maximizing your own score means minimizing the opponent's (optimal play).
    // Top-down + memo.
    string stoneGameIII(vector<int>& a) {
        // Suffix sums: sum[i] = a[i] + ... + a[N-1].
        int N = a.size();
        sum.resize(N);
        memo.resize(N);
        for (int i = N - 1; i >= 0; --i) {
            sum[i] = a[i] + (i == N - 1 ? 0 : sum[i + 1]);
            memo[i] = -INF;  // sentinel for "not computed"; real scores may be negative
        }
        int alice = dfs(0), bob = sum[0] - alice;
        if (alice > bob) {
            return "Alice";
        } else if (alice < bob) {
            return "Bob";
        } else {
            return "Tie";
        }
    }

    // Best total the player to move can collect from piles [i, N-1].
    int dfs(int i) {
        if (i >= static_cast<int>(memo.size())) return 0;
        if (memo[i] != -INF) return memo[i];
        int res = -INF;
        for (int k = 0; k < 3; ++k) {  // take k + 1 piles
            res = max(res, sum[i] - dfs(i + k + 1));
        }
        memo[i] = res;
        return res;
    }
};
```
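Summary of game DP
1. Study the available choices and decide how to represent the dp state, including whether it needs more than one dimension.
2. Build the recurrence on the principle that minimizing the opponent's gain maximizes your own.
3. The simplest and most direct method is top-down recursion + memo.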
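The principle in step 2 can be written out for the front-taking games. This is a short derivation using the notation from the framework (sum[i] is the suffix sum over [i, n-1], and sum[n] = dp[n] = 0 is an assumed boundary convention): whatever the mover takes now plus whatever the opponent fails to collect later adds up to sum[i] - dp[i+k].

```latex
\begin{aligned}
dp[i] &= \max_{k}\Bigl(\underbrace{\mathrm{sum}[i] - \mathrm{sum}[i+k]}_{\text{taken now}}
         + \underbrace{\mathrm{sum}[i+k] - dp[i+k]}_{\text{collected later}}\Bigr) \\
      &= \max_{k}\bigl(\mathrm{sum}[i] - dp[i+k]\bigr) \\
      &= \mathrm{sum}[i] - \min_{k}\, dp[i+k]
\end{aligned}
```

The last line is why the max over your own score and the min over the opponent's score are interchangeable in these recurrences.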
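"Writing the state transition equation is simple: first find all the 'states' and the 'choices' available in each state, then pick the best one. When learning algorithms, focus on the template and framework rather than on clever-looking tricks, and do not expect to write the optimal solution on the first try. Do not be stingy with space, do not try to optimize too early, and do not be afraid of multi-dimensional arrays."
> quoted from: [A general approach to game problems](https://leetcode-cn.com/problems/stone-game/solution/jie-jue-bo-yi-wen-ti-de-dong-tai-gui-hua-tong-yong/)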
Premature optimization is the root of all evil. --Donald Knuth