1 00:00:00,089 --> 00:00:02,770 We will now speak about maxmin
2 00:00:02,970 --> 00:00:05,919 strategies. These make particular sense
3 00:00:06,120 --> 00:00:08,620 in the context of zero-sum games, but
4 00:00:08,820 --> 00:00:12,100 they actually apply quite broadly to all
5 00:00:12,300 --> 00:00:15,580 games. What is the maxmin strategy? It
6 00:00:15,779 --> 00:00:19,239 is simply a player's strategy that
7 00:00:19,439 --> 00:00:21,550 maximizes their payoff assuming the
8 00:00:21,750 --> 00:00:26,970 other player is out to get them. We will
9 00:00:27,170 --> 00:00:29,499 concentrate primarily on the
10 00:00:29,699 --> 00:00:32,829 two-player case here, again because when
11 00:00:33,030 --> 00:00:34,570 we get to zero-sum games, they
12 00:00:34,770 --> 00:00:36,608 really make sense only in the case of two
13 00:00:36,808 --> 00:00:40,299 players. But keep in mind that one could
14 00:00:40,500 --> 00:00:42,729 define this more generally when
15 00:00:42,929 --> 00:00:46,768 we speak about maxmin strategies. So the
16 00:00:46,969 --> 00:00:49,409 maxmin strategy is a strategy that
17 00:00:49,609 --> 00:00:54,070 maximizes my worst-case outcome, and my
18 00:00:54,270 --> 00:00:58,119 maxmin value, or safety level, is the
19 00:00:58,320 --> 00:01:00,338 payoff that's guaranteed by the maxmin
20 00:01:00,539 --> 00:01:03,849 strategy. And here it is defined formally:
21 00:01:04,049 --> 00:01:07,988 the maxmin strategy for player i is the
22 00:01:08,188 --> 00:01:13,259 strategy s1 that maximizes the
23 00:01:13,459 --> 00:01:16,450 minimum that the other player (remember,
24 00:01:16,650 --> 00:01:22,709 -i is the player other than i) would
25 00:01:22,909 --> 00:01:28,778 hold
player 1 down to. And the maxmin value
26 00:01:28,978 --> 00:01:30,640 is defined simply to be the value of
27 00:01:30,840 --> 00:01:36,189 that maxmin strategy. Now, why would
28 00:01:36,390 --> 00:01:37,390 we want to think about the maxmin
29 00:01:37,590 --> 00:01:43,168 strategy? One can think of it either as
30 00:01:43,368 --> 00:01:46,019 simply a sort of caution:
31 00:01:46,219 --> 00:01:49,808 maybe the other people will make some
32 00:01:50,009 --> 00:01:51,429 mistakes and not act in their own best
33 00:01:51,629 --> 00:01:54,969 interest; maybe I'm not sure exactly what
34 00:01:55,170 --> 00:01:56,948 their payoffs are. There are a lot of
35 00:01:57,149 --> 00:01:58,918 interpretations. Or you can simply be
36 00:01:59,118 --> 00:02:04,509 paranoid about them and
37 00:02:04,709 --> 00:02:06,939 think that they are out to get you. And you
38 00:02:07,140 --> 00:02:08,380 know the saying:
39 00:02:08,580 --> 00:02:09,509 even
40 00:02:09,709 --> 00:02:14,009 paranoids have enemies. That's the
41 00:02:14,209 --> 00:02:17,429 maxmin strategy. And just to confuse
42 00:02:17,628 --> 00:02:19,530 things, we'll also speak about the minmax
43 00:02:19,729 --> 00:02:23,250 strategy. The minmax strategy is a
44 00:02:23,449 --> 00:02:26,099 strategy against, if you wish, the other
45 00:02:26,299 --> 00:02:28,920 player in a two-player game. It is the
46 00:02:29,120 --> 00:02:31,500 strategy that minimizes their payoff
47 00:02:31,699 --> 00:02:33,750 on the assumption that they're trying to
48 00:02:33,949 --> 00:02:36,659 maximize it. And so here is the formal
49 00:02:36,859 --> 00:02:39,209 definition: the minmax strategy for
50 00:02:39,408 --> 00:02:40,920 player i
playing against the other
51 00:02:41,120 --> 00:02:43,890 guy, whom we denote by -i, is the
52 00:02:44,090 --> 00:02:47,368 strategy that minimizes the maximum
53 00:02:47,568 --> 00:02:50,189 payoff, as attempted by the other guy, of
54 00:02:50,389 --> 00:02:54,360 the payoff to the other guy. And the
55 00:02:54,560 --> 00:02:56,159 minmax value is simply the value of that
56 00:02:56,359 --> 00:03:01,319 minmax strategy, the value to player 1. Now,
57 00:03:01,519 --> 00:03:08,159 why would player 1 want to
58 00:03:08,359 --> 00:03:12,390 harm the other guy? Well, it
59 00:03:12,590 --> 00:03:15,270 could be that he's just out to get him; that's a
60 00:03:15,469 --> 00:03:19,409 possibility. Or they could be playing a
61 00:03:19,609 --> 00:03:23,118 zero-sum game, and in a zero-sum game
62 00:03:23,318 --> 00:03:26,599 hurting the other guy is tantamount to
63 00:03:26,799 --> 00:03:32,509 improving your own payoff. And so
64 00:03:32,709 --> 00:03:36,629 in the setting of zero-sum games, maxmin
65 00:03:36,829 --> 00:03:38,069 and minmax strategies make a lot of
66 00:03:38,269 --> 00:03:39,840 sense. And in fact, a very famous
67 00:03:40,039 --> 00:03:44,159 theorem due to John von Neumann proved
68 00:03:44,359 --> 00:03:48,058 that in a zero-sum game (by definition we
69 00:03:48,258 --> 00:03:52,439 consider only two-player such games), in any
70 00:03:52,639 --> 00:03:55,200 Nash equilibrium the player receives a
71 00:03:55,400 --> 00:03:57,539 payoff that is equal to both his maxmin
72 00:03:57,739 --> 00:04:04,259 value and his minmax value. And that
73 00:04:04,459 --> 00:04:07,050 means... so, we will call it the value
74 00:04:07,250 --> 00:04:09,330 of the game: the
value for player 1 is
75 00:04:09,530 --> 00:04:11,368 called the value of the game. And that
76 00:04:11,568 --> 00:04:13,709 means that the set of maxmin
77 00:04:13,908 --> 00:04:16,199 strategies is really the same as the set of
78 00:04:16,399 --> 00:04:18,329 minmax strategies. That is, trying to
79 00:04:18,528 --> 00:04:21,120 improve your worst-case situation is the
80 00:04:21,319 --> 00:04:22,079 same as
81 00:04:22,279 --> 00:04:24,389 trying to minimize the other guy's best-case
82 00:04:24,589 --> 00:04:28,560 situation. And any maxmin strategy
83 00:04:28,759 --> 00:04:30,420 profile, or minmax strategy profile
84 00:04:30,620 --> 00:04:33,120 (because they're the same), constitutes a
85 00:04:33,319 --> 00:04:36,030 Nash equilibrium. And furthermore, those
86 00:04:36,230 --> 00:04:37,590 are all the Nash equilibria that exist.
87 00:04:37,790 --> 00:04:39,780 And so the payoff in all Nash
88 00:04:39,980 --> 00:04:41,460 equilibria is the same, namely the value
89 00:04:41,660 --> 00:04:46,980 of the game. One way to get a concrete
90 00:04:47,180 --> 00:04:49,170 feel for it is graphically. Here's
91 00:04:49,370 --> 00:04:51,509 the game of Matching Pennies. This is a
92 00:04:51,709 --> 00:04:56,160 game where each of us chooses heads
93 00:04:56,360 --> 00:05:02,490 or tails with some probability, and if,
94 00:05:02,689 --> 00:05:06,930 as the
95 00:05:07,129 --> 00:05:10,259 result of our randomization, I end up
96 00:05:10,459 --> 00:05:11,850 choosing heads and you end up
97 00:05:12,050 --> 00:05:15,180 playing tails, you win; and vice
98 00:05:15,379 --> 00:05:16,860 versa if I chose tails and you heads.
99 00:05:17,060 -->
00:05:19,650 But if we both chose heads or both chose
100 00:05:19,850 --> 00:05:22,050 tails, I win. And so here are the payoffs
101 00:05:22,250 --> 00:05:25,468 you see here. The strategy spaces: this is
102 00:05:25,668 --> 00:05:29,400 player 2 increasing their
103 00:05:29,600 --> 00:05:31,250 probability of playing heads, and this is
104 00:05:31,449 --> 00:05:34,770 player 1. And on this dimension
105 00:05:34,970 --> 00:05:37,319 you have the value of the game, the payoff
106 00:05:37,519 --> 00:05:41,420 to player 1. And the only Nash equilibrium
107 00:05:41,620 --> 00:05:45,960 is for both to randomize 50-50. It's
108 00:05:46,160 --> 00:05:50,060 right here. It's kind of intriguing,
109 00:05:50,259 --> 00:05:55,650 the three-dimensional structure, in
110 00:05:55,850 --> 00:06:01,560 this way. And you sort of see that it's
111 00:06:01,759 --> 00:06:03,088 got to be an equilibrium in the sense
112 00:06:03,288 --> 00:06:08,278 that player 1 could be moving
113 00:06:08,478 --> 00:06:13,710 along this curve, but as he does
114 00:06:13,910 --> 00:06:18,300 it, his payoff would only drop, and so,
115 00:06:18,500 --> 00:06:20,009 since he's trying to maximize the value, he wouldn't
116 00:06:20,209 --> 00:06:23,850 do it. And conversely, player 2 can only
117 00:06:24,050 --> 00:06:26,460 traverse along this, but if he
118 00:06:26,660 --> 00:06:28,439 does that, the payoff would only
119 00:06:28,639 --> 00:06:31,379 increase, and he's trying to minimize the
120 00:06:31,579 --> 00:06:32,920 value.
121 00:06:33,120 --> 00:06:36,160 So you get a stable point, which
122 00:06:36,360 --> 00:06:38,590 for obvious reasons is called a saddle
123 00:06:38,790 --> 00:06:42,340
point. So although there are
124 00:06:42,540 --> 00:06:44,500 general-purpose procedures for finding a Nash
125 00:06:44,699 --> 00:06:48,490 equilibrium, in particular in
126 00:06:48,689 --> 00:06:53,370 two-by-two games, we can use the maxmin
127 00:06:53,569 --> 00:06:58,090 definition to find it directly in
128 00:06:58,290 --> 00:07:01,800 zero-sum games. Let's see how it
129 00:07:02,000 --> 00:07:05,340 happens in a game we'll call the
130 00:07:05,540 --> 00:07:08,379 penalty kick game. In this game we
131 00:07:08,579 --> 00:07:13,920 have a penalty kicker and a goalie.
132 00:07:14,120 --> 00:07:17,939 It's a zero-sum game, and the goal of the
133 00:07:18,139 --> 00:07:21,189 kicker is to score a goal, and the
134 00:07:21,389 --> 00:07:24,629 goal of the goalie is to prevent it. And
135 00:07:24,829 --> 00:07:26,710 let's assume that they each have two
136 00:07:26,910 --> 00:07:29,410 strategies: kick to the left and kick to
137 00:07:29,610 --> 00:07:31,960 the right for the kicker, and jump to the
138 00:07:32,160 --> 00:07:33,639 left and jump to the right for the
139 00:07:33,839 --> 00:07:38,110 goalie. As for the payoffs, each of
140 00:07:38,310 --> 00:07:41,040 those will determine a probability of
141 00:07:41,240 --> 00:07:45,338 the kicker scoring a goal, and we'll have
142 00:07:45,538 --> 00:07:48,100 that probability be the payoff, the
143 00:07:48,300 --> 00:07:49,949 value of the game, that is, the payoff to
144 00:07:50,149 --> 00:07:54,968 player 1, and therefore one minus that for
145 00:07:55,168 --> 00:07:58,509 player 2, namely the goalie. And so here
146 00:07:58,709 --> 00:08:01,360 they are. So for example, if the kicker
147 00:08:01,560 --> 00:08:03,400
kicks left and the goalie guesses
148 00:08:03,600 --> 00:08:06,069 correctly and jumps left also, then the
149 00:08:06,269 --> 00:08:08,770 goalie has not too bad a chance of
150 00:08:08,970 --> 00:08:13,319 stopping the shot, namely probability 0.4.
151 00:08:13,519 --> 00:08:16,838 If he jumps to the wrong side, his
152 00:08:17,038 --> 00:08:20,259 probability of stopping it is much lower:
153 00:08:20,459 --> 00:08:25,360 it's 0.2. Similarly, if the kicker
154 00:08:25,560 --> 00:08:28,860 decides to kick to the right, if the
155 00:08:29,060 --> 00:08:31,750 goalie guesses wrong,
156 00:08:31,949 --> 00:08:33,939 his probability is low, even lower than
157 00:08:34,139 --> 00:08:36,218 if he guesses wrong in the other case, and
158 00:08:36,418 --> 00:08:38,049 if he guesses right, his probability is
159 00:08:38,250 --> 00:08:40,328 higher, although not quite as high as if
160 00:08:40,528 --> 00:08:44,439 he had guessed right in the left case. So he is
161 00:08:44,639 --> 00:08:45,609 better
162 00:08:45,809 --> 00:08:49,240 at stopping shots when the kicker kicks
163 00:08:49,440 --> 00:08:53,879 to the left. So how does the kicker
164 00:08:54,080 --> 00:08:56,019 maximize his minimum? That's what we're
165 00:08:56,220 --> 00:09:00,578 after. So here is the expression. We
166 00:09:00,778 --> 00:09:04,620 want the maxmin value of the following.
167 00:09:04,820 --> 00:09:08,319 So they each have some mixed strategy of
168 00:09:08,519 --> 00:09:10,719 playing left and right with some
169 00:09:10,919 --> 00:09:13,479 probability. So s1(L) is the probability
170 00:09:13,679 --> 00:09:16,509 that the kicker kicks to the left, and
171 00:09:16,710 --> 00:09:19,740 s2(L) is the probability that the goalie
172
00:09:19,940 --> 00:09:22,659 jumps to the left. And as we saw, in that
173 00:09:22,860 --> 00:09:27,609 case the value is 0.6, and similarly the
174 00:09:27,809 --> 00:09:30,370 value is 0.8 if we end up in
175 00:09:30,570 --> 00:09:34,719 this situation, and 0.9 and 0.7 in
176 00:09:34,919 --> 00:09:39,009 these situations. OK, so this is the
177 00:09:39,210 --> 00:09:42,819 expression that we somehow need to
178 00:09:43,019 --> 00:09:51,159 compute. So what is the minimum that the
179 00:09:51,360 --> 00:09:52,829 kicker should keep in mind?
180 00:09:53,029 --> 00:09:57,399 So the kicker says: I'm
181 00:09:57,600 --> 00:10:01,629 gonna play my strategy s1, whatever it
182 00:10:01,830 --> 00:10:05,379 is. When I play it, player 2 is gonna play
183 00:10:05,580 --> 00:10:11,889 s2 so as to minimize my payoff. So let
184 00:10:12,089 --> 00:10:16,240 me write down this entire expression.
185 00:10:16,440 --> 00:10:19,740 This is simply copying it over,
186 00:10:19,940 --> 00:10:26,740 replacing expressions such as s1(R) by
187 00:10:26,940 --> 00:10:31,839 1 minus s1(L); that's all it's doing, and
188 00:10:32,039 --> 00:10:39,120 the same thing for s2(R). And so we have
189 00:10:39,320 --> 00:10:42,339 our expression. And player 1 says: now,
190 00:10:42,539 --> 00:10:45,609 what would player 2 do if I
191 00:10:45,809 --> 00:10:50,559 were to play my strategy s1? And this is
192 00:10:50,759 --> 00:10:52,779 simply rearranging the terms, nothing
193 00:10:52,980 --> 00:10:56,740 else going on here, as a function of
194 00:10:56,940 --> 00:10:58,479 s2(L), because
195 00:10:58,679 --> 00:11:01,599 player 1 says: I'm gonna
196 00:11:01,799 --> 00:11:04,419 pick s1; what would s2's best
197 00:11:04,620 --> 00:11:08,319 response be, namely its minimum? And so this
198 00:11:08,519 --> 00:11:10,870 is arranging it as a function of s2's
199 00:11:11,070 --> 00:11:14,259 strategy. And now all that remains is to
200 00:11:14,460 --> 00:11:17,258 look for the minimum over that strategy, and
201 00:11:17,458 --> 00:11:21,099 the minimum is found by taking the
202 00:11:21,299 --> 00:11:22,779 first derivative with respect to s2,
203 00:11:22,980 --> 00:11:26,589 holding s1 as a constant. And so we have
204 00:11:26,789 --> 00:11:30,179 this expression, and then we solve
205 00:11:30,379 --> 00:11:33,399 for s1, and we get that s1(L) is
206 00:11:33,600 --> 00:11:38,439 1/2, and therefore s1(R) is a half as
207 00:11:38,639 --> 00:11:43,240 well. And so by this maxmin calculation
208 00:11:43,440 --> 00:11:47,349 we see that the kicker figures
209 00:11:47,549 --> 00:11:48,879 out that in equilibrium
210 00:11:49,080 --> 00:11:51,609 he had better randomize 1/2, 1/2 between
211 00:11:51,809 --> 00:11:53,698 left and right.
212 00:11:53,899 --> 00:11:59,349 What does the goalie
213 00:11:59,549 --> 00:12:02,258 figure out? Well, he is trying to
214 00:12:02,458 --> 00:12:05,078 minimize the kicker's maximum, right?
215 00:12:05,278 --> 00:12:06,399 That's one way to look at it: he's doing
216 00:12:06,600 --> 00:12:10,089 the minmax strategy. And so here is the
217 00:12:10,289 --> 00:12:12,549 minmax strategy for player 2, just
218 00:12:12,750 --> 00:12:15,429 writing it down. And as before, we'll
219 00:12:15,629 --> 00:12:22,959 simply rewrite
220 00:12:23,159 -->
00:12:25,328 s1(R) as 1 minus s1(L), and so on
221 00:12:25,528 --> 00:12:29,889 for s2(R). And we'll rearrange the
222 00:12:30,089 --> 00:12:33,370 terms, this time as a function of
223 00:12:33,570 --> 00:12:38,078 s1(L), because player 2 is saying:
224 00:12:38,278 --> 00:12:40,599 for
225 00:12:40,799 --> 00:12:45,339 whatever I choose, namely s2, player
226 00:12:45,539 --> 00:12:47,109 1 will want to best respond, namely to
227 00:12:47,309 --> 00:12:49,419 maximize. So you write it down as a
228 00:12:49,620 --> 00:12:52,959 function of s1's choice. And now, figuring
229 00:12:53,159 --> 00:12:55,179 out what the maximum would be: well,
230 00:12:55,379 --> 00:12:58,389 here's the maximum, and when you solve
231 00:12:58,589 --> 00:13:02,169 for s2 now, you get that the
232 00:13:02,370 --> 00:13:05,919 randomization in equilibrium for s2,
233 00:13:06,120 --> 00:13:10,319 that is for player 2, is 1/4 and 3/4.
234 00:13:10,519 --> 00:13:13,839 So this illustrates how we can use the
235 00:13:14,039 --> 00:13:18,599 minmax theorem to actually compute the
236 00:13:18,799 --> 00:13:22,149 Nash equilibrium of zero-sum games,
237 00:13:22,350 --> 00:13:25,120 at least in two-by-two games. In general,
238 00:13:25,320 --> 00:13:29,109 we can use the minmax theorem to
239 00:13:29,309 --> 00:13:34,479 compute the equilibria of zero-sum games,
240 00:13:34,679 --> 00:13:36,519 and we do it
241 00:13:36,720 --> 00:13:41,969 by simply laying out a linear program
242 00:13:42,169 --> 00:13:46,229 that captures the game. And here it is. So
243 00:13:46,429 --> 00:13:50,679 U1* is going to be the value of the
244 00:13:50,879 --> 00:13:54,819 game,
that is, the payoff to player 1 in
245 00:13:55,019 --> 00:13:59,139 equilibrium. And so we're gonna specify it
246 00:13:59,340 --> 00:14:01,359 from player 2's point of view; we could
247 00:14:01,559 --> 00:14:02,669 have done it the other way around also.
248 00:14:02,870 --> 00:14:06,539 So what player 2 is saying is simply this:
249 00:14:06,740 --> 00:14:13,870 for each of the actions of player 1,
250 00:14:14,070 --> 00:14:16,959 each action that player 1 might consider,
251 00:14:17,159 --> 00:14:23,049 I want to find a mixed strategy s2. So
252 00:14:23,250 --> 00:14:25,990 here's my mixed strategy s2: it will
253 00:14:26,190 --> 00:14:30,509 look at all my pure strategies k and
254 00:14:30,710 --> 00:14:34,269 make sure that it's a
255 00:14:34,470 --> 00:14:35,799 probability distribution over those, so
256 00:14:36,000 --> 00:14:38,609 they sum to one and they're non-negative.
257 00:14:38,809 --> 00:14:41,379 So what I'd like is that the best
258 00:14:41,580 --> 00:14:46,299 response to my strategy by player 1, for
259 00:14:46,500 --> 00:14:48,189 any of these actions, will never exceed
260 00:14:48,389 --> 00:14:50,289 this value of the game. Because I'm
261 00:14:50,490 --> 00:14:53,069 trying to minimize, I'm going to find
262 00:14:53,269 --> 00:14:58,199 the lowest U that has the property that
263 00:14:58,399 --> 00:15:00,669 player 1 doesn't have a profitable
264 00:15:00,870 --> 00:15:08,679 deviation by any of his
265 00:15:08,879 --> 00:15:12,539 pure strategies. So when I look at the
266 00:15:12,740 --> 00:15:22,149 payoff for player 1 when I play a2^k and
267 00:15:22,350 --> 00:15:23,899 he plays
268 00:15:24,100 --> 00:15:25,849 a1^j, that is, the
j that I'm considering
269 00:15:26,049 --> 00:15:28,248 right now, and I multiply by the
270 00:15:28,448 --> 00:15:30,618 probability, in my mixed strategy, of
271 00:15:30,818 --> 00:15:35,089 playing a2^k, I don't want
272 00:15:35,289 --> 00:15:37,549 the other player, player 1, to have a
273 00:15:37,750 --> 00:15:40,099 profitable deviation. So it's got to be
274 00:15:40,299 --> 00:15:42,618 that his expected payoff will be no
275 00:15:42,818 --> 00:15:47,649 greater than the value U1*. So
276 00:15:47,850 --> 00:15:50,419 clearly this is a correct formulation of
277 00:15:50,620 --> 00:15:52,938 the game, and it is a linear program. As
278 00:15:53,139 --> 00:15:56,919 we know, linear programs are efficiently
279 00:15:57,120 --> 00:15:59,118 solvable:
280 00:15:59,318 --> 00:16:05,479 in theory, by interior-point methods that are
281 00:16:05,679 --> 00:16:08,828 provably polynomial; in practice, by
282 00:16:09,028 --> 00:16:13,339 procedures that are worst-case
283 00:16:13,539 --> 00:16:18,539 exponential but in practice work well.
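
As a rough illustration of the linear program described in the closing cues, the sketch below specializes it to the penalty kick game from earlier in the lecture. Because the goalie has only two pure strategies, the program has a single free variable q = s2(L), and its optimum must sit at an endpoint or where two of the kicker's payoff lines cross, so no LP library is needed. The function name `minmax_two_columns`, the matrix layout (rows are the kicker's actions, columns the goalie's), and the use of exact fractions are assumptions made for this example, not something taken from the lecture.

```python
from fractions import Fraction as F


def minmax_two_columns(u1):
    """Minimize U subject to q*u1[j][0] + (1-q)*u1[j][1] <= U for every row j,
    with 0 <= q <= 1, where q is the probability of the first column.

    This is the lecture's linear program restricted to a column player with
    exactly two pure strategies (hypothetical helper, for illustration only).
    Returns (q, U).
    """
    # The upper envelope of the row lines is convex and piecewise linear in q,
    # so its minimum is at q = 0, q = 1, or a crossing point of two lines.
    candidates = {F(0), F(1)}
    for a in range(len(u1)):
        for b in range(a + 1, len(u1)):
            slope_a = u1[a][0] - u1[a][1]
            slope_b = u1[b][0] - u1[b][1]
            if slope_a != slope_b:
                q = (u1[b][1] - u1[a][1]) / (slope_a - slope_b)
                if 0 <= q <= 1:
                    candidates.add(q)

    def best_response_payoff(q):
        # Row player's best pure-strategy payoff against the mix (q, 1 - q).
        return max(q * row[0] + (1 - q) * row[1] for row in u1)

    q_star = min(candidates, key=best_response_payoff)
    return q_star, best_response_payoff(q_star)


# Kicker's scoring probabilities from the lecture:
# rows = kick Left / kick Right, columns = goalie jumps Left / Right.
u1 = [[F(6, 10), F(8, 10)],
      [F(9, 10), F(7, 10)]]

# Goalie's minmax strategy: probability of jumping left, and the game value.
q_star, value = minmax_two_columns(u1)

# Kicker's maxmin strategy: in a zero-sum game this is the same problem on
# the negated, transposed matrix (the kicker becomes the column player).
neg_t = [[-u1[i][j] for i in range(2)] for j in range(2)]
p_star, neg_value = minmax_two_columns(neg_t)

print(q_star, value)       # 1/4 3/4
print(p_star, -neg_value)  # 1/2 3/4
```

The output agrees with the hand calculation in the lecture: the goalie jumps left with probability 1/4 and the kicker kicks left with probability 1/2. For games with more than two strategies per player, the same constraints would normally be handed to a general linear programming solver instead of enumerating crossing points.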