How to outsmart the Prisoner’s Dilemma - Lucas Husted

Two perfectly rational gingerbread men,
Crispy and Chewy,

are out strolling
when they’re caught by a fox.

Seeing how happy they are,
he decides that,

instead of simply eating them,

he’ll put their friendship
to the test with a cruel dilemma.

He’ll ask each gingerbread man whether
he’d opt to Spare or Sacrifice the other.

They can discuss,

but neither will know what the other
chose until their decisions are locked in.

If both choose to spare the other, the fox
will eat just one limb from each of them;

if one chooses to spare
while the other sacrifices,

the sparer will be fully eaten,

while the traitor will run away
with all his limbs intact.

Finally, if both choose to sacrifice,
the fox will eat 3 limbs from each.

In game theory, this scenario
is called the “Prisoner’s Dilemma.”

To figure out how these gingerbread men
will act in their perfect rationality,

we can map the outcomes of each decision.

The rows represent Crispy’s choices,
and the columns are Chewy’s.

Meanwhile, the numbers in each cell

represent the outcomes
of their decisions,

as measured in the number
of limbs each would keep:
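
The matrix itself was shown on screen and is not part of the transcript. A minimal sketch that rebuilds it from the fox's rules above follows, assuming each gingerbread man starts with four limbs. The names `PAYOFFS` and `format_matrix` are illustrative only.

```python
# Payoffs as (Crispy's limbs kept, Chewy's limbs kept), read off the fox's rules.
# Assumption: each gingerbread man starts with four limbs.
PAYOFFS = {
    ("spare", "spare"): (3, 3),          # fox eats one limb from each
    ("spare", "sacrifice"): (0, 4),      # Crispy is fully eaten, Chewy escapes intact
    ("sacrifice", "spare"): (4, 0),      # Crispy escapes intact, Chewy is fully eaten
    ("sacrifice", "sacrifice"): (1, 1),  # fox eats three limbs from each
}


def format_matrix(payoffs):
    """Lay the payoffs out with Crispy's choices as rows and Chewy's as columns."""
    choices = ("spare", "sacrifice")
    header = "Crispy \\ Chewy".ljust(16) + "".join(c.ljust(12) for c in choices)
    lines = [header]
    for row in choices:
        cells = "".join(str(payoffs[(row, col)]).ljust(12) for col in choices)
        lines.append(row.ljust(16) + cells)
    return "\n".join(lines)


print(format_matrix(PAYOFFS))
```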

So do we expect their friendship
to last the game?

First, let’s consider Chewy’s options.

If Crispy spares him, Chewy can run
away scot-free by sacrificing Crispy.

But if Crispy sacrifices him,

Chewy can keep one of his limbs
if he also sacrifices Crispy.

No matter what Crispy decides,

Chewy always experiences the best outcome
by choosing to sacrifice his companion.

The same is true for Crispy.
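
The reasoning above can be checked mechanically. The sketch below looks up each gingerbread man's best reply to every choice the other might make. It assumes the payoffs implied by the fox's rules (four limbs to start), and the function names are hypothetical.

```python
CHOICES = ("spare", "sacrifice")

# (Crispy's choice, Chewy's choice) -> (Crispy's limbs kept, Chewy's limbs kept)
PAYOFFS = {
    ("spare", "spare"): (3, 3),
    ("spare", "sacrifice"): (0, 4),
    ("sacrifice", "spare"): (4, 0),
    ("sacrifice", "sacrifice"): (1, 1),
}


def chewy_best_reply(crispy_choice):
    """Return the choice that leaves Chewy with the most limbs."""
    return max(CHOICES, key=lambda chewy: PAYOFFS[(crispy_choice, chewy)][1])


def crispy_best_reply(chewy_choice):
    """Return the choice that leaves Crispy with the most limbs."""
    return max(CHOICES, key=lambda crispy: PAYOFFS[(crispy, chewy_choice)][0])


for other in CHOICES:
    print(f"If Crispy chooses {other}, Chewy does best with {chewy_best_reply(other)}")
    print(f"If Chewy chooses {other}, Crispy does best with {crispy_best_reply(other)}")
```

Because the best reply is "sacrifice" in every case, sacrificing is what game theorists call a dominant strategy for both players.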

This is the standard conclusion
of the Prisoner’s Dilemma:

the two characters will
betray one another.

Their strategy to unconditionally
sacrifice their companion

is what game theorists
call the “Nash Equilibrium,”

meaning that neither can gain
by deviating from it.
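
A small sketch of that definition: a pair of choices is an equilibrium if neither player could keep more limbs by changing only their own choice. The payoffs are the ones implied by the fox's rules, and `is_nash_equilibrium` is an illustrative helper, not a library function.

```python
CHOICES = ("spare", "sacrifice")

# (Crispy's choice, Chewy's choice) -> (Crispy's limbs kept, Chewy's limbs kept)
PAYOFFS = {
    ("spare", "spare"): (3, 3),
    ("spare", "sacrifice"): (0, 4),
    ("sacrifice", "spare"): (4, 0),
    ("sacrifice", "sacrifice"): (1, 1),
}


def is_nash_equilibrium(crispy, chewy):
    """True if neither player gains by unilaterally switching their choice."""
    crispy_now, chewy_now = PAYOFFS[(crispy, chewy)]
    crispy_gains = any(PAYOFFS[(alt, chewy)][0] > crispy_now for alt in CHOICES)
    chewy_gains = any(PAYOFFS[(crispy, alt)][1] > chewy_now for alt in CHOICES)
    return not (crispy_gains or chewy_gains)


equilibria = [(a, b) for a in CHOICES for b in CHOICES if is_nash_equilibrium(a, b)]
print(equilibria)
```

In this one-shot game, mutual sacrifice is the only pair that passes the check, even though mutual sparing would leave both with more limbs.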

Crispy and Chewy act accordingly

and the smug fox runs off
with a belly full of gingerbread,

leaving the two former friends
with just one leg to stand on.

Normally, this is where
the story would end,

but a wizard happened to be watching
the whole mess unfold.

He tells Crispy and Chewy that,
as punishment for betraying each other,

they’re doomed to repeat this dilemma
for the rest of their lives,

starting with all four limbs
at each sunrise.

Now what happens?

This is called an Infinite Prisoner’s
Dilemma, and it’s a literal game changer.

That’s because the gingerbread men
can now use their future decisions

as bargaining chips for the present ones.

Consider this strategy: both agree
to spare each other every day.

If one ever chooses to sacrifice,

the other will retaliate by choosing
“sacrifice” for the rest of eternity.
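
This agreement is often called a "grim trigger" strategy. The sketch below plays it out over a handful of days, since code cannot run for eternity. The `make_defector` opponent is a hypothetical example of someone who breaks the agreement on a chosen day and keeps sacrificing afterwards.

```python
# (Crispy's choice, Chewy's choice) -> (Crispy's limbs kept, Chewy's limbs kept)
PAYOFFS = {
    ("spare", "spare"): (3, 3),
    ("spare", "sacrifice"): (0, 4),
    ("sacrifice", "spare"): (4, 0),
    ("sacrifice", "sacrifice"): (1, 1),
}


def grim_trigger(opponent_history):
    """Spare until the opponent has sacrificed once, then sacrifice forever."""
    return "sacrifice" if "sacrifice" in opponent_history else "spare"


def make_defector(first_betrayal_day):
    """Hypothetical strategy: spare until a given day (counted from 0), then sacrifice."""
    def strategy(opponent_history):
        today = len(opponent_history)
        return "sacrifice" if today >= first_betrayal_day else "spare"
    return strategy


def play(crispy_strategy, chewy_strategy, days):
    """Return the (Crispy, Chewy) limbs kept on each day."""
    crispy_moves, chewy_moves, limbs = [], [], []
    for _ in range(days):
        # Both decide before either move is revealed.
        crispy = crispy_strategy(chewy_moves)
        chewy = chewy_strategy(crispy_moves)
        crispy_moves.append(crispy)
        chewy_moves.append(chewy)
        limbs.append(PAYOFFS[(crispy, chewy)])
    return limbs


print(play(grim_trigger, grim_trigger, 5))
print(play(grim_trigger, make_defector(2), 5))
```

The second run shows the trade-off: the defector keeps all four limbs on the day of the betrayal, then only one limb on every day after.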

So is that enough to get these
poor sentient baked goods

to agree to cooperate?

To figure that out, we have to factor
in another consideration:

the gingerbread men probably care
about the future

less than they care about the present.

In other words, they might discount

how much they care about their future
limbs by some number,

which we’ll call delta.

This is similar to the idea of inflation
eroding the value of money.

If delta is one half,

on day 1 they care about day 2 limbs
half as much as day 1 limbs,

day 3 limbs one quarter as much
as day 1 limbs, and so on.
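
As a rough illustration, the weight placed on each day is delta raised to the number of days into the future. The sketch below uses exact fractions, and the function names are assumptions made for this example.

```python
from fractions import Fraction


def discount_weights(delta, days):
    """Weight on each day's limbs as seen from day 1: 1, delta, delta**2, ..."""
    return [delta ** t for t in range(days)]


def discounted_total(limbs_per_day, delta):
    """Add up limbs kept on each day, weighting later days less."""
    return sum(limbs * delta ** t for t, limbs in enumerate(limbs_per_day))


weights = discount_weights(Fraction(1, 2), 4)
print([str(w) for w in weights])

# Three days of mutual sparing (3 limbs kept per day) with delta = 1/2.
print(discounted_total([3, 3, 3], Fraction(1, 2)))
```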

A delta of 0 means that they don’t care
about their future limbs at all,

so they’ll repeat their initial choice
of mutual sacrifice endlessly.

But as delta approaches 1,
they’ll do anything possible

to avoid the pain of infinite triple limb
consumption,

which means they’ll choose
to spare each other.

At some point in between
they could go either way.

We can find out where that point is

by writing the infinite series
that represents each strategy,

setting them equal to each other,
and solving for delta.

That yields 1/3, meaning that as long
as Crispy and Chewy care about tomorrow

at least 1/3 as much as today,

it’s optimal for them
to spare and cooperate forever.
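
The transcript does not show the series themselves, so here is one way the calculation might go, assuming the retaliation agreement described earlier and a betrayal on the first day. Sparing forever is worth 3 + 3·delta + 3·delta² + ..., which sums to 3 / (1 - delta). Betraying today is worth 4 now and 1 limb on every later day, or 4 + delta / (1 - delta).

```python
from fractions import Fraction


def value_of_sparing_forever(delta):
    """3 limbs every day: 3 + 3*delta + 3*delta**2 + ... = 3 / (1 - delta)."""
    return 3 / (1 - delta)


def value_of_betraying_today(delta):
    """4 limbs today, then 1 limb on every later day: 4 + delta / (1 - delta)."""
    return 4 + delta / (1 - delta)


# Setting the two equal gives 3 = 4 * (1 - delta) + delta, so delta = 1/3.
threshold = Fraction(1, 3)

for delta in (Fraction(1, 4), threshold, Fraction(1, 2)):
    spare = value_of_sparing_forever(delta)
    betray = value_of_betraying_today(delta)
    print(f"delta={delta}: spare forever={spare}, betray today={betray}")
```

Below the threshold the one-day gain from betrayal outweighs the future losses; above it, sparing forever is worth more.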

This analysis isn’t unique
to cookies and wizards;

we see it play out in real-life situations

like trade negotiations
and international politics.

Rational leaders must assume
that the decisions they make today

will impact those of their adversaries
tomorrow.

Selfishness may win out in the short term,
but with the proper incentives,

peaceful cooperation is not only possible,
but demonstrably and mathematically ideal.

As for the gingerbread men,
their eternity may be pretty crumby,

but so long as they go out on a limb,

their friendship will never
again be half-baked.
