Homework 2

Starting from:

$29.99

Homework 2

Homework 2
Question 1: Minimax
Consider the zero-sum game tree shown below. Triangles that point up, such as at the
top node (root), represent choices for the maximizing player; triangles that point down
represent choices for the minimizing player. Outcome values for the maximizing player
are listed for each leaf node, represented by the values in squares at the bottom of the
tree. Assuming both players act optimally, carry out the minimax search algorithm.
Write the values for the letter nodes in your “answer.txt” like A1B2C3D4. Please pay
attention to the format of your answer, otherwise the online judge may give wrong
grades.
Question 2: Expectiminimax
Consider the game tree shown below. As in the previous problem, triangles that point
up, such as the top node (root), represent choices for the maximizing player; triangles
that point down represent choices for the minimizing player. The circular nodes
represent chance nodes in which each of the possible actions may be taken with equal
probability. The square nodes at the bottom represent leaf nodes. Assuming both players
act optimally, carry out the expectiminimax search algorithm. Write the values for the
letter nodes in your “answer.txt” like A15B12C6D3…G12. Please pay attention to the
format of your answer, otherwise the online judge may give wrong grades.
Question 3: Unknown Leaf Value
Consider the following game tree, where one of the leaves has an unknown payoff,
x. Player 1 moves first, and attempts to maximize the value of the game.
Each of the next 3 questions asks you to write a constraint on x specifying the set of
values it can take. In your constraints, you can use the letter x, integers, and the
symbols > and <. If x has no possible values, write 'None'. If x can take on all values,
write 'All'. As an example, if you think x can take on all values larger than 16, you
should enter x >16.
1) Assume Player 2 is a minimizing agent and Player 1 knows this. For what values
of x is Player 1 guaranteed to choose Action 1?
2) Assume Player 2 chooses actions at random with each action having equal
probability and Player 1 knows this. For what values of x is Player 1 guaranteed to
choose Action 1?
3) Denote the minimax value of the tree as the value of the root when Player 1 is the
maximizer and Player 2 is the minimizer. Denote the expectimax value of the tree
as the value of the root when Player 1 is the maximizer and Player 2 chooses actions
at random with equal probability. For what values of x is the minimax value of the
tree worth more than the expectimax value of the tree?
4) Is it possible to have a game, where the minimax value is strictly larger than the
expectimax value?
Question 4: Alpha-Beta Pruning
Consider the game tree shown below. Triangles that point up, such as at the top node
(root), represent choices for the maximizing player; triangles that point down represent
choices for the minimizing player. Assuming both players act optimally, use alpha-beta
pruning to find the value of the root node. The search goes from left to right; when
choosing which child to visit first, choose the left-most unvisited child.
In your “answer.txt” file, please write the values for the letter nodes in one line. And
then write the leaf nodes that don't get visited due to pruning in the following line. Your
answer to this question may look like:
A1B2C3D4
48
Hint: Note that the value of a node where pruning occurs is not necessarily the
maximum or minimum (depending on which node) of its children. When you prune on
conditions 𝑉𝑉 > 𝛽𝛽 or 𝑉𝑉 < 𝛼𝛼, assume that the value of the node is 𝑉𝑉.
Question 5: Possible Pruning
Assume we run 𝛼𝛼 − 𝛽𝛽 pruning, expanding successors from left to right, on a game
with tree as shown in the figures below.
Hint: Perhaps the simplest check is as follows: pruning of children of a minimizer node
𝑚𝑚 is possible (for some assignment to the terminal nodes), when both of the following
conditions are met: (i) the value of another child of 𝑚𝑚 has already been determined,
(ii) somewhere on the path from 𝑚𝑚 to the root node, there is a maximizer node 𝑀𝑀 for
which an alternative option has already been explored. The pruning will then happen if
any such alternative option for the maximizer had a higher value than the value of the
"another child" of 𝑚𝑚 for which the value was already determined.
For the following two parts, which of the statements are true? Please write your
answer to each part in a line, like
AB
B
A. There exists an assignment of utilities to the terminal nodes such that no pruning
will be achieved (shown in Figure (a)).
B. There exists an assignment of utilities to the terminal nodes such that the pruning
shown in Figure (b) will be achieved.
C. There exists an assignment of utilities to the terminal nodes such that the pruning
shown in Figure (c) will be achieved.
D. None of the above.
A. There exists an assignment of utilities to the terminal nodes such that the pruning
shown in Figure (d) will be achieved.
B. There exists an assignment of utilities to the terminal nodes such that the pruning
shown in Figure (e) will be achieved.
C. There exists an assignment of utilities to the terminal nodes such that the pruning
shown in Figure (f) will be achieved.
D. None of the above.
Question 6: Suboptimal Strategies
Player MAX and player MIN are playing a zero-sum game with a finite number of
possible moves. MAX calculates the minimax value of the root to be 𝑀𝑀 . You may
assume that at every turn, each player has at least 2 possible actions. You may also
assume that a different sequence of moves will always lead to a different score (i.e., no
two terminal nodes have the same score).
1) Which of the following statements are true?
A. Assume MIN is playing sub-optimally at every turn. If MAX plays according
to the minimax strategy, the outcome of the game could be less than 𝑀𝑀.
B. Assume MIN is playing sub-optimally at every turn, but MAX does not know
this. The outcome of the game could be larger than 𝑀𝑀 (i.e. better for MAX).
2) For this question, assume that MIN is playing randomly (with a uniform
distribution) at every turn, and MAX knows this. Then which of the following
statements are true?
A. There exists a policy for MAX such that MAX can guarantee a better outcome
than 𝑀𝑀.
B. There exists a policy for MAX such that MAX's expected outcome is better
than 𝑀𝑀.
C. To maximize his or her expected outcome, MAX should play according to the
minimax strategy (i.e. the strategy that assumes MIN is playing optimally).
3) Which of the following statements are true?
A. There exists a policy for MAX such that MAX can guarantee a better outcome
than 𝑀𝑀.
B. There exists a policy for MAX such that MAX's expected outcome is better
than 𝑀𝑀.
C. To maximize his or her expected outcome, MAX should play according to the
minimax strategy (i.e. the strategy that assumes MIN is playing optimally).
4) Which of the following statements are true?
A. Assume MIN is playing sub-optimally at every turn, and MAX knows exactly
how MIN will play. There exists a policy for MAX to guarantee a better
outcome than 𝑀𝑀.
B. Assume MIN is playing sub-optimally at every turn. MAX following the
minimax policy will guarantee a better outcome than 𝑀𝑀.
Question 7: Rationality of Utilities
1) Consider a lottery 𝐿𝐿 = [0.25, 𝐴𝐴; 0.3, 𝐵𝐵; 0.35, 𝐶𝐶; 0.1,𝐷𝐷], where the utility values of
each of the outcomes are 𝑈𝑈(𝐴𝐴) = 4, 𝑈𝑈(𝐵𝐵) = 3, 𝑈𝑈(𝐶𝐶) = 2, 𝑈𝑈(𝐷𝐷) = 5. What is
the utility of this lottery, 𝑈𝑈(𝐿𝐿)?
2) Consider a lottery 𝐿𝐿1 = [0.25, 𝐴𝐴; 0.75, 𝐿𝐿2] , where 𝑈𝑈(𝐴𝐴) = 6 , and 𝐿𝐿2 =
[0.5, 𝑋𝑋; 0.5, 𝑌𝑌] is a lottery, and 𝑈𝑈(𝑋𝑋) = 4, 𝑈𝑈(𝑌𝑌) = 0. What is the utility of the
first lottery, 𝑈𝑈(𝐿𝐿1)?
3) Assume 𝐴𝐴 ≻ 𝐵𝐵 , 𝐵𝐵 ≻ 𝐿𝐿 , where 𝐿𝐿 = [0.5, 𝐶𝐶; 0.5,𝐷𝐷] , and 𝐷𝐷 ≻ 𝐴𝐴 . Assuming
rational preferences, which of the following statements are guaranteed to be true?
A. 𝐴𝐴 ≻ 𝐿𝐿
B. 𝐴𝐴 ≻ 𝐶𝐶
C. 𝐴𝐴 ≻ 𝐷𝐷
D. 𝐵𝐵 ≻ 𝐶𝐶
E. 𝐵𝐵 ≻ 𝐷𝐷
Question 8: Certainty Equivalent Values
Consider the utility function shown below.
Under the above utility function, what is the certainty equivalent monetary value in
dollars ($) of the lottery [0.6, $0; 0.4, $100]?
i.e., what is 𝑋𝑋 such that 𝑈𝑈($𝑋𝑋) = 𝑈𝑈([0.6, $0; 0.4, $100])?
Hint: Keep in mind that 𝑈𝑈([𝑝𝑝, 𝐴𝐴; 1 − 𝑝𝑝, 𝐵𝐵]) is not equal to 𝑈𝑈(𝑝𝑝𝑝𝑝 + (1 − 𝑝𝑝)𝐵𝐵).
Question 9: Preferences and Utilities
Our Pacman board now has food pellets of 3 different sizes – pellet 𝑃𝑃1 of radius 1,
𝑃𝑃2 of radius 2 and 𝑃𝑃3 of radius 3. In different moods, Pacman has different
preferences among these pellets. In each of the following questions, you are given
Pacman's preference for the different pellets. From among the options pick the utility
functions that are consistent with Pacman's preferences, where each utility function
𝑈𝑈(𝑟𝑟) is given as a function of the pellet radius, and is defined over non-negative values
of 𝑟𝑟.
1) 𝑃𝑃1 ∼ 𝑃𝑃2 ∼ 𝑃𝑃3
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
2) 𝑃𝑃1 ≺ 𝑃𝑃2 ≺ 𝑃𝑃3
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
3) 𝑃𝑃1 ≻ 𝑃𝑃2 ≻ 𝑃𝑃3
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
4) (𝑃𝑃1 ≺ 𝑃𝑃2 ≺ 𝑃𝑃3) ∧ (𝑃𝑃2 ≺ (50 − 50 lottery among 𝑃𝑃1 and 𝑃𝑃3))
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
5) (𝑃𝑃1 ≻ 𝑃𝑃2 ≻ 𝑃𝑃3) ∧ (𝑃𝑃2 ≻ (50 − 50 lottery among 𝑃𝑃1 and 𝑃𝑃3))
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
6) (𝑃𝑃1 ≺ 𝑃𝑃2) ∧ (𝑃𝑃2 ≺ 𝑃𝑃3) ∧ ((50 − 50 lottery among 𝑃𝑃2 and 𝑃𝑃3) ≺ (50 −
50 lottery among 𝑃𝑃1 and 𝑃𝑃2))
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟
J. Irrational preferences
7) Which of the following would be a utility function for a risk-seeking preference?
That is, for which utility(s) would Pacman prefer entering a lottery for a random
food pellet, with expected size 𝑠𝑠 over receiving a pellet of size 𝑠𝑠?
A. 𝑈𝑈(𝑟𝑟) = 0
B. 𝑈𝑈(𝑟𝑟) = 3
C. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟
D. 𝑈𝑈(𝑟𝑟) = 2𝑟𝑟 + 4
E. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟
F. 𝑈𝑈(𝑟𝑟) = 𝑟𝑟2
G. 𝑈𝑈(𝑟𝑟) = −𝑟𝑟2
H. 𝑈𝑈(𝑟𝑟) = √𝑟𝑟
I. 𝑈𝑈(𝑟𝑟) = −√𝑟𝑟