# Inside Baseball: The Off-The-Path Less Traveled

[This is an installment in my irregular series of articles on the minutiae of what I do, “Inside Baseball.”]

Lately I have been working on a couple of models with various signaling aspects.  It has led me to think a lot more about both “testing models” and common knowledge of beliefs.  Specifically, a central question in game theoretic models is: “what should one player believe when he or she sees another player do something unexpected?” (“Something unexpected,” here, means “something that I originally believed that the other player would never do.”)

This is a well-known issue in game theory, referred to as “off-the-equilibrium-path beliefs,” or more simply as “off-the-path beliefs.” A practical example from academia is “Professor X never writes nice letters of recommendation.  But candidate/applicant Y got a really nice letter from Professor X.”

A lot of people, in my experience, infer that candidate/applicant Y is probably REALLY good. But, from a (Bayesian) game theory perspective, this otherwise sensible inference is not necessarily warranted:

$\Pr[\text{ Y is Good }| \text{ Good Prof. X Letter }] = \frac{\Pr[ \text{ Y is Good \& Good Prof. X Letter }]}{\Pr[ \text{ Good Prof. X Letter }]}$

By supposition, Prof. X never writes good letters, so

$\Pr[ \text{ Good Prof. X Letter }]=0$.

Houston, we have a problem.
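To make the problem concrete, here is a minimal sketch of the Bayes' rule calculation. The function name and all the numbers are my own illustrative assumptions, not part of any model. The point is only that the posterior is undefined when the letter has probability zero, and perfectly well defined as soon as that probability is even slightly positive.

```python
from fractions import Fraction

def posterior_good(prior_good, p_letter_if_good, p_letter_if_bad):
    """Return Pr[Good | nice letter] by Bayes' rule, or None if Pr[nice letter] = 0."""
    joint_good = prior_good * p_letter_if_good
    evidence = joint_good + (1 - prior_good) * p_letter_if_bad
    if evidence == 0:
        # Conditioning on a zero-probability event: Bayes' rule says nothing.
        return None
    return joint_good / evidence

# Hypothetical numbers: half of all candidates are good.
# Case 1: Prof. X never writes a nice letter for anyone.
never = posterior_good(Fraction(1, 2), Fraction(0), Fraction(0))
# Case 2: Prof. X writes one for 1% of good candidates and for no bad ones.
rarely = posterior_good(Fraction(1, 2), Fraction(1, 100), Fraction(0))
print(never, rarely)
```

In the second case the common-sense inference ("Y must be REALLY good") goes through. In the first, the model is silent, and whatever the reader believes has to come from somewhere other than Bayes' rule.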

From this perspective, there are two questions that have been nagging me.

1. How do we test models that depend on this aspect of strategic interaction?
2. Should we require that everybody have shared beliefs in such situations?

The first question is the focus of this post. (I might return to the second question in a future post, and note that both questions are related to a point I discussed earlier in this “column.”)  Note that this question is very important for social science. For example, the general idea of a principal (legislators, voters, police, auditors) monitoring one or more agents (bureaucrats, politicians, bystanders, corporate boards) generally depends on off-the-path beliefs. Without specifying such beliefs for the principal—and the agents’ beliefs about these beliefs—it is impossible to dictate/predict/prescribe what agents should do. (There are several dimensions here, but I want to try and stay focused.)

Think about it this way: suppose there is an action that would be valuable for the agent if the principal’s beliefs after seeing it took a certain form. When the agent assigns zero probability to taking that action, the choice rests on the agent’s beliefs about what the principal would believe about the agent in a situation that the principal believes will never happen. Note that this is doubly interesting because, without any ambiguity, the principal’s beliefs and the agent’s beliefs about these beliefs are causal.

Now, I think that any way of “testing” this causal mechanism (the principal’s beliefs about the agent following an action that the principal believes the agent will never take) necessarily calls into question the mechanism itself.  Put another way, the mechanism is epistemological in nature. In (say) an experimental setting where the experimenter can somehow induce the agent’s action, the principal’s beliefs should therefore incorporate the (true) possibility that the experimenter (randomly) induced/forced the agent to take the action.

So what?  Well, two questions immediately emerge. The first is how the principal (in the lab) should treat the “deviation” by the agent.  That’s for another post someday, perhaps.  The second question is whether the agent knows that the principal knows that the agent might be induced/forced to take the action. If so, then game theory predicts that the experimental protocol can actually induce the agent to take the action in a “second-order” sense.

Why is this? Well, consider a game in which one player, A, is asked to either keep a cookie or give the cookie to a second player, B. Following this choice, B then decides whether to reward A with a lollipop or throw the lollipop in the trash (B cannot eat the lollipop).  Suppose also for simplicity that everybody likes lollipops better than cookies and everybody likes cookies better than nothing, but A might be one of two types: the type who likes cookies a little bit, but likes lollipops a lot more (t=Sharer), and the type who likes cookies just a little bit less than lollipops (t=Greedy).  Also for simplicity, suppose that each type is equally likely:

$\Pr[t=\text{Sharer}]=\Pr[t=\text{Greedy}]=1/2$.

Then, suppose that B likes to give lollipops to sharing types (t=Sharer) and is indifferent about giving lollipops to greedy types (t=Greedy).

From B’s perspective, the optimal equilibrium in this “pure” game involves

1. Player B’s beliefs and strategy:
   1. B believing that player A is Greedy if A does not share, and throwing the lollipop away (at no subjective loss to B, who is indifferent about giving lollipops to Greedy types), and
   2. B believing that A is equally likely to be a Sharer or Greedy if A does share, and giving A the lollipop (because this results in a net expected gain for B).
2. A’s strategy:
   1. Regardless of type, A gives B the cookie, because this (and only this) gets A the lollipop, which is better than the cookie (given B’s strategy, there is no way for A to get both the cookie and lollipop).
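The profile just described can be checked mechanically. The sketch below uses payoff numbers that are purely my own assumptions, chosen only to respect the ordering in the story (lollipop > cookie > nothing for both types, with B gaining from rewarding a Sharer and indifferent about a Greedy type). It verifies that neither type of A wants to keep the cookie and that B's choices are best responses to the stated beliefs.

```python
from fractions import Fraction

# Hypothetical payoffs (assumptions for illustration only).
A_COOKIE = {"Sharer": 1, "Greedy": 4}    # value of the cookie to each type of A
A_LOLLIPOP = {"Sharer": 5, "Greedy": 5}  # value of the lollipop to each type of A
B_GIVE = {"Sharer": 1, "Greedy": 0}      # B's payoff from giving; trashing yields 0
PRIOR_SHARER = Fraction(1, 2)

def b_gain_from_giving(belief_sharer):
    """B's expected payoff from giving the lollipop, relative to trashing it."""
    return belief_sharer * B_GIVE["Sharer"] + (1 - belief_sharer) * B_GIVE["Greedy"]

def a_payoff(a_type, shares, b_gives):
    """A keeps the cookie unless sharing, and gets the lollipop only if B gives it."""
    cookie = 0 if shares else A_COOKIE[a_type]
    lollipop = A_LOLLIPOP[a_type] if b_gives else 0
    return cookie + lollipop

# Candidate profile: B gives after "share" and trashes after "keep".
b_gives_after = {"share": True, "keep": False}
belief_sharer_after = {"share": PRIOR_SHARER, "keep": Fraction(0)}

# A's side: sharing must strictly beat keeping for each type.
a_ok = all(
    a_payoff(t, True, b_gives_after["share"]) > a_payoff(t, False, b_gives_after["keep"])
    for t in A_COOKIE
)
# B's side: giving is optimal after "share", trashing is optimal after "keep".
b_ok = (b_gain_from_giving(belief_sharer_after["share"]) >= 0
        and b_gain_from_giving(belief_sharer_after["keep"]) <= 0)
print(a_ok, b_ok)
```

Note that B's belief after "keep" is simply stipulated here. Since nobody keeps the cookie in this profile, Bayes' rule places no restriction on that belief, which is exactly the off-the-path freedom the equilibrium leans on.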

Now, suppose that the experimenter randomly (independently of A’s type, and regardless of what A wants to do) forces A to keep the cookie (say) 5% of the time.  At first blush, this seems (to me at least) a reasonable way to “test” this model.  But, if the experimental treatment is known to B and A knows that B knows this, and so forth, then the above strategy-belief profile is no longer an equilibrium of the new game (even when altered to allow for the 5% involuntary deviations). In particular, if the players were playing the above profile, then B should believe that any deviation is equally likely to have been forced upon a Sharer as a Greedy player A.  Thus, B will receive a positive expected payoff from giving the lollipop to any deviator.  Following this logic just about two more steps, every perfect Bayesian equilibrium of this “experimentally-induced” game takes the following form:

1. Player B’s beliefs and strategy:
   1. B believes that player A is equally likely to be a Sharer or Greedy if A does not share, and thus gives A the lollipop.
   2. It doesn’t matter what B’s beliefs are, or what B does, if A does share. (Thus, there is technically a continuum of equilibria.)
2. A’s strategy:
   1. Regardless of type, A keeps the cookie, because this gets A both the cookie and lollipop.
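Here is a rough sketch of why the forced deviations unravel the sharing equilibrium. The 5% and the 50/50 prior come from the setup above; the payoff numbers and function names are hypothetical, picked only to respect the ordering in the story. The key observation is that once "keep" can be forced on either type, B's posterior on Sharer after seeing "keep" is strictly positive no matter what A intends to do, so B strictly gains from giving the lollipop to a keeper.

```python
from fractions import Fraction

EPS = Fraction(1, 20)          # 5% chance the experimenter forces "keep"
PRIOR_SHARER = Fraction(1, 2)  # each type equally likely

# Hypothetical payoffs for A (assumptions for illustration only).
A_COOKIE = {"Sharer": 1, "Greedy": 4}
A_LOLLIPOP = {"Sharer": 5, "Greedy": 5}

def belief_sharer_after_keep(keep_sharer, keep_greedy):
    """Pr[Sharer | keep] when each type voluntarily keeps with the given probability."""
    p_keep_if_sharer = EPS + (1 - EPS) * keep_sharer
    p_keep_if_greedy = EPS + (1 - EPS) * keep_greedy
    joint_sharer = PRIOR_SHARER * p_keep_if_sharer
    return joint_sharer / (joint_sharer + (1 - PRIOR_SHARER) * p_keep_if_greedy)

# Both types intend to share: every observed "keep" was forced.
pooling = belief_sharer_after_keep(0, 0)
# Least favorable case for a keeper: only Greedy keeps voluntarily.
worst = belief_sharer_after_keep(0, 1)

# B gains from giving whenever Pr[Sharer | keep] > 0, so B gives after "keep".
b_gives_after_keep = pooling > 0 and worst > 0

# Given that, keeping (cookie + lollipop) beats sharing (at most the lollipop).
keep_beats_share = all(A_COOKIE[t] + A_LOLLIPOP[t] > A_LOLLIPOP[t] for t in A_COOKIE)
print(pooling, worst, b_gives_after_keep, keep_beats_share)
```

With these numbers the posterior is 1/2 when both types intend to share and 1/21 when only the Greedy type keeps voluntarily. Either way it is positive, and that is all B needs to strictly prefer rewarding a keeper.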

By the way, this logic has been used in theoretical models for quite some time (dating back at least to 1982).  So, anyway, maybe I’m missing something, but I am starting to wonder if there is an impossibility theorem in here.