Trump, Cruz, Rubio: The Game Theory of When The Enemy of Your Enemy Is Your Enemy.

I posted earlier about truels and how the current GOP nomination approximates one.  In that post, I laid out the basics of the simple truel (i.e., a three person duel), assuming that the three shooters shoot sequentially.  Things can be different when the three shooters shoot simultaneously.[1]  Short version: Trump and Rubio aren’t allies, but game theory suggests they should both attack Cruz, in spite of this.

The simultaneous version is arguably a better model for debates than the sequential one: candidates prepare extensively before a debate, largely in ignorance of the other debaters’ preparations. Leaving that interesting question aside, let’s work this out.  I assume that the truel lasts until only one shooter is left, and that each shooter wants to live, and is otherwise indifferent.  I’ll also assume that the best shooter hits with certainty.[2] The probability that the second-best shooter hits his or her target is 0<p<1 and the probability that the worst shooter hits his or her target is 0<q<p.

When there are two shooters left, each will shoot at the other.  Not interesting, but important, because this implies that the worst shooter wants to shoot at the best shooter in the first round. In the first round, both the second-best and worst shooters shoot at the best shooter.  Either the best or the second-best shooter will be dead after this (if the second-best and worst shooters each get to shoot before the best shooter, but miss, then the second-best shooter will be killed with certainty). There is also a chance that the worst shooter will win in the first round: the best shooter kills the second-best shooter (probability 1/3), and the worst shooter kills the best shooter (probability q<1).
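To see why the worst shooter would rather end up facing the second-best shooter than the best one, the two-shooter endgame can be worked out directly. This is only a sketch under the tie-breaking rule in note [1] (a fair coin picks who fires first each round, and a hit ends the duel); the symbols x and y are placeholders I am introducing for the two duelists' hit probabilities.

```latex
% Duel between shooters X and Y with hit probabilities x and y,
% repeated round by round until one of them is dead.
\begin{align*}
\Pr(X \text{ wins this round}) &= \tfrac12 x + \tfrac12 (1-y)\,x = \tfrac12\, x(2-y),\\
\Pr(Y \text{ wins this round}) &= \tfrac12\, y(2-x),\\
\Pr(\text{both miss}) &= (1-x)(1-y),\\
\Pr(X \text{ wins the duel}) &= \frac{x(2-y)}{x(2-y)+y(2-x)}.
\end{align*}
% Worst shooter (q) against the best shooter (hits with certainty)
% versus against the second-best shooter (p):
\begin{align*}
\Pr(\text{worst beats best}) &= \frac{q}{2}, &
\Pr(\text{worst beats second best}) &= \frac{q(2-p)}{q(2-p)+p(2-q)}.
\end{align*}
% The second probability is larger, because
% 2(2-p) - \bigl[q(2-p)+p(2-q)\bigr] = 2(2-q)(1-p) > 0 \quad \text{for } p < 1.
```

So, under these assumptions, the worst shooter's odds in the endgame are better if the best shooter is the one who gets eliminated first.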

What does this say about the GOP race?  Both Rubio and Trump should be shooting at Cruz.  This is a simplistic model, and it ignores a lot of real-world factors.  But that’s why it’s valuable, from a social science perspective: if (and when) the behaviors of the three campaigns deviate from this behavior, we know that we need to include those other factors.  Until then, you see, in this world there’s two kinds of models, my friend: Those with just enough to capture the logic and those who need to dig for more things to include.  We’ll see if this one needs to dig.

With that, I leave you with this.

____________________

[1]. For simplicity, I will assume that, if two shooters shoot at each other, then one of them, randomly chosen, will “shoot first” and, if he or she hits, kill the other shooter before he or she fires his or her weapon.  Note that, with this assumption, if shooter A knows that shooter B (and only shooter B) is going to shoot at shooter A, then shooter A should definitely shoot at shooter B.

[2]. This assumption isn’t as strong as it appears. This is because the truel is already assumed to continue until only one player is left (note that it is impossible for zero shooters to survive, given the tie-breaking assumption).

The GOP’s Reality is Truel, Indeed

A truel is a three-person duel.  There are lots of ways to play this type of thing, but the basic idea is this: three people must each choose which of the other two to try to kill.  They could shoot simultaneously or in sequence.  The details matter…a lot.  I won’t get into the weeds on this, but let’s think about the GOP race following last night’s Iowa caucus results.  By any reasonable accounting, there are three candidates truly standing: Ted Cruz, Marco Rubio, and Donald Trump.  The three of them took, in approximately equal shares, around 75% of the votes cast in the GOP caucus.

The next event is the New Hampshire primary, and the latest polls (all conducted before the Iowa caucus results) have Trump with a commanding lead and Rubio and Cruz essentially tied for (a distant) second.  So, the stage is set.  Who shoots first?  And at whom?

The truel is a useful thought experiment to worm one’s way into the vagaries of this kind of calculus.  A difference between truels and electoral politics is that the key factor in a standard truel is each combatant’s marksmanship, or the probability that he or she will kill an opponent he or she shoots at.  What we typically measure about a candidate is how many survey respondents support him or her.  For the purposes of this post, let’s equate the two.  Trump is the leader, and Rubio and Cruz are about equal.

A relatively robust finding about truels is that, when the shots are fired sequentially (i.e., the combatants take turns), each combatant should fire at the best marksman, regardless of what the other combatants are doing (this is known as a “dominant strategy” in game theory).  Thus, if we think that the campaigns are essentially taking turns (maybe as somewhat randomly awarded by the vagaries of the news cycle and external events), then both Rubio and Cruz should be “shooting at Trump.”  This is in line with Cruz’s post-caucus speech in Iowa last night.

An oddity of this formulation of the truel is that it is possible that the best marksman is the least likely to survive.  This is true even if the best marksman gets to shoot first.
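For readers who want to poke at the sequential truel, here is a minimal sketch in Python. Everything specific in it is my assumption rather than something stated above: the fixed firing order (best shooter first, then second best, then worst, cycling), the example accuracies, and the rule that every shooter aims at the most accurate surviving opponent. It tracks probability mass over states instead of simulating, so the answers are exact up to a small tolerance.

```python
from collections import defaultdict


def sequential_truel(acc, order=(0, 1, 2), tol=1e-12, max_steps=100_000):
    """Return each shooter's probability of being the last one standing.

    acc[i] is shooter i's hit probability. Shooters fire one at a time in
    `order`, cycling, and each aims at the most accurate surviving opponent
    (ties broken toward the lower index). Both rules are assumptions.
    """
    n = len(acc)
    # mass[(alive, pos)] = probability of that set being alive with order[pos] up next
    mass = {(frozenset(range(n)), 0): 1.0}
    win = [0.0] * n
    for _ in range(max_steps):
        if sum(mass.values()) < tol:
            break
        nxt = defaultdict(float)
        for (alive, pos), p in mass.items():
            shooter = order[pos]
            new_pos = (pos + 1) % n
            if shooter not in alive:
                nxt[(alive, new_pos)] += p  # dead shooters are skipped
                continue
            target = max((i for i in alive if i != shooter),
                         key=lambda i: (acc[i], -i))
            hit = acc[shooter]
            survivors = alive - {target}
            if len(survivors) == 1:
                win[shooter] += p * hit  # shooter is the last one left
            else:
                nxt[(survivors, new_pos)] += p * hit
            nxt[(alive, new_pos)] += p * (1 - hit)  # miss: nothing changes
        mass = dict(nxt)
    return win


if __name__ == "__main__":
    for acc in [(1.0, 0.8, 0.5), (0.5, 0.4, 0.3)]:
        print(acc, [round(w, 4) for w in sequential_truel(acc)])
```

Changing `acc` and `order` is the quickest way to see how sensitive the survival ranking is to the details, which is the point of "the details matter" above.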

Is it current, or future, popularity? An alternative measurement of marksmanship, however, is not the current support, but the perceived direction of change in support.  After all, marksmanship is about the ability to kill someone on the next shot.

On this front, Rubio is currently the better marksman: his support in Iowa vastly exceeded expectations, while by many accounts (though not necessarily my own), Trump is the worst marksman.  If one buys this alternative measure, then the smart strategy for both Trump and Cruz is to “aim their guns” at Rubio.  We have a week to see who they each aim at.

Of course, a truel is a simplistic picture of what’s going on in the GOP nomination process. In reality, it is probably better to think that each candidate’s marksmanship depends on his (or her) choice of target.  Evidence suggests that it is harder for Trump to “shoot down” Cruz than it was for him to shoot down Bush.  Maybe I’ll come to that later.  For now, I’m still making sense of Santorum’s strategy of heading to South Carolina. For that matter, I’m trying to make sense of him being called “a candidate for President.”

With that, I leave you with this.

The Patriots Are Commonly Uncommon

This is math, but it isn’t politics.  This is serious business.  This is the NFL.

The New England Patriots won the coin toss to begin today’s AFC championship game against the Denver Broncos. With that, the Patriots have won 28 out of their last 38 coin tosses. To flip a fair coin 38 times and have (say) “Heads” come up 28 or more times is an astonishingly rare event. Formally, the probability of winning 28 or more times out of 38 tries when using a fair coin is 0.00254882, or a little better than “1 in 400” odds.

But the occurrence of something this unusual is not actually that unusual. This is because of selective attention: we (or, in this case, sports journalists like the Boston Globe’s Jim McBride) look for unusual things to comment and reflect upon. I decided to see how frequently in a run of 320 coin flips a “window” of 38 coin flips would come up “Heads” 28 or more times. I simulated 10,000 runs of 320 coin flips and then calculated how many of the 283 “windows of 38” in each run contained at least 28 occurrences of “Heads.” (For a similar analysis following McBride’s article, considering 25 game windows, see this nice post by Harrison Chase.)

The result? 441 runs: 4.41%, or a little better than “1 in 25” odds. (Also, note that the result would be doubled if one thinks that we would also be just as quick to notice that the Patriots had lost 28 out of the last 38 coin tosses.)
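The windowed simulation described above can be sketched in a few lines of Python. To be clear, this is not the Mathematica code behind the numbers in this post; the function names, the seed, and the smaller default number of runs in the demo are my own choices, and the share it prints will vary with the seed and need not match the figure reported here.

```python
import random


def count_hot_windows(flips, window=38, threshold=28):
    """Count the length-`window` windows that contain at least `threshold` heads (1s)."""
    if len(flips) < window:
        return 0
    heads = sum(flips[:window])
    count = 1 if heads >= threshold else 0
    for i in range(window, len(flips)):
        # slide the window one flip to the right
        heads += flips[i] - flips[i - window]
        if heads >= threshold:
            count += 1
    return count


def share_of_runs_with_hot_window(runs=10_000, n_flips=320, window=38,
                                  threshold=28, seed=0):
    """Fraction of simulated runs that contain at least one qualifying window."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        flips = [rng.randint(0, 1) for _ in range(n_flips)]
        if count_hot_windows(flips, window, threshold) > 0:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    print(share_of_runs_with_hot_window(runs=2_000))
```

Doubling the count for "lost 28 of 38" would amount to also checking windows with at most 10 heads.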

The distribution of “how many windows of 38” had at least 28 Heads, among those that contained at least one such window, is displayed in the figure below. (I omitted the 9,559 runs in which no such window occurred in order to make the figure more readable.)

Figure 1: How Many Windows of 38 Had At Least 28 Heads

Accounting for correlation. Inspired partly by Harrison Chase’s post linked to above, I ran a simulation in which 32 teams each “flipped against each other” exactly once (so each team flips 31 times), and looked at the maximum number of flips won by any team. This relaxes the assumption of independence used in both the first simulation and, as noted by Chase, the Harvard Sports Analysis Collective analysis linked to above. I ran this simulation 10,000 times as well. I counted how many times the maximum number of flips won equaled or exceeded 23, which is the number of times the Patriots won in their first 31 games of the current 38 game window (i.e., through their December 6th, 2015 game against the Eagles).

The result? In 1,641 trials (16.41%), at least one team won the coin flip at least 23 times.

The Effect of Dependence. Intuition suggests that accounting for the lack of independence between teams’ totals decreases the probability of observing runs like the Patriots’. To see the intuition, consider the probability two teams both win their independent coin flips: 25%, and then consider the probability both teams “win” a single coin flip: 0%.

My simulations bear out this intuition, but the effect is bigger than I suspected it would be. Running the same 10,000 simulations assuming independence, at least one team won the coin flip at least 23 times in 2,763 trials (27.63%).

The histograms for the maximum number of wins in each of the 10,000 simulations, first for the “team versus team dependent” case and the second for the “independent across teams” case, are displayed below.

Figure 2: Maximum Number of Coin Flip Wins by A Team in Round-Robin 32 Team League Season

Figure 3: Maximum Number of Wins Among 32 Teams Flipping A Coin 31 Times
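The two set-ups compared above (one shared flip per pair of teams versus every team flipping its own coin) can be sketched as follows. Again, this is an illustrative Python sketch and not the code used for the figures; the function names, the seed, and the trial count are assumptions. The last function is a closed-form cross-check for the independent case, 1 - (1 - p)^32, where p is the chance that a single team wins at least 23 of 31 fair flips. Simulated shares depend on the seed and need not reproduce the percentages quoted above.

```python
import random
from math import comb


def max_wins_round_robin(n_teams, rng):
    """Every pair of teams shares one flip, and exactly one of the two wins it."""
    wins = [0] * n_teams
    for i in range(n_teams):
        for j in range(i + 1, n_teams):
            winner = i if rng.random() < 0.5 else j
            wins[winner] += 1
    return max(wins)


def max_wins_independent(n_teams, n_flips, rng):
    """Every team flips its own fair coin n_flips times, ignoring the other teams."""
    return max(sum(rng.random() < 0.5 for _ in range(n_flips))
               for _ in range(n_teams))


def share_at_least(threshold, trials, draw_max):
    """Fraction of trials in which the league-wide maximum reaches `threshold`."""
    return sum(draw_max() >= threshold for _ in range(trials)) / trials


def exact_independent(n_teams=32, n_flips=31, threshold=23):
    """Exact P(at least one team wins >= threshold flips) under independence."""
    p_one = sum(comb(n_flips, k) for k in range(threshold, n_flips + 1)) / 2 ** n_flips
    return 1 - (1 - p_one) ** n_teams


if __name__ == "__main__":
    rng = random.Random(2016)
    trials = 2_000
    print("round robin:", share_at_least(23, trials, lambda: max_wins_round_robin(32, rng)))
    print("independent:", share_at_least(23, trials, lambda: max_wins_independent(32, 31, rng)))
    print("independent, exact:", exact_independent())
```

If a simulated share and the exact value for the same set-up disagree by more than sampling noise, that is worth chasing down before leaning on either number.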

Takeaway Message.  Of course, anything that occurs around 5% of the time is not an incredibly common occurrence, but it illustrates that it’s not that unusual for something unusual to occur. For example, note that the NFC once won the Super Bowl coin toss 14 times in a row (Super Bowls XXXII to XLV), an event that (counting a 14-toss streak by either conference) occurs with probability 0.00012207, or a little worse than “1 in 8000” odds. And, of course, we recently saw a coin flip in which the coin didn’t flip.
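For the record, here is the arithmetic behind the Super Bowl streak number, assuming fair and independent tosses. The quoted 0.00012207 is the chance that 14 tosses all go to the same conference, whichever one it is; the chance for one conference named in advance is half of that.

```latex
\begin{align*}
\Pr(\text{a named conference wins all 14 tosses})
  &= \left(\tfrac12\right)^{14} = \tfrac{1}{16384} \approx 0.00006104,\\
\Pr(\text{the same conference, either one, wins all 14 tosses})
  &= 2\left(\tfrac12\right)^{14} = \left(\tfrac12\right)^{13}
   = \tfrac{1}{8192} \approx 0.00012207.
\end{align*}
```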

An empirical matter: somebody should go collect the coin flip data for all teams.  One point here is that looking at one team probably makes this seem more unusual, and the first intuition about the math might suggest that we can simply gaze in awe at how weird this is.  But, upon reflection, we should remember that we often stop to look at weird things without noting exactly how weird they are.

____________________________

Notes.

1. The probability 0.00254882 in the introduction is obtained by calculating the CDF of the Binomial[38,0.5] distribution at 27, and then subtracting this number from 1.  A common mistake (or, at least, one made by me at first) is to calculate the CDF of the Binomial[38,0.5] distribution at 28 and subtract this number from 1. Because the Binomial is an integer-valued distribution, that actually gives the probability that a coin would come up Heads at least 29 times. The difference is small, but not negligible, particularly for the point of this post (considering the probability of a pretty rare event occurring in multiple trials).
2. 320 flips is 20 years of regular season games. Not that the streak is constrained to regular season games. I like Harrison Chase’s number (247, the number of games Belichick had coached the Patriots at the time of his post) better, but I didn’t want to re-run the simulations.
3. The probability of this “notable” event is even higher if one thinks that we would be paying attention to the event even if the Patriots had won only (say) 27 of the last 38 flips.
4. I did the simulations in Mathematica, and the code is available here.
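The calculation in Note 1, including the off-by-one trap, can be checked with a short Python sketch. This is separate from the Mathematica code mentioned in Note 4, and `upper_tail` is a helper name I made up for the illustration.

```python
from math import comb


def upper_tail(n, k):
    """P(X >= k) for X ~ Binomial(n, 1/2), computed from exact integer counts."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


at_least_28 = upper_tail(38, 28)  # 1 - CDF(27): the quantity described in Note 1
at_least_29 = upper_tail(38, 29)  # 1 - CDF(28): the off-by-one version

print(f"P(X >= 28) = {at_least_28:.8f}")
print(f"P(X >= 29) = {at_least_29:.8f}")
```

The gap between the two lines is the probability of exactly 28 Heads, which is why evaluating the CDF at 28 instead of 27 understates the tail.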

One Thing Leads to Another: “Delaying” DA-RT Standards to Discuss Better DA-RT Standards Will Be Ironic

In response to the concerns raised by colleagues (principally and initially in this petition, but see also Chris Blattman’s take and other responses from both sides), I wanted to clarify why I think that delaying implementation of the Journal Editors’ Transparency Statement (JETS) is a poorly thought out goal, one that will differentially disadvantage some scholars, particularly younger, less well-known scholars.

These Standards Are Already Being Implemented. To begin, and reiterate one of the arguments I made here a few days ago, journal editors already have the unilateral discretion to impose the kinds of policies that JETS is calling upon editors to implement. To wit, editors are already implementing policies along these lines. For example, see the submission/replication guidelines of the American Journal of Political Science, American Political Science Review, and the Journal of Politics, to name only three. These three vary in details, but they are consistent with JETS as they stand right now.

It’s Happening Anyway, Let’s Stay In Front of It.  The point is that the JETS implementation is already under way and, indeed, was under way prior to the drafting of JETS. The DA-RT initiative is simply providing a public good: a forum for exactly the conversations that the petition signers seek. (The individuals who have contributed time to the public good that is DA-RT, and their contributions, are described here.)

The Clarifying Quality of Deadlines. The “implementation of JETS” scheduled for January 2016 is best viewed as a moment of public recognition that we as a discipline need to continue the conversations. Editorial policies are not written in stone, after all. Thus I strongly believe that delaying the implementation of JETS will do nothing other than further muddy the waters for scholars. JETS is about recognizing and shepherding the movement towards more coherent and uniform procedures to increase the transparency of social science research. Delaying it will place scholars, particularly junior and less well-known scholars, at a disadvantage. This is because implementation of the JETS will give all scholars firmer ground to stand on when seeking clarification of the details of a journal’s replication and transparency requirements.

Clear Policies Level the Playing Field and Make Editors (more) Accountable. Furthermore, scholars will be able to publicly compare and contrast these procedures, allowing more judicious selection of research design, early preparation of justifications for requests for exemptions, and finally, a counterpoint for an editorial decision that is inconsistent with the standards of peer outlets. That is, if journal X decides that one’s research is sufficiently transparent and then journal Y decides otherwise, the transparency of those journals’ standards—which JETS aims to ensure are publicly available—will ensure that the journals’ standards are fair game for comparison and debate. This is the type of conversation sought by many of the petition signers I have spoken with. Implementation of JETS will push this conversation forward, whereas delay will simply retain the status quo of an incoherent bundle of idiosyncratic policies.

Will The Sun Rise on January 15, 2016? It is important to keep in mind that the implementation of the JETS statement will in most cases result in no new policy: journal editors have been setting and fine-tuning standards like these for decades. Rather, implementing JETS binds editors—like myself—more closely to the sought-after conversations about how best to achieve transparency in the various subfields and with respect to the various methodologies of our discipline.

In other words, implementation of JETS will empower scholars to demand more transparency and accountability from the editors of the 27 journals that have signed the statement.

With that, I leave you with this.

Responding To A Petition To Nobody (Or Everybody)

Hey, long time no see. While we’ve been apart, there’s arisen a bit of a dustup in my little corner of the world about the Data Access and Research Transparency (DA-RT) initiative. In a nutshell, DA-RT represents a movement toward continued discussion, implementation, and fine-tuning of standards regarding how social science research is produced and shared amongst scholars and the broader community.

In (quite belated) response, this petition dated November 3rd, 2015, requests a delay in the implementation of “DA-RT until more widespread consultation can be accomplished at, for instance, the regional meetings this year, and the organized section meetings and panels and workshops at the 2016 annual meeting.”

With the background set, a disclosure/explanation is in order: I am a coeditor of the Journal of Theoretical Politics, and hence a co-signatory on the DA-RT Journal Editors’ Transparency Statement (JETS).  That’s basically why I’m writing this, particularly once one reads the petition twice and realizes that, its length and detail notwithstanding, it is entirely unclear to whom the petition is directed (other than “colleagues”).

In practical terms, is this a petition to

1. Journal editors?
2. Journal publishers?
3. Journals’ editorial boards?
4. Journal reviewers?
5. The governing bodies of the various political science associations?
6. Political scientists in general?

In the spirit of this blog and my own view of the world, I’ll be clear:

the absence of a clearly named target of the petition is absolutely and definitively telling: this is not a serious (or at least well thought-out) plea. Full. stop.

Delay, delay, delay.  Without impugning any of the signers of the petition, it is clear to me that the petition is classic and barely disguised foot-dragging. This petition, as drafted, will do nothing to further serious dialogue about the issues at hand. Rather, it draws a (sadly, frequently and unnecessarily drawn) line in the sand between quantitative and qualitative analyses in the social sciences.

Transparency is hard for everybody.  The petition states that “Achieving transparency in analytic procedures may be relatively straightforward for quantitative methods executed via software code.” Sure, it might be. But it need not be. Difficulties with implementing transparency are qualitatively common to all forms of analysis: formal, quantitative, and qualitative. Formal analysis can depend on methods, proofs, or arguments that are obscure or opaque even to many scholars. Along the same lines, both quantitative and qualitative methods can be difficult to convey in a parsimonious fashion. Finally, both quantitative and qualitative analyses can bring up questions about how to preserve anonymity of subjects, maintain incentives for the collection of new data (“embargoing”), etc.

Let’s keep talking…at, you know, some place and some time. Each of the above issues is difficult to deal with, of course. But rather than acknowledging this (clear) reality and putting something productive forward, the petition instead suggests that “we” should delay implementation

“until more widespread consultation can be accomplished at, for instance, the regional meetings this year, and the organized section meetings and panels and workshops at the 2016 annual meeting. Postponing the date of implementation will allow a discipline-wide consideration of the principles of data access and research transparency and how they should be put into practice.”

To understand why this is foot-dragging, note first this “Response by the DA-RT organizers to Discussions and Debates at the 2015 APSA Meeting” (henceforth “the Response”). Seriously, if you’re already here in this post, you should take the time to read it. It’s not that long, but it’s got a lot of information.

Finished reading it?  Good.  Let’s move on to what I think is the money shot of the Response, and it’s adroitly situated right in the opening:

At the 2015 Annual Meeting of the American Political Science Association in San Francisco, DA-RT and JETS were a central topic at several meetings. There were multiple workshops, roundtables, and ad hoc discussions. In addition, transparency was debated at several of the organized section business meetings. As a result, conversations about openness took place on almost every day of the Annual Meeting. As facilitators of a now five-year long dialogue on openness, we were of course delighted that the topic received such a wide airing. (Emphasis added and doubled.)

All that said, the petition asks for more discussions: “discussions” that are neither organized nor even clearly described. Just a vague call for “let’s talk some more at some of those meetings that we’ll all be at in the next year or so.”

But, wait…to stop piling on and return to the facts as stipulated by both the Response and the petition itself: such discussions have been going on for the past 5 years.

Yes, it’s tough.  But the sky isn’t falling.  Look, both sides of the debate are filled by smart and well-meaning scholars.  Is the topic at hand, implementing the right kind(s) of transparency in research, a hard task?  Yes.  …And all involved acknowledge that, even if only because denying it would be ridiculous.

Any Good Transparency Standard Requires and Relies Upon Context. Why is this a hard task? Because there’s no perfect answer. Transparency is a beguiling concept, especially to scholars. To beguile implies at least a strong possibility of deception (which is ironic) and the allure of transparency fits this bill, precisely because “transparency” is like obscenity: you know it when you see it, because when you see it, you can account for the context. If a statue of a nude person is made of marble, it’s totally okay: not obscene. If you withhold data because the IRB (or contract, law) requires you to do so, or because revealing it would put people in harm’s way, that’s okay: still transparent. Just tell the editor(s) and reviewers (and, by extension, readers) why.  This is a collaborative enterprise, this search for knowledge and betterment.  In the end, we’re in this together.

Look, This Ain’t A Democracy.  Finally, and I think most importantly, note that editors can and do impose policies about topics like this. Simply put, the petition is silly because journals and their editors do (and should) have discretion: that’s why we don’t have one big “JOURNAL OF RESEARCH” that everybody publishes in.

More specifically, and as the Response states,

It is important to note that JETS does not create new powers for journal editors. Instead, it asks them to clarify or articulate decisions they are already making or attempting to manage. Journal editors have had, and will continue to have, broad discretion to choose what they will and will not publish and their basis for doing so. (Emphasis added…twice.)

This isn’t about quantitative versus qualitative.  The petition draws a false, and all too commonly drawn, line in the sand.  The Response (and clear thinking) makes clear that neither the issue of transparency nor reproducibility differentially impinges on scholars due to the nature of their data or their method.  Data is data, method is method.  Sure, the implementation details of how best to achieve transparency will vary from one study to another, but this is based on the subject, not the nature of the data or method.  A method is something that can be done…you know…methodically.  That doesn’t require numbers.  Write down your method.  Share your data to the degree that is legally and ethically possible.  Stop being fearful.  If none of that works, ask the editor for an exception.  If all of those steps fail…publish it somewhere else.  You can be like John Fogerty, Trent Reznor, or Prince.

This petition is cynical.  In the end, there’s no fire in that barn: somebody else is just blowing a lot of smoke from behind it. The petition is a manipulative force both playing upon and probably driven by fear.  Hopefully either the Response or maybe even this post makes clear that this fear is unwarranted.

In the end, “haters gonna hate,” and, as a corollary, “editors gonna reject.”

Neither the DA-RT initiative, nor the petition, will change either of those truisms.

With that, I leave you with this.

Super PAC (Bites) Man

Rick Perry’s campaign seems to be a little strapped for cash.  But, his super PACs have plenty of money. What gives?  Is this just bad management, or possibly a systemic regularity tied to the hot mess that is the race for the GOP presidential nomination?

It’s no secret that super PACs have changed the nature of the (early) election cycle.  They are currently taking in over 80% of the campaign contributions.  While this disparity is understandable (super PACs can accept unlimited donations from a single individual, whereas candidates can essentially accept no more than $5400 from any individual and $5000 from any PAC; see here), it is nonetheless striking.

Though the super PACs are well-funded, Perry’s support to date is apparently quite narrow.  Some have interpreted this as a problem with/for Perry, with which I don’t disagree, but I want to forward a different story.  Namely, I think that the narrowness of that support is at least possibly by design.  Not by Perry’s design, but rather by the design of the donors.

Super PACs are easily created and highly flexible.  They work by spending to directly affect elections, and though the ones discussed in the current media cycle are associated “with” a particular candidate, they are not bound to hold true to that association.  More importantly, as Perry’s current situation lays bare, it is actually fairly difficult for a super PAC to step in and bail out, even indirectly, a flagging campaign.  This is because of the 120-day “cooling off period” (see here) that the FEC requires before a former employee of a campaign can be “involved with independent expenditures” (e.g., hired by a super PAC). This arms-length restriction bolsters the super PACs’ independence from the candidate(s) with which they are associated, and solidifies the sway held by a super PAC’s mega-donor.

The proliferation of super PACs is probably contributing to the bulge of GOP candidates, but the real impact of the change is not that big money is “taking over” politics.  Rather, the new wild, wild west of campaign finance has lowered the cost of entry into an all-pay auction of sorts: the evidence is clearly consistent with a story of “lots” of rich people seeking influence over the election, but the more interesting story is how these mega-donors are seeking it.  Mostly, they aren’t bidding for the same candidate’s attention.  Instead, they are jump-starting “new” campaigns.  While this might seem to imply that these mega-donors are trying to buy “their own man” into the White House, I think that it is actually better thought of as a branding strategy.  Right now, Perry’s super PACs are deploying staff and ads in Iowa (see here, for example).  Perry is polling horribly among GOP voters in Iowa (less than 1% in today’s poll—see here).  Why spend the money here?  Why spend the money on Perry at all?  Because if it works even a little, these mega-donors—and their super PAC organizations—will have more leverage bargaining with the real contenders for the nomination.

Spending money on Perry in Iowa has a great “upside” for the super PACs in terms of demonstrating their effectiveness.  Perry’s rise, if it occurs, will

1. Look dramatic—if he polls at 2%, then his support will have doubled.
2. Be nearly solely attributable to the super PACs, because nobody else is fighting for Perry.[1]

Taken together, these make Rick Perry kind of like Atari or Polaroid (brand names that have positive name recognition but are available on the cheap), and he presents a great opportunity for a mega-donor (and his or her campaign staff) to demonstrate their expertise and build their clout.  Running a campaign is hard, and the proliferation of mega-donors lays bare something that political scientists have known for a long time: money is a necessary, but not sufficient, condition for electoral success.  There are “plenty” of billionaires who love attention and care about politics.  But, by definition, there are precious few “top campaign organizations.” Electoral politics is a competitive sport, and what matters is not how much money or talent you have, but how much more you have than your competitors.

If you’re depressed by the money in politics, take heart: there are two awesome parts of this take on the new reality.  First, these mega-donors are (at least partially) throwing their money around fighting one another. Second, the people being played hardest are the megalomaniacal politicians who are spending (a lot of) their own time running essentially “trial balloon campaigns.”  In other words, while super PACs might have at first seemed like a boon for candidates who sought relief from the constant need to raise money in relatively small increments from lots of donors, it seems now that they have the potential to eat exactly those candidates by being

1. infinitely lived,
2. legally untied to any specific campaign, and
3. operationally having a “120-day cooling off period” barrier to insulate themselves.

Super Pac-Man came out 33 years ago, the second sequel to Pac-Man.  As Wikipedia’s description suggests, the link may be deeper than simple nomenclature:

[Super Pac-Man’s] new gameplay mechanics were considered by many to be confusing, and too much of a change from the original two games. In particular, when Pac-Man transforms into Super Pac-Man, he was thought by some to be much more difficult to control.

Life imitates art, perhaps.  With that, I leave you with this.

[1] In addition, Perry’s campaign clearly isn’t going to be credited with any bump in the poll numbers, because it’s broke.  That raises other moral hazard problems (super PAC for candidate X might want to starve candidate X’s campaign) that are interesting, but I’ll leave them to the side for now.

This Thursday, At 10, FOX News Is Correct

FOX News just announced the 10 candidates who will participate in the first primetime Republican presidential primary debate on August 6, 2015. The top 10 were decided by these procedures.  Given that FOX is arguably playing a huge role in the free-for-all-for-the-GOP’s-Soul that is the race for the 2016 GOP presidential nomination, it is important to consider whether, and to what degree, FOX News “got it right” when they chose “10” as the size of the field. Before continuing, kudos to FOX News for playing this difficult game as straight as possible: the procedures are transparent and simple. Though they have ineradicable wiggle room and space for manipulation, I really think this was an example of how to make messy business as clean as possible.  That said, let’s see how messy it turned out…

In order to gauge how important procedures were in this case, I examined the past 10 polls (data available here) to ascertain, in any given poll, who was in the top 10.[1]  The results are pretty striking in their robustness.  In spite of there being 19,448 ways to pick 10 from 17, the top 10 candidates in the final poll were in the top 10 of each of the 10 polls in 96 cases out of a possible 100.  Furthermore, in no poll was more than one of the chosen participants outside of the top 10.  Thus, there were 4 polls in which one of the debate participants was not ranked in the top 10, and 2 of these were the oldest pair in the series.
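The counting in the paragraph above is easy to mimic. This Python sketch checks the "ways to pick 10 from 17" figure and shows how the "96 out of a possible 100" style of tally can be computed. The polls in it are made-up toy data with five hypothetical candidates (A through E) and a cutoff of 3, not the actual polling series, and the helper names are mine.

```python
from math import comb


def top_k(poll, k):
    """Names of the k best-polling candidates in one poll (dict of name -> percent)."""
    return set(sorted(poll, key=poll.get, reverse=True)[:k])


def overlap_tally(polls, k):
    """For each poll, how many of the final poll's top k are also in that poll's top k."""
    final = top_k(polls[-1], k)
    return [len(final & top_k(p, k)) for p in polls]


# Toy data, oldest poll first (hypothetical numbers, for illustration only).
toy_polls = [
    {"A": 20, "B": 15, "C": 5, "D": 9, "E": 2},
    {"A": 22, "B": 14, "C": 10, "D": 8, "E": 3},
    {"A": 25, "B": 12, "C": 11, "D": 7, "E": 4},
]

print(comb(17, 10))                 # number of possible 10-person fields from 17
tally = overlap_tally(toy_polls, 3)
print(sum(tally), "of", 3 * len(toy_polls))
```

With the real polls one would use a cutoff of 10 and the ten most recent polls, giving a total out of 100.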

More telling, perhaps, is the fact that the smallest consistent “non-trivial debate group” (the smallest group of candidates none of whom ever ranked below the size of the group in the 10 polls) is 3: Donny Trump, Jeb Bush, and Scott Walker composed the top 3 of each of the last 10 polls (that’s actually true of the last 15 polls).[2]
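One way to make the "consistent debate group" idea concrete: a group size k is consistent if the same k candidates fill the top k spots in every poll. Here is a small Python sketch of that check. The function name and the two toy polls (hypothetical candidates A through E) are assumptions for illustration, not the real data.

```python
def consistent_group_sizes(polls):
    """Sizes k for which the same k candidates occupy the top k of every poll."""
    n = len(polls[0])
    sizes = []
    for k in range(1, n):  # k = n is always consistent, so it is skipped
        groups = {
            frozenset(sorted(p, key=p.get, reverse=True)[:k])
            for p in polls
        }
        if len(groups) == 1:
            sizes.append(k)
    return sizes


# Toy data: A always leads, B and C trade second and third, D and E trade fourth and fifth.
toy_polls = [
    {"A": 30, "B": 15, "C": 12, "D": 5, "E": 4},
    {"A": 28, "B": 11, "C": 14, "D": 3, "E": 6},
]

print(consistent_group_sizes(toy_polls))
```

In the toy data the size-1 group is the "trivial" one in the sense of footnote [2], and the next size on the list plays the role of the smallest non-trivial debate group.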

While I often like to be contrary in these posts, and I thought I might have an opportunity here, I have to say that, in the end, FOX News got this one right—the only direction to go in terms of tuning the size of the debate would have been down (to either 8 or 3, but I will leave 8 for a different post).  Given that logistics are the only real reason for a media outlet[3] to putatively and presumptively winnow the field of candidates in an election campaign, FOX News was, in my opinion (and possibly by luck), correct in setting the number at 10.

And, with that, I leave you with this.

______

[1] The oldest of these concluded two weeks ago, on July 20th.

[2] The reason I refer to a non-trivial debate group is that Donald Trump composes the smallest consistent debate group: he has held the number 1 spot in the past 16 polls. I will leave to the side the question of whether Trump debating himself would be informative or interesting.  I just don’t know if he is enough of a master debater, though I suspect that he loves to master debates.  Who doesn’t?

[3] Oh, yeah, I forgot to mention that Facebook is involved with organizing the debate. See what I did there?!?

The True Trump Card: You Can’t Buy Credibility

The rise of mega-donors has been an important storyline in the unfolding drama of the 2016 presidential election (for example, see here).  The presence of these donors in the political game (or at least their visibility) is partially the result of the Supreme Court’s decision in Citizens United.  But more interesting is whether the rise of these mega-donors has caused the explosion of seemingly viable (mostly Republican) contenders for the 2016 election.

The argument that Citizens United has caused the explosion in candidates is admittedly appealing.  As Steven Conn describes this argument in the Huffington Post,

Citizens United has created a new dynamic within the Republican Party. Call it the politics of plutocratic patrons, and at the moment it is causing the GOP to eat itself alive.

Continuing, Conn notes that the argument

works something like this: With the caps lifted on spending, any candidate who can find a wealthy patron can make a perfectly credible run at the nomination.

I’ve added the underline because this is where “the math” gets interesting.  If by perfectly credible, one means, “capable of spending lots of money,” then yes, I agree.  That was actually always true: the right of an individual (i.e., a “wealthy patron”) to buy advertisements for any political issue/candidate has never been effectively curtailed.  Rather, the right of individuals to contribute without limit to organizations that can then do so has been, in fits and starts, regulated.

More importantly, though, the fact that anyone can do so now does not mean that wealthy patrons can guarantee that any candidate can make a “perfectly credible run” at the nomination.  As Conn notes, Foster Friess is bankrolling Santorum’s 2016 bid.  …Does anyone think that Rick Santorum is a perfectly credible candidate for the GOP nomination?

Maybe Foster Friess.

No, Rick Santorum is not going to win the GOP nomination.   Neither is Rick Perry. Neither is Chris Christie.  Neither is Carly Fiorina.  Neither is Bobby Jindal.  Of course, I might be wrong on any one of those five.  But I will assuredly be right on at least four.  In fact, if I wanted to type enough, I could be right about no fewer than 15 people who are currently running for the GOP nomination not winning it. (Evidence?  For the latest, see here.)

Simply put, if there are 16 “perfectly viable” candidates for the GOP nomination, then I’m throwing my hat in the ring, too.  WHY NOT?

Look, a wealthy donor can get you in the media.  That is easy, to be honest, if you have the money.  To be a credible candidate, you have to have a chance of winning.  Only one can win.  Lots can spend.  In social science, we often describe this kind of competition as an “all-pay auction.” In an all-pay auction, the highest bidder gets the prize after paying his or her bid.  All of the other bidders pay their bids and don’t get a prize.  It is a stinky, foul game.  (Kind of like running for the presidency.)
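To see why this game is so foul, here is a minimal sketch of the payoffs. The function name, the bids, and the prize value are all made up for illustration, and breaking ties in favor of the first bidder is my own simplifying assumption.

```python
def all_pay_payoffs(bids, prize):
    """Everyone pays their bid; only the single highest bidder gets the prize.

    Ties go to the earliest bidder in the list (an assumption, for simplicity).
    """
    winner = max(range(len(bids)), key=lambda i: bids[i])
    return [(prize if i == winner else 0) - bid for i, bid in enumerate(bids)]


# Hypothetical numbers: three donors chase a prize each values at 100.
payoffs = all_pay_payoffs([60, 50, 10], prize=100)
print(payoffs)   # [40, -50, -10]
```

The third bidder would have done strictly better by bidding nothing, which is the intuition behind the claim that bidders beyond the top two should generally stay out.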

In the mega-donor world, the donors are now the bidders, and we are to believe that they want to create viable candidates through their monies spent.  But this is at odds with two points, one empirical and one theoretical.  The empirical point is that these mega-donors are often successful investors and businesspeople.  The theoretical point is that, when there is a single prize, the all-pay auction should not generally see any positive bid from more than two bidders.[1]

These mega-donors have the real-world experience to understand the theoretical point.  …So what are they thinking?

Aside from misunderstanding the game (which cannot explain all of the 14 or so “out of equilibrium donors” under the simplistic all-pay model), there are two immediate explanations.  The first is vanity: these donors want to play with the “big kids,” have a roll in the hay with the DC cognoscenti, etc.  While I think that’s obviously got some purchase, it is both unsatisfying and seems too simple for billionaires.

Accordingly, the second is that some or all of these donors are playing the long game with the real contenders.  You see, what the all-pay auction analogy to multicandidate elections misses (among assuredly other things) is that the auction is actually for multiple prizes—each person’s vote is (slightly) differential in value to the bidder, because if it is not bought by me then it might go to various different candidates.

To make this concrete, suppose for simplicity that a donor supported some new candidate, “Charlie,” with money spent in a way that bought a bunch of votes exclusively from nativist (anti-immigration-reform) voters.  That would hurt some GOP candidates (such as Donny Trump, who is anti-immigration-reform) more than others (such as Jeb Bush). If I, as a mega-donor, am in favor of Trump not winning the nomination, supporting Charlie might be much more effective in the multicandidate, winner-take-all game of the GOP nomination fight than simply handing that same money to Jeb. (This is because I could take votes away from Trump—for Charlie—that Jeb could not steal away himself, thus causing Jeb to win because Trump loses votes.  This is another instance of the Gibbard-Satterthwaite Theorem.)
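The Charlie story can be sketched with a toy plurality count. The vote totals below are invented purely for illustration (they are not polling numbers), and the helper names are mine.

```python
def plurality_winner(votes):
    """Return the candidate with the most votes."""
    return max(votes, key=votes.get)


def add_spoiler(votes, spoiler, target, stolen):
    """Move `stolen` votes from `target` to a new candidate `spoiler`."""
    after = dict(votes)
    after[target] -= stolen
    after[spoiler] = after.get(spoiler, 0) + stolen
    return after


# Hypothetical electorate of 100 voters, before Charlie enters.
before = {"Trump": 52, "Bush": 48}

# Charlie's donor-funded campaign peels off 10 nativist votes, all from Trump.
after = add_spoiler(before, "Charlie", "Trump", 10)

print(plurality_winner(before))   # Trump
print(plurality_winner(after))    # Bush
```

Bush's own total never moves in this sketch: he wins only because the money was spent on a candidate who could reach voters he could not.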

As Conn describes the picture, I completely agree with the main point: Citizens United might very well have unleashed a beast upon the GOP hierarchy (at least for now), because it is harder for the party establishment to control mega-donors, who can now be solicited for “simple checks” by super-PACs and 527 groups.  But, I disagree that this is because the new system increases the realm of “viable candidates.”  Rather, it simply lowers the prices of diversion, smoke, and mirrors in the nomination game.

Is that good or bad?  I’ll defer for now, but I’m perfectly willing to say that it’s neither.  It just changes the game—in the end, money matters, but votes matter more.  In other words, to paraphrase Mencken, though the ways may vary according to the institutional details, donors and voters will invariably get the government they want, and they’ll get it good and hard.

With that, I leave you with this.

__________

[1] This is a blog post, so I’m being quick about this.  But the basic idea is that the contestants have some common beliefs about their (generally differing) levels of resources (or valuations of winning) and, with few exceptions, the bidder who is capable and willing to pay the third-highest (or lower) price for the prize will not bid because he or she will not willingly sustain a bid that would win in equilibrium.

In Comes Volatility, Nonplussing Both Fairness & Inequality

You know where you are?
You’re down in the jungle baby, you’re gonna die…
In the jungle…welcome to the jungle….
Watch it bring you to your knees, knees…
– Guns N’ Roses, “Welcome to the Jungle”

It’s a jungle out there, and even though you think you’ve made it today, you just wait…poverty is more than likely in your future…BEFORE YOU TURN 65!  Or at least that’s what some would have you believe (for example, here, here, and here).

In a study recently published on PLoS ONE, Mark R. Rank and Thomas A. Hirschl examine how individuals tended to traverse the income hierarchy in the United States between 1968 and 2011. Rank and Hirschl specifically and notably focus on relative income levels, considering in particular the likelihood of an individual falling into relative poverty (defined as being in the bottom 20% of incomes in a given year) or extreme relative poverty (the bottom 10% of incomes in a given year) at any point between the ages of 25 and 60.  To give an idea of what these levels entail in terms of actual incomes, the 20th percentile of incomes in 2011 was $25,368 and the 10th percentile in 2011 was $14,447. (p.4)

A key finding of the study is as follows:

Between the ages of 25 to 60, “61.8 percent of the American population will have experienced a year of poverty” (p.4), and “42.1 percent of the population will have encountered a year in which their household income fell into extreme poverty.” (p.5)

I wanted to make two points about this admirably simple and fascinating study.  The first is that it is unclear what to make of this study with respect to the dynamic determinants of income in the United States.  Specifically, I will argue that the statistics are consistent with a simple (and silly) model of dynamic incomes.  I then consider, with that model as a backdrop, what the findings really say about income inequality in the United States.

A Simple, Silly Dynamic Model of Income.  Suppose that society has 100 people (there’s no need for more people, given our focus on percentiles) and, at the beginning of time, we give everybody a unique ID number between 1 and 100, which we then use as their Base Income, or BI. Then, at the beginning of each year and for each person i, we draw an (independent) random number uniformly distributed between 0 and 1 and multiply it by the Volatility Factor,  which is some positive and fixed number.  This is the Income Fluctuation, or IF, for that person in that year: that person’s income in that year is then

$\text{Income}_i^t = \text{BI}_i + \text{IF}_i^t$.

In this model, each person’s income path is simply a sequence of independent random draws (each at most the Volatility Factor) “above” their Base Income.  If we run this for 35 years, we can then score, for each person i, where their income in each year ranked relative to the other 99 individuals’ incomes in that year.
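The model is simple enough to sketch in a few lines. What follows is a rough Python rendering under my reading of the description above, not the code behind the figure; the function name, the seed, and the use of a draw from [0, 1) are assumptions.

```python
import random


def simulate(volatility, n_people=100, n_years=35, seed=0):
    """Share of people who ever hit the bottom 20%, bottom 10%, and top 1%."""
    rng = random.Random(seed)
    base = list(range(1, n_people + 1))      # Base Income = ID number
    poor_cut = n_people // 5                 # bottom 20%
    extreme_cut = n_people // 10             # bottom 10%
    top_n = max(1, n_people // 100)          # top 1%
    ever_poor = [False] * n_people
    ever_extreme = [False] * n_people
    ever_top = [False] * n_people
    for _ in range(n_years):
        # Income = Base Income + Volatility Factor * Uniform[0, 1) draw.
        incomes = [b + volatility * rng.random() for b in base]
        order = sorted(range(n_people), key=lambda i: incomes[i])  # poorest first
        for i in order[:poor_cut]:
            ever_poor[i] = True
        for i in order[:extreme_cut]:
            ever_extreme[i] = True
        for i in order[-top_n:]:
            ever_top[i] = True
    return (sum(ever_poor) / n_people,
            sum(ever_extreme) / n_people,
            sum(ever_top) / n_people)


for v in (1, 50, 90, 200):
    print(v, simulate(v))
```

With a Volatility Factor of 1 the ranks never change, so the three shares are exactly 20%, 10%, and 1%; larger factors can only push them up from there.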

I simulated this model with a range of Volatility Factors ranging from 1 to 200. [1]  I then plotted out percentages analogous to those reported by Rank and Hirschl for each Volatility Factor, as well as the percentage of people who spent at least one year out of the 35 years in the top 1% (i.e., as the richest person out of the 100).  The results are shown in Figure 1, below.[2]  In the figure, the red solid line graphs the simulated percentage of individuals who experienced at least one year of poverty (out of 35 years total), the blue solid line does the same for extreme poverty, and the green solid line does this for visiting the top 1%.  The dotted lines indicate the empirical estimates from Rank and Hirschl—the poverty line is at 61.8%, the extreme poverty line at 42.1% and the “rich” line at 11%.[3]

Figure 1. Simulation Results

Intuition indicates that each of these percentages should be increasing in the Volatility Factor (referred to equivalently as the Volatility Ratio in the figure)—this is because volatility is independent across time and people in this model: the more volatility, the less one’s Base Income matters in determining one’s relative standing.

What is interesting about Figure 1 is that the simulated Poor and Extremely Poor occurrence percentages intersect Rank and Hirschl’s estimated percentages at almost exactly the same place—a volatility factor around 90 leads to simulated “visits to poverty and extreme poverty” that mimic those found by Rank and Hirschl.  Also interesting is that this volatility factor leads to a slightly higher frequency of visiting the top 1% than Rank and Hirschl found in their study.

Summing that up in a concise but slightly sloppy way: comparing my simple and silly model with real-world data suggests that (relative) income volatility is higher among poorer people than it is among richer people.  … Why does it suggest this, you ask?

Well, in my simple and silly model, and even at a volatility factor as high as 90, the bottom 10% of individuals in terms of Base Income can never enter the top 1%.  At volatility factors greater than 80, however, the top 1% of individuals in Base Income can enter the bottom 20% at some point in their life (though it is really, really rare).  Individuals who are not entering relative poverty at all are disproportionately those with higher Base Incomes (and conversely for those who are not entering the top 1% at all).  Thus, to get the “churn” high enough to pull those individuals “down” into relative poverty, one has to drive the overall volatility of incomes to a level at which “too many” of the individuals with lower Base Incomes are appearing in the rich at some point in their life.  Thus, a simplistic take from the simulations is that (relative) volatility of incomes is around 85-90 for average and poor households, and a little lower for the really rich households. (I will simply note at this point that the federal tax structure differentially privileges income streams typically drawn from pre-existing wealth. See here for a quick read on this.)
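The two thresholds quoted above (90 and 80) follow directly from the bounds on each person's income. Here is a short worked version, assuming the yearly draw $u$ lies in $[0,1)$, so that a person with Base Income $b$ earns at least $b$ and strictly less than $b+V$ when the Volatility Factor is $V$.

```latex
% Person with Base Income b:  b <= Income_b < b + V.
% Person 100 therefore always earns at least 100.
\begin{align*}
\text{(a) Person } b \le 10 \text{ can top the ranking only if }
  & b + V > 100 \;\Rightarrow\; V > 100 - b \ge 90. \\
\text{(b) Person } 100 \text{ can fall into the bottom 20 only if }
  & \text{80 others can exceed } 100; \\
  & \text{the 80 best-placed are } b = 20, \dots, 99,
    \text{ so } 20 + V > 100 \;\Rightarrow\; V > 80.
\end{align*}
```

Case (b) needs all 80 of those people to draw high at the same moment that person 100 draws low, which is why it is really, really rare even when it is possible.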

Stepping back, I think the most interesting aspect of the silly model/simulation exercise—indeed, the reason I wrote this code—is that it demonstrates the difficulty of inferring anything about income inequality or truly interesting issues from the (very good) data that Rank and Hirschl are using.  The reason for this is that the data is simply an outcome.  I discuss below some of the even more interesting aspects of their analysis, which goes beyond the click-bait “you’ll probably be poor sometime in your life” catchline, but it is worth pointing out that this level of their analysis is arguably interesting only because it has to do with incomes, and that might be what makes it so dangerous.  It is unclear (and Rank and Hirschl are admirably noncommittal when it comes to this) what one should, or can, infer from this level of analysis about the nature of the economy, opportunity, inequalities, or so forth.  Simply put, it would seem lots of models would be consistent with these estimates—I came up with a very silly and highly abstract one in about 20 minutes.

Is Randomness Fair? While the model I explored above is not a very compelling one from a verisimilitude perspective, it is a useful benchmark for considering what Rank and Hirschl’s findings say about income inequality in the US.  Setting aside the question of whether (or, rather, for what purposes) “relative poverty” is a useful benchmark, the fact that many people will at some point be relatively poor during their lifetime at first seems disturbing.  But, for someone interested in fairness, it shouldn’t necessarily be.  This is because relative poverty is ineradicable: at any point in time, exactly 20% of people will be “poor” under Rank and Hirschl’s benchmark.[4]  In other words, somebody has to be the poorest person, two people have to compose the set of the poorest two people, and so forth.

Given that somebody has to be relatively poor at any given point in time, it immediately follows that it might be fair for everybody to have to be relatively poor at some point in their life: in simple terms, maybe everybody ought to share the burden of doing poorly for a year. Note that, in my silly model, the distribution of incomes is not completely fair.  Even though shocks to incomes—the Income Fluctuations—are independently and randomly (i.e., fairly) distributed across individuals, the baseline incomes establish a preexisting hierarchy that may or may not be fair.[5] For simplicity, I will simply refer to my model as being “random and pretty fair.”

Of course, under a strong and neutral sense of fairness, this sharing would be truly random and unrelated to (at least immutable, value neutral) characteristics of individuals, such as gender and race.  Note that, in my “random and pretty fair” model, the heterogeneity of Base Incomes implies that the sharing would be truly random or fair only in the limit as the Volatility Factor diverges to $\infty$.

Rank and Hirschl’s analysis probes whether the “sharing” observed in the real world is actually fair in this strong sense and, unsurprisingly, finds that it is not independent:

Those who are younger, nonwhite, female, not married, with 12 years or less of education, and who have a work disability, are significantly more likely to encounter a year of poverty or extreme poverty. (pp.7-8)

This, in my mind, is the more telling takeaway from Rank and Hirschl’s piece—many of the standard determinants of absolute poverty remain significant predictors of relative poverty.  The reason I think this is the more telling takeaway follows on the analysis of my silly model: a high frequency of experiencing relative poverty is not inconsistent with a “pretty fair” model of incomes, but the frequency of experiencing poverty being predicted by factors such as gender and race does raise at least the question of fairness.

With that, and for my best friend, co-conspirator, and partner in crime, I leave you with this.

______________

[1]Note that, when the Volatility Factor is less than or equal to 1, individuals’ ranks are fixed across time: the top earner is always the same, as are the bottom 20%, the bottom 10%, and so forth.  It’s a very boring world.

[2]Also, as always when I do this sort of thing, I am very happy to share the Mathematica code for the simulations if you want to play with them—simply email me. Maybe we can write a real paper together.

[3] The top 1% percentage is taken from this PLoS ONE article by Rank and Hirschl.

[4] I leave aside the knife-edge case of multiple households having the exact same income.

[5] Whether such preexisting distinctions are fair or not is a much deeper issue than I wish to address in this post.  That said, my simple argument here would imply that such distinctions, because they persist, are at least “dynamically unfair.”

The Statistical Realities of Measuring Segregation: It’s Hard Being Both Diverse & Homogeneous

This great post by Nate Silver on fivethirtyeight.com prodded me to think again about how we measure residential segregation.  As I am moving from St. Louis to Chicago,[1] this topic is of great personal interest to me.  Silver’s post names Chicago as the most segregated major city in the United States, according to what one might call a “relative” measure.

Silver rightly argues that diversity and segregation are two related, but distinct, things.  To the point, meaningful segregation requires diversity: if a city has no racial diversity, it is impossible for that city to be (internally) segregated.  However, diversity of a city as a whole does not imply that the smaller parts of the city are each also diverse.  One way to distinguish between city-wide diversity and neighborhood-by-neighborhood diversity is by using diversity indices at the different levels of aggregation.  Silver does this in the following table.

Citywide and Neighborhood Diversity Indices, Fivethirtyeight.com

Citywide Diversity. For any city C, city C’s Citywide Diversity Index (CDI) is measured according to the following formula:

$CDI(C) = 1 - \sum_{g} \left(\frac{pop^C_g}{Pop^C}\right)^2$,

where $pop^C_g$ is the number of people in group g in city C and $Pop^C$ is the total population of city C.  Higher levels of CDI reflect more even populations across the different groups.[2]

Neighborhood Diversity. For any city C, let N(C) denote the set of neighborhoods in city C, let $pop^n_g$ denote the number of people in group g in neighborhood n, and let $Pop^n$ denote the total population in neighborhood n.  Then city C’s Neighborhood Diversity Index (NDI) is measured as follows:

$NDI(C) = 1 - \sum_{g} \left(\frac{pop^C_g}{Pop^C}\right)\sum_{n \in N(C)}\frac{\left(pop^n_g\right)^2}{pop^C_g Pop^n}$.

In a nutshell, the NDI measures how similar the neighborhoods are to each other in terms of their own diversities.  Somewhat ironically, the ideally diverse city is one in which, viewed collectively, the neighborhoods are themselves homogeneous with respect to their composition: they all “look like the city as a whole.”
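To make the two indices concrete, here is a minimal Python sketch of both formulas. Representing a city as a list of neighborhoods, each a list of group counts, is my own choice, as are the function names and the two toy cities.

```python
def cdi(city):
    """Citywide Diversity Index: 1 minus the sum of squared citywide group shares."""
    totals = [sum(col) for col in zip(*city)]   # citywide count per group
    pop = sum(totals)
    return 1 - sum((t / pop) ** 2 for t in totals)


def ndi(city):
    """Neighborhood Diversity Index, following the formula above term by term."""
    totals = [sum(col) for col in zip(*city)]
    pop = sum(totals)
    weighted = 0.0
    for g, group_total in enumerate(totals):
        if group_total == 0:
            continue                            # an absent group contributes nothing
        inner = sum(nb[g] ** 2 / (group_total * sum(nb))
                    for nb in city if sum(nb) > 0)
        weighted += (group_total / pop) * inner
    return 1 - weighted


# Two hypothetical cities with identical citywide makeup (two groups of 10).
segregated = [[10, 0], [0, 10]]   # each neighborhood is a single group
integrated = [[5, 5], [5, 5]]     # each neighborhood mirrors the city

print(cdi(segregated), ndi(segregated))   # 0.5 0.0
print(cdi(integrated), ndi(integrated))   # 0.5 0.5
```

Both cities share the same CDI, but only the integrated one has an NDI that reaches it, which is the sense in which the neighborhoods “look like the city as a whole.”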

(This turns out to be one of the central challenges to comparing two or more cities with different CDIs on the basis of the NDIs.  More on that below.)

A Relative Measure of Segregation. In order to account for both measures of diversity, Silver constructs the “Integration/Segregation Index,” or ISI.  The ISI measures how much more (e.g., Irvine) or less (e.g., Chicago) integrated the city is at the neighborhood level relative to how integrated it “should” be, given its citywide diversity. This makes more sense with the following figure from Silver’s post.

Neighborhood Diversity Indices vs. Citywide Diversity Indices, Fivethirtyeight.com

Silver’s analysis basically uses the 100 largest cities in the US to establish an “expected” neighborhood diversity index based on citywide diversity index.[3] Then, Silver’s ISI is (I think) the size of the city’s residual in this analysis—this is the difference between the city’s neighborhood diversity index and the city’s “predicted” or “expected” neighborhood diversity index, given the city’s citywide diversity index.  Thus, Chicago is the most segregated under this measure because it “falls the farthest below the red line” in the figure above.

This is all well and good, though one could easily argue that the proper normalization of this measure would account for the city’s citywide diversity index, because the neighborhood diversity index is bounded between 0 and the citywide diversity index.  Thus, Baton Rouge or Baltimore might be performing even worse than Chicago, given their lower baseline, or Lincoln might be performing even better than Irvine, for the same reason.[4]

In any event my attention was drawn to this statement in Silver’s post:

But here’s the awful thing about that red line. It grades cities on a curve. It does so because there aren’t a lot of American cities that meet the ideal of being both diverse and integrated. There are more Baltimores than Sacramentos.

I assume that Silver is using the term “curve” in the colloquial fashion, as opposed to referring to the nonlinear regression model: Silver is stating that, because the ISI is measured relative to the expected value of the NDI calculated from real (and segregated) cities, the baseline already absorbs the fact that cities with high CDI scores tend to underperform relative to cities with lower CDI scores.

As alluded to above, this result could be at least partly artifactual because cities with higher CDIs have more absolute “room” to underperform.  More interesting, however, is to first consider what Silver is holding forth as “absolute performance.”  The 45 degree line in the figure above represents the “ideal” NDI to CDI relationship: any city falling on this line (as Lincoln and Laredo essentially do) is as diverse at the neighborhood level as it can be, given its CDI.  Note that any city with a CDI equal to zero (i.e., a city composed entirely of only one group) will hit this target with certainty.

That got me to thinking: cities with higher CDIs might have a “harder time” performing at this theoretical maximum.  The statistical logic behind this can be sketched out using an analogy with flipping a possibly biased coin and asking how likely it is that a given set of, say, 6 successive flips will be representative of the coin’s bias.  If the coin always comes up heads, then of course every set of 6 successive flips will contain 6 heads, but if the coin is fair, then a set of 6 successive flips will contain exactly 3 heads and 3 tails only

$\binom{6}{3} \left(\frac{1}{2}\right)^6 =\frac{5}{16}$,

or 31.25% of the time.  Cities with higher CDI scores are like “fairer” coins from a statistical standpoint: they have a harder target to hit in terms of what one might call “local representativeness.”
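The coin arithmetic is easy to verify with exact fractions. This is just a check of the binomial formula above; the function name is mine.

```python
from fractions import Fraction
from math import comb


def prob_exactly_representative(n_flips, n_heads, p_heads):
    """Binomial probability of seeing exactly n_heads heads in n_flips flips."""
    return (comb(n_flips, n_heads)
            * p_heads ** n_heads
            * (1 - p_heads) ** (n_flips - n_heads))


fair = prob_exactly_representative(6, 3, Fraction(1, 2))
always_heads = prob_exactly_representative(6, 6, Fraction(1))

print(fair, float(fair))   # 5/16 0.3125
print(always_heads)        # 1
```

The one-sided coin hits its “representative” sample every time, while the fair coin manages it less than a third of the time.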

To test my intuition, I coded up a simple simulation. The simulation draws 100 cities, each containing a set of neighborhoods, each of which has a randomly determined number of people in each of five categories, or “groups.”  I then calculated the CDI and NDI for each of these fake cities, plotted the NDI versus CDI as in Silver’s figure above, and also calculated a predicted NDI based on a generalized linear model including both $CDI$ and $CDI^2$.  The result of one run is pictured below.

Simulated NDI vs. CDI

What is important about the figure—which qualitatively mirrors Silver’s figure—is that it is based on an assumption of unbiased behavior—it is generated as if people located themselves completely randomly.[5] Put another way, the simulations assume that individuals cannot perceive race.
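A stripped-down version of this kind of race-blind simulation might look like the following. To be clear, this is a sketch and not the simulation behind the figure: the neighborhood count, the neighborhood size, and the way each city's group shares are drawn are all assumptions of mine, and I skip the regression step entirely.

```python
import random


def diversity_indices(city):
    """Return (CDI, NDI) for a city given as a list of neighborhood group counts."""
    totals = [sum(col) for col in zip(*city)]
    pop = sum(totals)
    cdi = 1 - sum((t / pop) ** 2 for t in totals)
    # NDI as the population-weighted average of each neighborhood's own
    # diversity, which is algebraically the same as the formula in the text.
    ndi = sum((sum(nb) / pop) * (1 - sum((c / sum(nb)) ** 2 for c in nb))
              for nb in city if sum(nb) > 0)
    return cdi, ndi


def random_city(rng, n_groups=5, n_neighborhoods=20, people_per_neighborhood=50):
    """Hypothetical city: residents land in neighborhoods with no regard to group."""
    # Assumed: each city gets its own random group shares, so CDIs vary.
    weights = [rng.random() ** 3 for _ in range(n_groups)]
    city = []
    for _ in range(n_neighborhoods):
        counts = [0] * n_groups
        for g in rng.choices(range(n_groups), weights=weights,
                             k=people_per_neighborhood):
            counts[g] += 1
        city.append(counts)
    return city


rng = random.Random(0)
results = [diversity_indices(random_city(rng)) for _ in range(100)]

# Gap between the 45 degree "ideal" and what blind random sorting delivers.
gaps = [cdi - ndi for cdi, ndi in results]
print(min(gaps), max(gaps))
```

Every gap is non-negative: even with residents who ignore group entirely, sampling noise alone keeps the neighborhoods from matching the city exactly, so no simulated city sits above the 45 degree line.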

So what?  Well, this implies two points, in my mind.

1. The “curve” described by Silver is not necessarily emerging because bigger and more diverse cities are somehow “more accepting” of local segregation than are less diverse cities.  Rather, from a purely statistical standpoint, diverse cities are being scored according to a tougher test than are less diverse cities.
2. Silver’s ISI index is better than it might appear at first, because I think the “red line” is actually, from a statistical standpoint, a better baseline/normative expectation than the 45 degree line.

The final point I want to make that is not addressed by my own analysis is that Silver’s measure takes as given (or, perhaps, leaves essentially unjudged) a city’s CDI.  Thus, to look better on the ISI, a city should limit its citywide diversity, which is of course ironic.

With that, I leave you with this.

_________________

[1] And prior to moving to St Louis, I lived in Boston, Pittsburgh, Los Angeles, Chapel Hill, NC, London, Durham, NC, and Greensboro, NC.

[2] The maximum level of CDI—the “most diverse score” possible—is $1-\frac{1}{\text{Number of Groups}}$.  Thus, this measure is problematic to use when comparing cities that have measured “groups” in different ways.

[3] The details are a bit murky (and that’s perfectly okay, given that it’s a blog post), but are alluded to here.

[4] For example, one could use the following quick and dirty normalization:

$\frac{ISI(C)}{CDI(C)}$.

[5] An implementation detail, which did not appear to be too important in my trials, is that the five groups have expected sizes following the proportions of White, Black, Hispanic, Native American, and Asian American census groups in the United States, respectively.  This leads to the spread of CDI estimates looking very similar to those in Silver’s analysis, with the predictable exception of some extreme outliers like Sacramento and Laredo.