From the Path: The Escape Hatch That Closes

The talk I’d come to Montreal for happened in a windowless hotel ballroom, and it opened with a small technical glitch. Behind Maggie the title slide read Revisiting Fairness Impossibility with Endogenous Behavior, and a live captioning system was transcribing the room along the bottom of the screen. As the chair read the title aloud, the caption came back with something close to a visiting fairness, in Endogenous. I wrote it down — not out of diligence, but because a machine had just handed me the post: an algorithm sorting speech into text, botching the name of a talk about what algorithms get wrong when they sort.

Maggie Penn at a hotel-ballroom podium at left, with the projected title slide 'Revisiting Fairness Impossibility with Endogenous Behavior' dominating the frame and a live auto-caption along the bottom.
Maggie, and the title slide the AI transcriber couldn’t quite make out.

Before the theorem, one artifact from the room, along with an apology for its condition. I couldn’t read the Wi-Fi sign from my seat, so I photographed it and zoomed in, which is why the picture below is soft: it is exactly as sharp as my eyes and the phone could manage between them, and no sharper. The password, once it resolved, was Montreal — in a room full of people who study how strangers coordinate, the one word everyone in the building would guess first. Squint and there’s a lesson in that. Don’t, and it’s a blurry photo of a password.

A conference Wi-Fi sign, photographed from a distance and slightly out of focus, showing the network FACCT2026 and the password Montreal.
Network: FACCT2026. Password: the answer nobody had to be told.

To the theorem. The result Maggie was there to complicate is one the regulars here have met before, so I’ll be quick and send newcomers to the fuller telling. A classifier sorting people into two bins — approve or deny, high-risk or low, audit or wave through — can be asked to be fair in more than one way, and two of the most natural ways collide. Ask that people who behave alike face the same odds of a favorable call regardless of group, and you’ve asked for error-rate balance. Ask that a given call mean the same thing no matter who receives it, and you’ve asked for predictive parity. Chouldechova, and independently Kleinberg, Mullainathan, and Raghavan, showed you generally can’t have both — not unless the groups share the same base rate of whatever you’re predicting, or your prediction is perfect. Barring those two cases, choosing one fairness means giving up the other.

A projected conference slide showing binary decisions and the two fairness criteria, error-rate balance and predictive parity, with a live auto-caption transcribing the speaker along the bottom.
The two criteria on the screen, and the machine still transcribing underneath.

Notice what every line of that takes for granted: that the base rates are simply out there, like rainfall, properties of a population the classifier reads and reacts to. The talk — Maggie‘s, and the paper behind it, ours — drops that assumption, and dropping it is the whole point. Picture two towns with interchangeable drivers. One writes tickets for revenue and tickets everyone it can; the other writes tickets for safety and tickets only speeders. Hold each driver’s temperament fixed and move them between towns, and they drive differently, because in the revenue town easing off the gas buys you nothing. The base rate of speeding isn’t a fact about the drivers. It’s a fact about the drivers and the algorithm that judges them, together, in equilibrium.

(Ed here. Four sentences ago it was “Maggie’s talk”; now it’s “the paper behind it, ours”; and by the end of the paragraph you’ll have narrated your own coauthored theorem from the third row like a man who wandered in off the street. I see the footwork.)

Arrow found the pitiless version of this in 1973. Picture an employer convinced that every woman is unqualified, hiring on that conviction, in a world where becoming qualified costs real effort. A woman facing him has no reason to pay the cost, since qualifying won’t get her hired — so she doesn’t, and his conviction comes true. He classifies women as unqualified and he is right in every case, a perfect classifier in equilibrium, flawlessly accurate and plainly unjust at once, because he manufactured the accuracy rather than finding it. The classifier didn’t discover a population. It produced one.

Once base rates are things you can move, the impossibility seems to offer an exit. If unequal base rates were the whole problem, equalize them: set the stakes of classification — the fine, the sentence, the loan terms — so the groups end up with the same base rate, and error-rate balance and predictive parity can hold together. Problem solved.

It isn’t solved. It’s moved — and if you read the first half of this trip, you already know the shape of what’s coming. Buying equal base rates generally means attaching different consequences to identical decisions: the same “approve,” the same “audit,” landing heavier on one group than on another. The inequality hasn’t left. It’s moved out of the distribution of outcomes and into the severity of what those outcomes cost, where it’s harder to see and no smaller. Push far enough and the stakes needed to level the base rates start to punish the very compliance you meant to reward, flattening the gradient until doing the right thing stops paying.

Then the floor moves. The base rates you leveled were leveled by the stakes you chose; change the stakes and the base rates change with them; so the point where both criteria hold isn’t a place you can walk to but one that shifts as you approach, because approaching is what moves it. The static picture can’t represent this — it’s a photograph of something that exists only while it’s running. That was the result on the screen behind Maggie: the escape hatch is real, and it can close behind you. The IRS-ICE episode I wrote about in the winter was one live instance — a classifier that changed the filing behavior it fed on, and so corroded the data it needed in order to be fair.

Which is where the two halves of the trip meet. In the old city I watched a referendum that was strategy-proof at the ballot precisely because the strategy had all been spent upstream, in the wording of the question. In the ballroom I watched a classifier that looks fair at the level of its error rates precisely because the inequality had been spent upstream, in the stakes it sets and the behavior it shapes. The setter picks the question the country answers; the designer picks the incentives the population meets. In both rooms the visible instrument — the clean yes-or-no, the accurate score — looks innocent for the same reason: someone moved the action to a place you weren’t looking, and the impossibility went with it. It always does.

A large steel ring sculpture on a Montreal plaza framing Mount Royal in the distance between office towers.
A frame decides what you see through it. So does an objective function.

Maggie’s reply — the constructive half — is to stop asking classifiers to match a statistic and start asking them to offer different groups the same behavioral bargain. Aligned incentives in place of equalized error rates: fairness located in what the algorithm invites people to do, not in the bookkeeping of what it got right. It’s an early move in a longer argument, and the book she and I are finishing for Russell Sage makes the full case. But watching her set it down in front of a room that builds these systems for a living, I got to see a hard idea land, and to sit there knowing my name is on the paper and none of the nerve it took to give the talk was mine.

With that, I leave you with this.

Notes

The talk was “Revisiting Fairness Impossibility with Endogenous Behavior,” presented at ACM FAccT 2026 in Montreal. The static impossibility — that error-rate balance and predictive parity cannot generally hold together across groups with unequal base rates, absent perfect prediction — is due to Alexandra Chouldechova (2017) and, independently, Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan (2016). The equilibrium employer who is accurate and unjust at once is Kenneth Arrow’s, from “The Theory of Discrimination” (1973).

The two-town speeding example and the model of an algorithm designer whose objectives shape equilibrium behavior are from Penn and Patty, “Classification Algorithms and Social Outcomes,” American Journal of Political Science (2025). The result that the equal-base-rates escape can close once base rates are themselves endogenous to the classifier is from Penn and Patty, “Algorithmic Fairness with Feedback,” arXiv:2312.03155 (2023); the fuller treatment, along with the aligned-incentives notion of fairness, is the subject of the Russell Sage Foundation book in progress.

Hit Me...