This blog does not usually do “journal club.” (Ed: huh? …who “does journal club?” Is it voluntary?) It’s not that I have anything against academic journals — I publish in them, edit one, and spend a disproportionate fraction of my waking life reading them — but their timelines and their intended audience are different enough from this blog’s that treating them as news hooks feels strained. A paper published in January that cites papers from 2016 and 2017 is not, by the usual definition, breaking.
But a paper published in January in Frontiers in Artificial Intelligence has been sitting in a tab I kept meaning to close since it appeared, and something happened this week that made me want to address it directly.1 The paper argues that current and proposed legislation mandating algorithmic fairness is built on an incoherent foundation — and it is right about this. It explains why the foundation is incoherent — and it is right about this too, as far as it goes. And then it proposes a fix. This is where things get interesting, because the fix does not fix the problem. It relocates it. And the reason the fix fails is a reason the paper does not see.
What the Paper Gets Right
The setup will be familiar to anyone who read yesterday’s post on classifiers, but let me restate it in the regulatory context.2 Over the past decade, jurisdictions at every level — the European Union, several U.S. states, and a handful of proposed federal bills — have introduced laws requiring that AI systems making consequential decisions about people be fair. These laws are admirably motivated. Their architects are right that algorithmic systems can encode and amplify discrimination, that the affected populations deserve protection, and that the existing civil rights framework was not designed with machine learning classifiers in mind.
The problem, and the Frontiers paper identifies it clearly, is that “fair” is not a single thing. The scholarly literature on algorithmic fairness has produced dozens of formal definitions, and the ones that actually appear in litigation and regulation cluster around a handful of core concepts: demographic parity (the classifier makes positive decisions at the same rate across groups), predictive parity (a positive prediction is equally likely to be correct in every group), and error rate balance, which the literature also calls equalized odds (false positive rates and false negative rates are each equal across groups). These are all reasonable things to want. They are also, as the paper documents at length, mutually incompatible.
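To make the definitions concrete, here is a minimal sketch that computes the quantities each one compares, starting from a group's confusion matrix. The `Confusion` class, the `fairness_metrics` helper, and the counts for the two groups are all invented for illustration; none of them come from the Frontiers paper or from any real dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one group (hypothetical numbers)."""
    tp: int  # flagged, and the outcome occurred
    fp: int  # flagged, but the outcome did not occur
    fn: int  # not flagged, but the outcome occurred
    tn: int  # not flagged, and the outcome did not occur


def fairness_metrics(c: Confusion) -> dict[str, float]:
    """Return the per-group quantities the fairness definitions compare."""
    n = c.tp + c.fp + c.fn + c.tn
    return {
        "base_rate": (c.tp + c.fn) / n,       # prevalence of the outcome
        "selection_rate": (c.tp + c.fp) / n,  # demographic parity compares this
        "ppv": c.tp / (c.tp + c.fp),          # predictive parity compares this
        "fpr": c.fp / (c.fp + c.tn),          # error rate balance compares this...
        "fnr": c.fn / (c.fn + c.tp),          # ...together with this
    }


# Two invented groups of 100 people each, with different base rates.
group_a = Confusion(tp=40, fp=10, fn=10, tn=40)
group_b = Confusion(tp=16, fp=4, fn=4, tn=76)

for name, group in (("A", group_a), ("B", group_b)):
    print(name, {k: round(v, 3) for k, v in fairness_metrics(group).items()})
```

In this made-up example the two groups share the same positive predictive value and the same false negative rate, yet their false positive rates and selection rates differ. A classifier can look fair under one definition and unfair under another on the very same data.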
The formal result here is due to Chouldechova (2017) and, independently, to Kleinberg, Mullainathan, and Raghavan (2016): when two groups have different base rates of the outcome being predicted (when, say, two demographic groups are convicted of crimes at different rates, for whatever combination of social, historical, and structural reasons), no classifier can simultaneously equalize both false positive and false negative rates across groups and maintain equal predictive value across groups, except in the degenerate cases of perfect prediction or equal base rates. You can satisfy one condition; you cannot satisfy both. You can move the inequality around. You cannot make it disappear.
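The arithmetic behind the result is short enough to check by hand. For any classifier, the confusion matrix forces the identity FPR = p/(1 - p) × (1 - PPV)/PPV × (1 - FNR), where p is the group's base rate. The sketch below just evaluates that identity; the function name `implied_fpr` and the example numbers are my own illustrative choices.

```python
def implied_fpr(base_rate: float, ppv: float, fnr: float) -> float:
    """False positive rate forced by the confusion-matrix identity

        FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)

    A result above 1 means the requested combination is infeasible.
    """
    if not (0 < base_rate < 1 and 0 < ppv <= 1 and 0 <= fnr <= 1):
        raise ValueError("need 0 < base_rate < 1, 0 < ppv <= 1, 0 <= fnr <= 1")
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Hold predictive value and the false negative rate equal across groups...
shared_ppv, shared_fnr = 0.8, 0.2

# ...and let only the (illustrative) base rates differ.
fpr_a = implied_fpr(0.5, shared_ppv, shared_fnr)
fpr_b = implied_fpr(0.2, shared_ppv, shared_fnr)

print(f"group A false positive rate: {fpr_a:.3f}")
print(f"group B false positive rate: {fpr_b:.3f}")
```

Once predictive value and the false negative rate are pinned to the same values in both groups, the false positive rate is no longer a free choice. It is determined by the base rate, so different base rates force different false positive rates unless prediction is perfect (PPV = 1).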
The Frontiers paper’s contribution is to bring this result into contact with the legislative landscape and ask: if the definitions are incompatible, and the laws don’t specify which one they mean, what exactly is the law requiring? The answer, uncomfortably, is something like: whatever the manufacturer and the plaintiff’s expert happen to argue in court, which is not a basis for consistent regulatory guidance. This is a genuinely important observation, and it is correct.
Prior Art
It would be ungracious to proceed without noting that the impossibility the Frontiers paper discusses has a longer history than the paper’s framing suggests — one that Maggie Penn and I have had a hand in, and that this blog, as of yesterday, has begun covering in earnest.
In 2014, Maggie and I published Social Choice and Legitimacy: The Possibilities of Impossibility (Cambridge University Press). The book’s central argument is that the classic impossibility results of Arrow, Gibbard, and Satterthwaite are not curiosities about elections, not proofs that democratic governance is futile, and — I say this with some awareness of how it sounds — not the dry historical artifacts that their usual framing suggests. (Ed: “A social choice book with ‘legitimacy’ in the title. I’m sure it flew off the shelves.” — It sold how it sold. The point is that it was right.) The results are structural facts about any attempt to aggregate competing criteria of collective judgment into a coherent standard — and they apply wherever that attempt is made: to voting, yes, but equally to measurement, to data analysis, to network centrality, to inflation indices, and, as it turns out, to the problem of specifying what algorithmic fairness means and who gets to decide.
When Maggie and I wrote that book, I will confess that I thought of it as primarily engaged with classical problems in democratic theory — the debates that had been running since Condorcet, Arrow, and Sen, updated for a more formal treatment. Those debates are important. They are also, or so I imagined, somewhat bounded: the world had largely worked out its institutional responses to the aggregation problem, imperfect as those responses are, and the book’s contribution was to show why the imperfections were structural rather than incidental. We were explaining the furniture of democratic life, not predicting new rooms.
The rooms keep arriving. The algorithmic fairness debate is, at its mathematical core, the same debate — conducted by people who, for the most part, do not know it. The question of which fairness definition to mandate is a social choice problem in exactly the sense the book develops: there are multiple criteria, each internally coherent, each with a constituency, and the task of choosing among them requires an aggregation procedure that is itself subject to the conditions Arrow proved no procedure can jointly satisfy. The book’s argument is not a contribution to the algorithmic fairness literature specifically, because the algorithmic fairness literature did not exist in 2014. It is a contribution to the underlying problem, which long predates classifiers and will long outlast them.3
Our work since then has extended this framework directly to classification algorithms — connecting the statistical definitions of fairness to the welfare-based notions people actually invoke when they use the word,4 and examining what happens to the impossibility when the classified population can respond to the classifier, which in the real world it always can. The Frontiers paper cites the Chouldechova and Kleinberg results that anchor this literature. It does not engage with ours. I note this not to be aggrieved — regulatory scholarship and formal theory have different citation orbits, and the paper is doing a different job — but because the part of our work the paper doesn’t engage with is exactly the part that explains why the paper’s proposed fix doesn’t work. That is the structural observation I want to make, and it runs through the rest of this post.
The Proposed Fix, and Why It Sounds Reasonable
The paper’s solution to the definitional incoherence problem is: pick a definition. Or rather, since a single mandatory definition may not suit all applications, develop a regulatory process — something like the FDA’s device approval process — through which jurisdictions coordinate on definitions appropriate to specific contexts. Employment discrimination calls for one standard, criminal risk assessment for another, mortgage lending for a third. The definitions are context-specific, established through deliberation with stakeholder input, and applied consistently thereafter.
This sounds like good regulatory design. And here I want to be precise about why it sounds that way — because the reason is not merely that it is sensible, intuitive, and modeled on a regulatory institution that works reasonably well. The reason is that this proposal follows, almost exactly, the template that Social Choice and Legitimacy identifies as the standard institutional response to an aggregation impossibility. Pure aggregation — just average the criteria, just let the algorithm pick — runs into the impossibility directly. The institutional response, the one democratic societies have developed over centuries of working around Arrow’s result, is to layer on top of pure aggregation a set of procedures, deliberative processes, and accountability mechanisms that allow decisions to be made and legitimated even when no aggregation rule can produce a uniquely correct answer. The FDA process is a canonical example of this template: you cannot derive the correct definition of “safe and effective” from the data alone, so you build an institution that deliberates, applies consistent procedures, and is held accountable for its outputs.
This is genuinely better than nothing. The book argues — and I believe — that procedures and accountability are not just workarounds for the impossibility but the substance of legitimate governance. The Frontiers paper’s proposal is, in this sense, the right kind of move. The question is not whether the move is reasonable. It is whether it solves the problem, or whether — as the book also argues — the impossibility relocates into the new institution rather than disappearing inside it.
The Problem Moves
The Chouldechova-Kleinberg impossibility is a static result. It says: given a fixed population with fixed base rates, a given classifier cannot simultaneously satisfy competing fairness conditions. The population is fixed, the base rates are fixed, and the impossibility is a fact about the mathematical structure of the prediction problem. This is genuine and important. It is also a result about a world that doesn’t exist.
The actual world contains people who know they are being classified and can change their behavior in response. This isn’t a contingent feature of some applications — it is a structural property of any classification system with real stakes. The people being assessed for recidivism risk know that their behavior will be scored. The job applicants whose résumés pass through a screening algorithm know that their credentials will be evaluated. The immigrants deciding whether to file taxes know that filing generates a record that may be used against them. In every consequential classification context — which is precisely the context the paper says it wants to regulate — the classified population is a strategic actor, not a fixed distribution of outcomes.
What happens to the impossibility when the population can respond? Maggie Penn and I show, in work that has been circulating since December 2023,4 that it survives — and in a specific sense gets worse. The static result says you cannot simultaneously satisfy condition A and condition B for a fixed population. The dynamic result says: even if you could satisfy both conditions at some initial moment, the classified population’s response to your classifier will shift the base rates, moving you off the point that satisfied both conditions, and the new point you need to reach to satisfy both conditions will itself be moved by the population’s response to your adjustment. You are chasing a target that your pursuit is moving.
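A toy simulation can make the moving target visible. To be clear about what is assumed: this is not the model from the paper cited above. It is a deliberately crude sketch in which a designer holds predictive value and the false negative rate equal across groups, and each group's base rate drifts toward a hypothetical level that rises with its false positive rate (the intuition being that when people are flagged regardless of what they do, compliance pays less). The response rule and the parameters `sensitivity` and `speed` are invented for illustration.

```python
def implied_fpr(base_rate: float, ppv: float, fnr: float) -> float:
    """FPR forced by the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR)."""
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


def simulate(baselines, ppv=0.8, fnr=0.2, sensitivity=0.25, speed=0.5, rounds=20):
    """Iterate a hypothetical feedback loop between classifier and population.

    Each round records (base rates, implied FPRs), then moves every group's
    base rate part of the way toward baseline + sensitivity * FPR.
    The response rule is an assumption made for this sketch only.
    """
    rates = dict(baselines)
    history = []
    for _ in range(rounds):
        fprs = {g: implied_fpr(p, ppv, fnr) for g, p in rates.items()}
        history.append((dict(rates), fprs))
        rates = {
            g: (1 - speed) * rates[g]
            + speed * (baselines[g] + sensitivity * fprs[g])
            for g in rates
        }
    return history


def fpr_gap(snapshot) -> float:
    """Difference in false positive rates between groups 'a' and 'b'."""
    _, fprs = snapshot
    return fprs["a"] - fprs["b"]


history = simulate({"a": 0.5, "b": 0.2})
print(f"FPR gap at the first audit: {fpr_gap(history[0]):.3f}")
print(f"FPR gap at the last audit:  {fpr_gap(history[-1]):.3f}")
```

Under these assumptions the gap the designer was asked to manage is larger at the last audit than at the first, even though the designer never stopped satisfying the mandated criteria. A different response rule would produce different numbers; the point of the sketch is only that the base rates, and therefore the trade-off, are outputs of the system rather than fixed inputs.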
The IRS-ICE data-sharing case that this blog covered in February is a real-world instance of exactly this dynamic.5 The decision to make tax data available for immigration enforcement didn’t just raise a static question about whether the resulting classifier was racially fair by some definition. It changed the filing behavior of the classified population, which changed the data the classifier operated on, which changed what the classifier could possibly be fair toward. The court records document the chilling effect. The base rates are different now than they were before the agreement. A regulatory standard set before the agreement would be calibrated to the wrong population, not because it was poorly designed, but because the classifier changed the population it was designed to govern.
This is what it means for the impossibility to move. The paper’s fix — specify a definition, apply it consistently — assumes that specifying a definition gives you a stable target. But the target is only stable if the classified population doesn’t respond to the classifier. (Ed: “Which is to say: the target is stable in the laboratory and nowhere else.” — Correct.) In any system where the stakes are high enough to generate strategic behavior, the static impossibility understates the problem by assuming away the feature that makes the problem hard.
The Designer’s Objectives, and the Arrow Recursion
There is a second dimension the static framing misses, and it matters more for regulatory design than it first appears. The Chouldechova-Kleinberg result treats the classifier as a mathematical object — a function from inputs to outputs — and asks whether that function can satisfy multiple fairness criteria simultaneously. The paper inherits this framing. But in the real world, a classifier is not just a function. It is a policy instrument deployed by a designer with objectives, and those objectives shape the classifier’s equilibrium effects in ways the fairness definitions alone cannot capture.
Maggie and I make this precise in our work on classification algorithms and social outcomes:6 two designers with identical data, identical technology, and formally identical classifiers — in the sense that both satisfy the same static fairness criterion — can produce radically different outcomes for the classified population depending on what they are trying to achieve. A designer optimizing for accuracy produces a classifier that, at equilibrium, pushes some groups toward compliance and others away, exacerbating behavioral differences even when the data quality is equal across groups. A designer optimizing for compliance produces a classifier that, at equilibrium, pushes all groups in the same direction. The fairness criterion is the same in both cases. The equilibrium effects are not. Mandating a definition of fairness specifies what the output should look like. It does not specify what the designer is trying to do. And the designer’s objectives are what the classified population actually lives with.
And then there is the Arrow problem — which is not a separate observation but the central argument of Social Choice and Legitimacy, applied now one level up. The paper’s proposal to coordinate among definitions through a deliberative regulatory process is itself a social choice problem. How do you aggregate the preferences of competing stakeholders over which fairness definition to mandate? Each stakeholder group has coherent preferences. Each definition has a constituency. The regulatory process that selects among them is an aggregation procedure, and it is subject to exactly the conditions Arrow showed no aggregation procedure can jointly satisfy. You have not resolved the incommensurability of fairness definitions by creating a process to choose among them. You have created a new social choice problem with the same structure as the old one, now operating at the level of institutional design rather than algorithmic design. The impossibility relocates. It does not evaporate. This is what this blog has called conservation of impossibility — and it is, at bottom, the thesis of a book published in 2014 applied to a domain that did not exist when the book was written.7
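The simplest concrete instance of the recursion is Condorcet's paradox, the special case that Arrow's theorem generalizes. The sketch below takes three hypothetical stakeholder groups, gives each a coherent ranking over three fairness definitions, and aggregates by pairwise majority vote. The stakeholder labels and rankings are invented, and pairwise majority is only one of many possible aggregation procedures, so this illustrates the kind of incoherence at issue rather than proving the general result.

```python
from itertools import combinations

DEFINITIONS = ["demographic parity", "predictive parity", "equalized odds"]

# Hypothetical rankings, most preferred first. Each one is internally coherent.
rankings = {
    "stakeholder_1": ["demographic parity", "predictive parity", "equalized odds"],
    "stakeholder_2": ["predictive parity", "equalized odds", "demographic parity"],
    "stakeholder_3": ["equalized odds", "demographic parity", "predictive parity"],
}


def majority_prefers(rankings, x, y) -> bool:
    """True if a strict majority of stakeholders rank x above y."""
    votes_for_x = sum(r.index(x) < r.index(y) for r in rankings.values())
    return votes_for_x > len(rankings) / 2


def condorcet_winner(rankings, options):
    """Return the option that beats every other head-to-head, or None."""
    for x in options:
        if all(majority_prefers(rankings, x, y) for y in options if y != x):
            return x
    return None


for x, y in combinations(DEFINITIONS, 2):
    winner = x if majority_prefers(rankings, x, y) else y
    print(f"{x} vs {y}: majority picks {winner}")

print("Condorcet winner:", condorcet_winner(rankings, DEFINITIONS))
```

With these rankings, every definition loses some head-to-head contest, so the majority preference runs in a circle and no definition can claim to be the collective choice. Whichever definition the process ends up mandating, the outcome will reflect the agenda or the procedure at least as much as the stakeholders' preferences.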
None of this is an argument against the Frontiers paper’s regulatory agenda. Mandating that legislation specify which fairness definition it means is genuinely better than the status quo. The chaos the paper documents is real, and reducing it would help. My argument is narrower: the problem the paper has correctly diagnosed has a harder structure than the paper’s framing reveals, and the harder structure doesn’t admit the kind of fix the paper envisions. The static impossibility is the beginning of the story. The dynamic impossibility, the designer objectives problem, and the Arrow recursion are the next three chapters — and they’re the ones that make the regulatory question genuinely hard rather than merely complicated.
Congratulations on the theorem. The theorem has sequels.
With that, I leave you with this.
1 Greg Demirchyan, “Algorithmic Fairness: Challenges to Building an Effective Regulatory Regime,” Frontiers in Artificial Intelligence 8 (January 21, 2026), doi:10.3389/frai.2025.1637134. The something that happened this week: yesterday, I published Know When to Hold ‘Em, the first post in a series building toward a specific result about how accuracy-maximizing classifiers can behave. The COMPAS case appears in footnote 7 of that post and the Chouldechova-Kleinberg impossibility is stated there. The Frontiers paper and that post are addressing the same mathematical fact from different directions, and it seemed like a good moment to put them in conversation.
2 Know When to Hold ‘Em (or, “what is AI?”), April 2, 2026. The post covers what a classifier is, what a score and a threshold are, and how the three failure modes — bad thermometer, wrong temperature, confident misdiagnosis — differ. It is intended as a general audience entry point to the series; this post assumes some familiarity with that framework.
3 Penn and Patty, Social Choice and Legitimacy: The Possibilities of Impossibility (Cambridge University Press, 2014). The book develops the axiomatic method as a framework for understanding not just voting but the full range of aggregation problems social scientists face — measurement, data analysis, institutional design — and argues that the impossibility results are explanations of why legitimate governance requires procedures and accountability on top of pure aggregation, not indictments of governance as such. The Russell Sage Foundation book Maggie and I are currently completing extends this framework into the algorithmic domain at length; the present series of posts is, among other things, a preview of that argument. For the classifier-specific extension connecting statistical fairness definitions to welfare-based notions, see Penn and Patty, “Algorithmic Fairness with Feedback,” arXiv:2312.03155 (December 2023).
4 “Algorithmic Fairness with Feedback,” arXiv:2312.03155 (see footnote 3). The dynamic impossibility result derives from formalizing the classifier with an endogenous population response. The short version: the assumptions required for the dynamic impossibility to fail are the same assumptions required for the static impossibility to fail — perfect prediction or equal base rates — but in the dynamic setting, the base rates are themselves endogenous to the classifier’s operation. The equal-base-rates escape hatch can close even if it was initially open. The problem is not just harder than the static framing suggests. It is harder in a way the static framing cannot see.
5 The IRS Is Here to Help. So Is ICE., February 26, 2026. The post covers the IRS-ICE data-sharing agreement in full: the judicial record, the behavioral dynamics, the connection to the formal model, and the feedback loop through which a classifier that operates on a strategic population degrades its own inputs. If you haven’t read it, it is the empirical spine of the argument I’m making here.
6 Penn and Patty, “Classification Algorithms and Social Outcomes,” American Journal of Political Science (forthcoming). The formal results on designer objectives and their equilibrium effects are here. The aligned incentives result — that compliance-maximizing designers produce classifiers satisfying a meaningful behavioral neutrality across groups, while accuracy-maximizing designers do not — is, in my view, one of the more practically important results in the paper, and the one most directly relevant to the regulatory question the Frontiers paper is addressing.
7 All Measurements Are Local, April 1, 2026, works through the Arrow recursion in the context of CPI weighting — the BLS faces the same social choice problem in constructing a headline inflation number that legislators face in mandating a fairness definition, and for the same reasons. The underlying argument is from Social Choice and Legitimacy (2014), specifically the book’s treatment of Arrow’s conditions as applying to any coherent aggregation procedure, not just electoral ones. The impossibility is not a feature of elections. It is a feature of aggregation. A regulatory process that aggregates stakeholder preferences over fairness criteria is an aggregation procedure, and Arrow’s four conditions — Unrestricted Domain, Pareto, Independence of Irrelevant Alternatives, Non-Dictatorship — are all entirely reasonable to require of such a process, and are jointly incompatible with a guarantee of a coherent output. The impossibility moves into the regulatory room. It does not evaporate upon arrival. The Russell Sage book addresses this directly — the interaction between algorithmic classification systems and the social choice problems embedded in their governance is, essentially, the subject of its ninth chapter.
I am embarrassingly excited to see the return of The MoP. Yes, the theorem has sequels, and those sequels are the inexorable products of individuals’ rational responses to classifiers. It seems that social scientists and societies in general are forever plagued by the data generating process. Observe, classify, observe once again, classify again, rinse and repeat. Too much fun!
Keep the posts coming.
Thanks, Scott! 🙂 Hope all is well in Athens and hope to see you soon!