Pick One from Three (All Three Numbers Are Correct)

Last week’s post ended with a theorem. This one starts with a dashboard.

The theorem — Arrow’s impossibility result, applied to the aggregation problem that Simpson’s paradox creates — is on the record in All Measurements Are Local if you want it. The short version: when subgroup results conflict with each other and with the aggregate, the question of which result to trust is not a statistical question. It is a normative one, and Arrow’s theorem guarantees that no aggregation procedure can answer it without making a choice that some reasonable person would object to. The impossibility doesn’t dissolve with better data. It is structural.

Knowing that is one thing. Feeling it is another. Today I want to show you what it feels like to be the person who has to choose — which is a rather different experience than reading about why the choice is hard.

So: welcome to Northbrook Unified School District, where the Year 1 results from the Elevate curriculum are in, the board meeting is Thursday, and somebody has to present a headline number. That somebody is you.


Before you open the dashboard, let me say one thing about the species of argument that is most likely to generate this kind of trouble. Simpson’s paradox is not uniformly distributed across policy debates. It shows up with particular reliability when the intervention being evaluated also changes the composition of the group being measured — when the treatment and the denominator are not independent. An educational curriculum that expands enrollment does not just change scores. It changes who is being scored. A public health program that increases uptake among high-risk populations does not just change health outcomes. It changes the risk profile of the treated population. In both cases, the aggregate shifts for reasons that are entirely separate from whether the underlying program is working — and the aggregate, being what it is, does not volunteer this information.
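If the mechanism feels slippery in prose, a toy calculation may help. What follows is a minimal sketch with hypothetical numbers (two groups, invented scores and headcounts, not Northbrook's data), splitting an aggregate change into a within-group part and a composition part. The function name and the choice to hold composition at baseline for the "within" part are my own illustrative assumptions; other decompositions are possible.

```python
# Hypothetical two-group illustration of a composition effect.
# All numbers are invented for this sketch; none come from Northbrook.

def weighted_mean(scores, counts):
    """Headcount-weighted average of group scores."""
    total = sum(counts.values())
    return sum(scores[g] * counts[g] for g in scores) / total

before_scores = {"high": 80.0, "low": 60.0}
after_scores = {"high": 82.0, "low": 62.0}    # every group improves by 2 points

before_counts = {"high": 100, "low": 100}
after_counts = {"high": 100, "low": 300}      # the program also expands the low-scoring group

agg_before = weighted_mean(before_scores, before_counts)
agg_after = weighted_mean(after_scores, after_counts)

# Counterfactual: new scores, old composition.
held_composition = weighted_mean(after_scores, before_counts)

within_effect = held_composition - agg_before       # change due to scores alone
composition_effect = agg_after - held_composition   # change due to who is counted
total_change = agg_after - agg_before

print(f"aggregate: {agg_before:.1f} -> {agg_after:.1f} ({total_change:+.1f})")
print(f"within-group effect: {within_effect:+.1f}")
print(f"composition effect:  {composition_effect:+.1f}")
```

In this toy case both groups gain two points and the aggregate still falls by three, because the composition term outweighs the within-group term. Nothing in the aggregate tells you which term did the work.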

Northbrook is a stylized version of this. The numbers are fictional. The structure is not.


One note before you open the dashboard. The poll does not include an option along the lines of “it’s complicated” or “I would present all three numbers and let the board decide.” This omission is deliberate. Presenting all three numbers and letting the board decide is not a neutral act: it shifts the aggregation problem to whoever has the most power in the room. The complexity does not disappear because you declined to resolve it. It relocates. This, too, is a theorem, though not Arrow’s. [1]

NORTHBROOK UNIFIED EST. 1952
Northbrook Unified School District
Academic Progress Portal  ·  Elevate Curriculum  ·  Year 1 Results
SAPA Composite  —  District-Wide
69.7
▼ 2.8 points from baseline
Pre-Elevate baseline: 72.5
District demographics
Jefferson Elem.: 200
Lincoln K–8: 300
Roosevelt Elem.: 100
Total enrolled: 600
Baseline year figures
Jefferson
Elementary School
Baseline SAPA: 85.0
Current SAPA: 88.0
▲ +3.0
points from baseline
Lincoln
K–8 School
Baseline SAPA: 70.0
Current SAPA: 73.0
▲ +3.0
points from baseline
Roosevelt
Elementary School
Baseline SAPA: 55.0
Current SAPA: 58.0
▲ +3.0
points from baseline
District composite: 69.7  ▼ 2.8 pts   ·   Jefferson, Lincoln, Roosevelt: ▲ +3.0 pts each
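For anyone who wants to check the dashboard's arithmetic before voting, here is a minimal sketch that uses only figures shown on the panel. It assumes the composite is an enrollment-weighted mean of the school scores, which the portal implies but never states outright; the variable names are mine. The current-year composite is taken as reported and not recomputed here.

```python
# Figures copied from the Northbrook dashboard (fictional data).
baseline_enrollment = {"Jefferson": 200, "Lincoln": 300, "Roosevelt": 100}
baseline_sapa = {"Jefferson": 85.0, "Lincoln": 70.0, "Roosevelt": 55.0}
current_sapa = {"Jefferson": 88.0, "Lincoln": 73.0, "Roosevelt": 58.0}

# Assumption: the composite is an enrollment-weighted mean of school scores.
total_enrolled = sum(baseline_enrollment.values())
baseline_composite = sum(
    baseline_sapa[school] * baseline_enrollment[school] for school in baseline_sapa
) / total_enrolled

# Within-school change, one figure per school.
gains = {school: current_sapa[school] - baseline_sapa[school] for school in baseline_sapa}

# Highest-to-lowest gap, before and after.
gap_before = max(baseline_sapa.values()) - min(baseline_sapa.values())
gap_after = max(current_sapa.values()) - min(current_sapa.values())

print(f"total enrolled (baseline year): {total_enrolled}")
print(f"baseline composite: {baseline_composite:.1f}")
print(f"within-school gains: {gains}")
print(f"gap before: {gap_before:.1f}, gap after: {gap_after:.1f}")
```

Under that assumption the baseline composite, the per-school gains, and the unchanged gap all reproduce the values on the panel.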
Your turn.
You are presenting to the Northbrook school board on Thursday. All three numbers below are computed from the same data. All three are correct. Which one do you lead with?
Pick the metric you would report. No abstaining.
A
District composite score
69.7  ▼ 2.8 points
Weighted by current enrollment. The figure the state uses for accountability ratings.
B
Within-school improvement
+3.0 points  ▲ at every school
Controlling for enrollment composition. The figure that measures whether the curriculum works.
C
Highest-to-lowest school gap
30 points  — unchanged
Jefferson minus Roosevelt, before and after. The figure that measures whether the curriculum is equitable.
A: District composite ▼ 2.8 pts (Accountability metric)
B: Within-school gain ▲ +3.0 each (Effectiveness metric)
C: School gap 30 pts, unchanged (Equity metric)

I will tell you, in a follow-up post, what you did not know when you answered, and why it matters more than any of the three numbers on the screen.

With that, I leave you with this.


[1] The relevant result here is the conservation of impossibility, which this blog has visited before: formal impossibility results do not disappear when you delegate the decision. They relocate to wherever the decision is now being made, carrying the same structure they had before the delegation. Kicking the aggregation problem to the board does not solve the aggregation problem. It makes the board the aggregation procedure.