I’ve been reading several papers lately that examine the effects of government policies on various social and economic outcomes. Increasingly, I find myself wondering what studies with “null” results actually allow us to conclude. (By the way, I am sure that this issue has been raised before, but I’ve been thinking a lot about it lately, and I figured that’s what a blog is for.)
A (justifiably) standard approach in these literatures is as follows:
1. Describe why the outcome variable, y, is important and how it is measured, acknowledge weaknesses in the data, etc.
2. Describe the vector (list) of K independent variables, X, acknowledge they are imperfect, describe why they are still arguably useful, and perhaps link these with a theory explaining why they might affect y.
3. Apply a statistical model to generate estimates of the effect of the various variables in X on y.
For a lot of very good reasons, the standard approach to thinking about (or “modeling”) the effect of $X$ on $y$ is based on some equation that essentially boils down to the following:
$y = f(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_K x_K)$,
so that $\beta_i$ essentially measures the linear impact of variable $x_i$ on the outcome variable, $y$. (The function $f$ captures nonlinearities, particularly for situations in which $y$ is meaningfully bounded, like a proportion or probability.)
Then, typically, if the researcher is unable to reject the hypothesis that the estimated value of $\beta_i$, $\hat{\beta}_i$, is equal to 0, the conclusion is that there is little or no evidence that $x_i$ affects $y$. This is usually followed by a puzzled expression and an awkward pause.
In many respects, this is perfectly reasonable: this approach is a classical way to model/uncover the relationship between the outcome variable and independent variables. And, particularly in modern social science, it is widely and well understood as a means to conceptualize/present results. So, I’m not saying we shouldn’t do this. That said, I am saying that we should think about the political relationship between the outcome and independent variables.
Now, for the sake of argument, suppose that $K = 1$ (so that $y = f(\beta x)$), to focus the discussion. Then, suppose that $y$ is a politically important variable that voters “like” (i.e., want higher levels of), such as per capita income in a state, and that $x$ represents a policy controlled/set by political actors. Now, suppose that political actors are responsive to voter demands, so that they set $x$ so as to maximize $y$.
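The first order condition for maximization of $y$ with respect to $x$ is $\frac{\partial y}{\partial x} = f'(\beta x)\,\beta = 0$. In general, $f$ is a strictly increasing function, so that $f'(\beta x)\,\beta = 0$ implies that $\beta = 0$.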
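To make the step from the first order condition to $\beta = 0$ concrete, here is the derivation written out for one particular choice of $f$. The logistic form is an assumption made purely for illustration; the argument only needs $f$ to be strictly increasing with a positive derivative.

```latex
\begin{align*}
y &= f(\beta x), \qquad f(z) = \frac{1}{1 + e^{-z}} \quad \text{(assumed logistic form)} \\
\frac{\partial y}{\partial x} &= f'(\beta x)\,\beta = f(\beta x)\bigl(1 - f(\beta x)\bigr)\,\beta \\
f(\beta x)\bigl(1 - f(\beta x)\bigr) &> 0 \ \text{for all finite } \beta x
\quad\Longrightarrow\quad
\frac{\partial y}{\partial x} = 0 \iff \beta = 0
\end{align*}
```

Since the derivative of the logistic function is never zero, the only way the first order condition can hold is for the coefficient itself to vanish.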
We have reached this conclusion without presuming anything about the true relationship between $x$ and $y$. Thus, if one is unable to reject the null hypothesis that $\beta = 0$, isn’t it arguably better to conclude that the marginal effect of $x$ on $y$ is zero, given the observed data and the behaviors underlying them, than that $x$ has no apparent effect on $y$?
Put another way, if we find in observed, real-world data that the effect of $x$ on $y$ is unambiguously non-zero, shouldn’t we be more surprised than if we fail to uncover a systematic, non-zero (linear) effect of $x$ on $y$?
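The argument can be illustrated with a small simulation. What follows is only a sketch under assumptions that are not in the discussion above: the true relationship is taken to be a hypothetical concave function, $y = -(x - x^*)^2$ plus noise, with an arbitrary optimum $x^* = 2$, and policymakers are assumed to hit their target with some random error. The function name `simulate_slope` and all parameter values are made up for the example. It compares the estimated linear effect when policy is set near the optimum with the estimate when policy is systematically set away from it.

```python
import numpy as np


def simulate_slope(policy_center, n=5000, optimum=2.0, seed=0):
    """Return the OLS slope of y on x when observed policies cluster
    around `policy_center`.

    Assumed (hypothetical) true relationship:
        y = -(x - optimum)**2 + noise
    so that y is maximized at x = optimum.
    """
    rng = np.random.default_rng(seed)
    # Policymakers aim at policy_center but implement it with error.
    x = policy_center + rng.normal(0.0, 0.5, size=n)
    # Outcome generated by the assumed concave relationship.
    y = -(x - optimum) ** 2 + rng.normal(0.0, 1.0, size=n)
    # Degree-1 polynomial fit: coefficients are returned as (slope, intercept).
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


# Responsive politicians: policy is centered on the value that maximizes y.
slope_at_optimum = simulate_slope(policy_center=2.0)

# Unresponsive politicians: policy is centered below the optimum.
slope_off_optimum = simulate_slope(policy_center=1.0)

print(f"slope when x is set near the optimum: {slope_at_optimum:.3f}")
print(f"slope when x is set below the optimum: {slope_off_optimum:.3f}")
```

In this setup $x$ matters a great deal for $y$ in both cases, but the fitted linear effect is close to zero only when policy is already being set to maximize $y$. A clearly non-zero slope shows up only when policy is systematically off target, which is the "surprising" case.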
With that, I leave you with this.