Which Comes First, Theory or Data?

It’s kind of a trick question, exactly the type of gambit that drives both research and blog posts. (The answer, it seems, is “both should magically emerge simultaneously.”)

Anyway…I’ve been in a bit of a funk lately, and not the twerking kind.  Both the seasonal goings-on and my mind doing laps on a vexing problem have left me a bit, ummm, unmotivated to post.  Without further delay, let’s make petroleum jelly out of petroleum…here’s my intellectual/professional quandary/blogging impediment in the form of a blog post.

1. I have a lot of data (BIG data) that I just know is important.  Basically, it is (a big part of) the substance of federal policymaking.

2. I don’t know any theories that really speak to it.

3. Well, I have some, but they are intractable as presently formulated, and I don’t know how best to simplify them to get results.  I have strong hunches about how I could do so, but I’d like to choose the simplifications that are most appropriate for the questions I want to answer.  (In a nutshell, the questions I want to answer are “why are some issues raised and acted upon while others are set aside,” and “how are people and resources deployed across multiple issues at a given point in time?”)

So…what to do?  (If you think about it for a moment, my conundrum is very meta.)

From a “math of math of politics” angle, the real rub is exemplified by the astute question raised by one of my colleagues when I described something I was doing/wanted to do with this big data.

“But what’s the theory?”

I have been told by other colleagues that theory is not necessary, though highly desirable, for empirical social science.  I fundamentally and, if I say so myself, quite correctly disagree with this assertion on theoretical grounds.  (See what I did there?)  But my assertion that there is no empirical analysis without some kind of theory is also right on more practical grounds (interpreting, publishing, and communicating empirical analysis), as my (empirically focused) colleague’s question illustrates.

More “math of math of politics” is raised by the fact that my absolute advantage in terms of scholarly production is in theory (really, modeling), rather than “pure” empirical analysis.  So, maybe I should take a hint and take a leap, “doing the models” that I can do, and letting others sort them out.  After all, I’m tenured, and therefore have the freedom to take the time to do this—the bulk of “the time” in this case (in my expectation, at least) will be navigating what I expect will be a bumpy road to publication and communication of the models.  I foresee plenty of (in-the-weeds) speedbumps in pursuing a “pure theory” approach to the questions I am interested in.

The irony, of course, is that one might think that tenure is at least partly there to motivate me to “take risks” in the sense of really trying to do things right.  In this case, the right path in terms of getting people to listen to the ultimate analysis might involve developing models that, at least for now, cannot be motivated by empirical verisimilitude.

So, what to do, I ask you.  Most of you are social scientists, and I am honestly befuddled by the proper way to aggregate/trade off the two competing intellectual incentives: should I patiently, doggedly, and perhaps inefficiently chase the (as yet unknown) “right analysis,” or do the analysis that will be more readily heard and may accordingly grease the wheels for ultimate production and communication of the right analysis?

With that, I leave you with this.