My son has just about finished his geophysics degree, which means that I read his dissertation. (Well done, Joe!). An interesting little phrase stood out to me – in describing the process of making the measurements that went into his modelling of underground water flows, he began by talking about his trip out to the site to “establish ground truth”.
This is a geological term, which in context basically means that a dataset is not to be trusted until you’ve been to the place where it was collected, had a look at the rocks, dug a hole, walked along the river or whatnot. The reason for doing this is that geology is a game of interpolation – you have a set of data points, but they are generally of significantly lower resolution than the scale at which things happen, so you need to know what kind of theoretical model to impose on the data in order to join them up. And the only way to do that is to walk around, having a look at stuff with the much higher-resolution data gathering apparatus on the front of your head; you can’t see into solid rock, but the things you can see will tell you what kind of things could be happening below the ground.
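The interpolation problem here can be made concrete with a toy sketch (the numbers are entirely hypothetical, purely for illustration): two perfectly reasonable models pass through exactly the same sparse samples, yet say completely different things about what happens between them.

```python
# Five evenly spaced "field measurements" (hypothetical values),
# joined up by two different theoretical models.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 0.0, 1.0, 0.0]

def linear(x):
    """Piecewise-linear model: straight lines between neighbouring samples."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside sampled range")

def lagrange(x):
    """Smooth model: the unique degree-4 polynomial through all five points."""
    total = 0.0
    for i, yi in enumerate(ys):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xs[i] - xj)
        total += term
    return total

# Both models agree exactly at every sample point...
assert all(abs(linear(xi) - yi) < 1e-9 for xi, yi in zip(xs, ys))
assert all(abs(lagrange(xi) - yi) < 1e-9 for xi, yi in zip(xs, ys))

# ...but between samples they diverge: at x = 0.5 the linear model says
# 0.5 while the polynomial says 1.3125. The data alone can't choose.
print(linear(0.5), lagrange(0.5))
```

Nothing in the five data points tells you which model is right; only looking at the actual rocks (or whatever generated the data) can do that.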
Ground truth is part of the reason why the Geological Society, the professional body in the UK, won’t give accreditation to university degrees that are entirely based on desk research; there are geophysics and earth sciences degrees, but if it’s going to count for professional qualifications, it has to have a significant component of fieldwork. (They had to make a few compromises on this during the COVID-19 pandemic, but it’s still a basic principle).
I can’t help thinking that economics could benefit a lot from a similar approach, as could other social sciences. We also use datasets that are collected at quite low resolution (most obviously in macroeconomics, but I would guess the problem might be even greater in micro, because microeconomists are less likely to realise it). A lot of econometrics is about trying to test whether a model is well-specified, but trying to get the data to tell you what’s missing from the data is always going to be an uphill struggle. A bit of ground truth and informed application of theory is much more likely to give useful results than ever so clever “empirical” and “data-driven” mathematical fireworks.
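The “data can’t tell you what’s missing from the data” point can be illustrated with a small simulation (a sketch with made-up numbers, not a claim about any real dataset): an unobserved factor drives both the regressor and the outcome, and the regression looks splendid even though the estimated effect is pure confounding.

```python
import random

random.seed(0)

# Simulated "economy": an unobserved factor z drives both the observed
# regressor x and the outcome y. The true causal effect of x on y is zero.
n = 1000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.3) for zi in z]
y = [2.0 * zi + random.gauss(0, 0.3) for zi in z]

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

slope = ols_slope(x, y)
# The fitted slope comes out close to 2: a large, significant-looking
# "effect" of x on y, even though the true effect is zero. Nothing in
# the (x, y) pairs alone reveals the missing variable z.
print(round(slope, 2))
```

If the researcher knew how the data were actually generated – the econometric equivalent of walking the site – the spurious result would be obvious; no amount of specification testing on (x, y) alone gets you there.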
You can actually see this if you read a lot of working papers – some authors begin with a description of the thing they’re trying to explain, and some just go straight into the data with a short section on how they got it. The ones which make even a cursory nod in the direction of ground truth are much less likely to be a complete waste of time. I think I’m going to stop reading economics research which doesn’t do so.