This is the question that any researcher (or debunker) must ask and answer before doing anything else in order to avoid the common pitfalls that lead to specious inferences. We here at Twenty-One Debunked believe that any drinking age study that fulfills all of the following criteria is ideal:
- The study is longitudinal, also known as "panel data." In other words, it contains both a time element and a cross-sectional element. Most studies discussed on this blog are longitudinal. (10 points)
- The study contains a minimum of 10 consecutive years of data, ideally beginning before 1980 and lasting well after 1990. (Subtract 1 point from criterion #1 for every year fewer than 10. Add 0.5 points for every year beyond 10, up to a maximum of 15 points.)
- The study includes all 50 states and DC. (5 points)
- The study includes Puerto Rico. (5 points)
- The study controls for state and year fixed effects. (10 points, 5 for each one)
- The study controls for state-specific time trends and/or other types of trends. (5 points)
- The study controls for all of the following variables: vehicle miles traveled, population of the age group(s) studied, number of licensed drivers, real per capita income, unemployment rate, seat belt laws (primary and secondary), speed limit, average vehicle size, tourism expenditures, BAC limits of 0.10 and 0.08, zero tolerance laws, administrative license suspension (ALS), sobriety checkpoints, other drunk driving laws, money spent on DUI enforcement, real beer tax (or price), gas price, dram shop laws, average outlet density, keg registration, and driving age. (0.5 points for each variable controlled for, 1 point for those in red)
- The study controls for demographic characteristics of each state's population (race, ethnicity, religion, etc.). (2 points)
- The study tests for border effects. (2 points)
- The study controls for adult per-capita alcohol consumption, but the models are tested without this covariate as well, and both results are reported. (1 point for with only, 1 point for without only, 2 points for both)
- The study tests all four legal drinking ages: 18, 19, 20, 21. (3 points)
- Grandfather clauses are accounted for. (2 points)
- The study tests the robustness of its results to including or excluding certain states and time periods. (3 points)
- The study is done using total fatalities as the dependent variable, as well as using single-vehicle nighttime fatalities as a proxy for a crash being alcohol-related. The latter avoids reporting bias and testing bias associated with a crash being labeled as "alcohol-related." (3 points for using total fatalities, 2 points for alcohol-related fatalities, 5 points for single-vehicle nighttime fatalities, 2 points for a day/night counterfactual)
- The study separates data into four age groups: 15-17, 18-20, 21-24, and 25-29. The last one should be used as a control group for each of the other three, since it is too distant to be affected by the drinking age. DO NOT use 21-24 year olds as a control group! (2.5 points for each of the four age groups)
- Each model is tested for robustness by using driver fatalities in each age group studied, and/or using fatalities of any age that involved at least one driver of the age group studied. (3 points)
- If data before 1975 (the year FARS was created) are included, the models are further tested for robustness using 15-24 year old total fatalities. (-5 points if not)
- For studies that look at effects that are not traffic-related, additional variables need to be accounted for. (unrated, does not apply to every study)
- Regressions contain an error term. (3 points)
- R-squared values are reported, or some other measure of goodness-of-fit. (3 points)
- Tests for statistical significance are done at the 10%, 5%, and 1% levels, and/or 95% confidence intervals are reported. (3 points for the 5% level only, 5 points for all three levels, 5 points for confidence intervals)
As far as we know, not a single study meets all of these criteria. But a few do come very close indeed. As a result, we at Twenty-One Debunked generally consider any study that earns 90% of the above rubric's points to be very high quality and one that earns 80% to be high quality, provided that criteria #1 and #5 are met in full as well. We are not aware of any study that meets #4, which is unfortunate because Puerto Rico's drinking age is still 18 to this day, and there are otherwise no data points for a drinking age below 21 after 1988. Puerto Rico's data are in the FARS database as well. However, one could potentially consider Louisiana to have had a de facto drinking age of 18 until 1995 due to a massive loophole that has since been closed.
The following scale is what we use to grade studies based on the above rubric:
90-100 points: A (very high quality)
80-89 points: B (high quality)
70-79 points: C (decent quality)
60-69 points: D (marginal quality; near-junk)
Below 60: F (inferior quality; junk)
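As a rough illustration, the scale above can be written as a small lookup. This is only a sketch: the function name `letter_grade` and the `longitudinal` flag are our own illustrative choices, and we assume that fractional totals (the rubric awards half points) fall into the band whose lower bound they reach. The flag reflects the 79-point ceiling that applies to non-longitudinal designs.

```python
def letter_grade(points, longitudinal=True):
    """Map a rubric total to a letter grade (illustrative sketch)."""
    # Assumption: non-longitudinal designs are capped at 79 points.
    if not longitudinal:
        points = min(points, 79)
    # Assumption: each band is defined by its lower bound, so 89.5 is a B.
    if points >= 90:
        return "A"  # very high quality
    if points >= 80:
        return "B"  # high quality
    if points >= 70:
        return "C"  # decent quality
    if points >= 60:
        return "D"  # marginal quality; near-junk
    return "F"      # inferior quality; junk
```

For example, `letter_grade(85)` gives a B, while `letter_grade(85, longitudinal=False)` gives a C because of the cap.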
For cross-sectional, time-series, and pre/post studies, which are weaker than longitudinal studies and thus can never be considered high quality (points can never exceed 79), our criteria are more stringent. Fixed effects and time trends obviously cannot be controlled for, so additional variables need to be accounted for in cross-sectional studies. Theoretically, if enough variables are controlled for, a cross-sectional study can closely approach the accuracy of a longitudinal one, and would be considered "decent quality" for our purposes. For time-series and pre/post studies, there needs to be a suitable comparison group (or state) as well. The comparison group may NOT simply be 21-24 year olds in the same state since there is some evidence that fatalities may be shifted to/from that age group when the drinking age changes. It would be better to use 25-29 year olds AND to control for as many observable variables as possible in that case.
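To make the role of the comparison group concrete, here is a minimal sketch of the arithmetic behind a pre/post comparison with a control group (often called a difference-in-differences). The function name and all of the numbers are made up purely for illustration; they do not come from any study.

```python
def diff_in_diff(target_pre, target_post, control_pre, control_post):
    """Change in the target group minus the change in the comparison group."""
    return (target_post - target_pre) - (control_post - control_pre)

# Hypothetical fatality rates per 100,000 (invented numbers, not real data).
# Target group: 18-20 year olds; comparison group: 25-29 year olds.
estimate = diff_in_diff(target_pre=30.0, target_post=24.0,
                        control_pre=20.0, control_post=17.0)
```

In this invented example the target group falls by 6 and the comparison group by 3, so only a change of -3 is left to attribute to the policy. A simple pre/post study without the comparison group would have credited the full drop of 6 to the drinking age.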
For those who don't know:
Fixed effects and time trends are an easy way to control for some omitted variables (with caveats of course). State fixed effects control for relatively permanent yet unobservable characteristics of each state, as long as they do not change significantly over time. Year fixed effects control for national trends that are the same in every state (safer cars, etc.). State-specific trends are the interaction of the state fixed effects with a time trend variable, which lets each state follow its own trend. Time trends can be linear or quadratic.
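A small simulation may help show why this matters. The sketch below uses NumPy and entirely simulated data: every number, including the "true" policy effect of -2.0, the staggered adoption years, and the downward national trend, is an assumption chosen for illustration. It compares a naive regression of the outcome on a policy dummy with one that adds state and year dummies (fixed effects).

```python
import numpy as np

def simulate_panel(n_states=20, n_years=15, true_effect=-2.0, seed=0):
    """Simulated state-by-year panel (all values are made up)."""
    rng = np.random.default_rng(seed)
    state = np.repeat(np.arange(n_states), n_years)
    year = np.tile(np.arange(n_years), n_states)
    state_fx = np.linspace(-5.0, 5.0, n_states)      # permanent state differences
    year_fx = -0.8 * np.arange(n_years)              # national downward trend
    adopt_year = 3 + (np.arange(n_states) % 8)       # staggered policy adoption
    policy = (year >= adopt_year[state]).astype(float)
    noise = rng.normal(0.0, 0.1, size=state.size)    # the error term
    y = true_effect * policy + state_fx[state] + year_fx[year] + noise
    return state, year, policy, y

def one_hot(codes):
    """One dummy column per distinct code."""
    return (codes[:, None] == np.unique(codes)[None, :]).astype(float)

def policy_coefficient(policy, y, extra=None):
    """OLS coefficient on the policy dummy, with an intercept and optional controls."""
    parts = [policy[:, None], np.ones((policy.size, 1))]
    if extra is not None:
        parts.append(extra)
    X = np.hstack(parts)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]

state, year, policy, y = simulate_panel()
naive = policy_coefficient(policy, y)
# Drop one dummy from each set to avoid collinearity with the intercept.
dummies = np.hstack([one_hot(state)[:, 1:], one_hot(year)[:, 1:]])
with_fe = policy_coefficient(policy, y, extra=dummies)
print(f"naive: {naive:.2f}, with fixed effects: {with_fe:.2f}")
```

In this setup the naive estimate is badly biased because the policy dummy is switched on mostly in later years, when the simulated national trend has already pushed the outcome down. The version with state and year dummies should land close to the assumed effect of -2.0. Adding state-specific linear trends would mean adding one more column per state, equal to that state's dummy multiplied by the year.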