Influential labor economist George Borjas is out with a new working paper revisiting the famous Card (1990) result on the Mariel Boatlift. The Boatlift was a huge, plausibly exogenous immigration shock that hit Miami in 1980. Card originally found that the Miami labor market seemed to absorb the immigrants without any impact on native wages. Borjas' working paper challenges that result.
In brief, Borjas argues that Card should have focused on a narrower subset of the Miami labor market: specifically, that of high school dropouts, since most of the new immigrants were high school dropouts and would be competing in that market. When Borjas does this, he finds huge negative effects -- on the order of 30% -- on native wages in Miami shortly after the boatlift.
I've spent some time today playing around with the data that Borjas used. (Full disclosure: I haven't yet been able to replicate his regression results exactly, but I'm within a few hundredths.) Borjas' central challenge is inference. First, the sample of individuals is tiny, as Borjas acknowledges. Using the March CPS, he's looking at the wages of non-Hispanic men of a certain age, who are high school dropouts, in the Miami-Hialeah metropolitan area. In each year, there are something like 20 people who meet those criteria, and similar numbers in the placebo cities. At the end of the day, though, that just introduces measurement error in the dependent variable, which we know how to deal with.
The bigger challenge is that he has one treatment city and a (generally) small number of control cities. In fact, in his regression, he doesn't even see the point of clustering his standard errors, since the number of clusters is so low. With so few clusters, the reported "robust"-to-heteroskedasticity standard errors are close to meaningless.
He spends most of his inference effort on producing a distribution of placebo estimates and seeing where Miami's post-Boatlift change in wages falls in that distribution. The more sophisticated way to do that is the synthetic control method, which Borjas also uses.
I spent some time today looking at the simpler test, where the placebo estimate is just an uncontrolled pre-post change. In particular, a given placebo estimate is the change in wages in some unaffected city j from years (t, t+1, t+2) to years (t+4, t+5, ..., t+9). That is, the placebo treatment occurs at t+3, the pre-period is the three years before that, and the post-period is the six years after that. From this distribution, he plots a graph showing that the Mariel pre-post change in Miami sits in the far left tail. In particular, about 0.8% of the mass of the distribution lies to the left of the Mariel effect.
My replication looks pretty similar. (Borjas does some sort of weighting that I didn't do, so they're not exactly the same.)
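For concreteness, here is a rough sketch of the unweighted placebo exercise described above. It is not Borjas' code: the input format (a dict of city -> {year: average wage measure}) and all function names are my own assumptions, and it skips the weighting he applies.

```python
from statistics import mean


def pre_post_change(wages, t, pre_len=3, post_len=6):
    """Uncontrolled pre-post change for one city, placebo treatment at t + pre_len.

    wages: dict mapping year -> average wage measure for the city
           (hypothetical format, e.g. mean log wage from the March CPS).
    Pre-period:  years t .. t + pre_len - 1.
    Post-period: the post_len years after the treatment year.
    Returns None if any required year is missing.
    """
    pre_years = list(range(t, t + pre_len))
    first_post = t + pre_len + 1  # the treatment year itself is skipped
    post_years = list(range(first_post, first_post + post_len))
    if not all(y in wages for y in pre_years + post_years):
        return None
    return mean(wages[y] for y in post_years) - mean(wages[y] for y in pre_years)


def placebo_distribution(city_wages, pre_len=3, post_len=6):
    """Collect placebo changes over every city and every feasible start year."""
    changes = []
    for wages in city_wages.values():
        for t in sorted(wages):
            change = pre_post_change(wages, t, pre_len, post_len)
            if change is not None:
                changes.append(change)
    return changes


def left_tail_share(changes, estimate):
    """Share of placebo changes strictly to the left of the estimate."""
    return sum(c < estimate for c in changes) / len(changes)
```

Feeding `left_tail_share` the placebo changes and the Miami pre-post change gives the kind of left-tail mass quoted above. The placebo cities should exclude Miami itself.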
I do have one objection to this procedure, however. He chooses a six-year "post" window because that is what makes the pre-post change in Miami look as bad as possible. Given that choice, we should do the same thing for our placebo estimates: for a given city j and treatment date t+3, we should choose the worst (most negative) treatment effect. I reran the exercise with this change, letting the post-period be anywhere from 2 to 7 years long, whichever is worst.
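The worst-case version can be sketched the same way. Again, this is an illustration under my own assumptions rather than anyone's actual code: the function name and input format are hypothetical, "worst" is taken to mean the most negative change, and a placebo is kept only if all candidate windows are available so that every placebo picks from the same set of windows.

```python
from statistics import mean


def worst_pre_post_change(wages, t, pre_len=3, post_lens=(2, 3, 4, 5, 6, 7)):
    """Most negative pre-post change over the candidate post-period lengths.

    wages: dict mapping year -> average wage measure for one city
           (hypothetical format). Placebo treatment occurs at t + pre_len.
    Returns None unless the pre-period and the longest post-period are
    fully observed.
    """
    pre_years = list(range(t, t + pre_len))
    first_post = t + pre_len + 1  # the treatment year itself is skipped
    longest_post = list(range(first_post, first_post + max(post_lens)))
    if not all(y in wages for y in pre_years + longest_post):
        return None

    pre_mean = mean(wages[y] for y in pre_years)
    candidates = [
        mean(wages[y] for y in range(first_post, first_post + n)) - pre_mean
        for n in post_lens
    ]
    # Pick the window that makes the change look as bad as possible.
    return min(candidates)
```

The same rule has to be applied to Miami's own estimate, so that the real effect and the placebos are selected identically before comparing them.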
So, in sum, my objection doesn't overturn Borjas' conclusion. But I do wonder what happens when you repeat this exercise with the synthetic control placebos.