Came across this interesting piece on estimating the impact of flu. It's from Slate, one of my favourite websites.
You must have come across news articles saying that flu caused so many thousand deaths in a certain year. Attributing deaths to flu is not as straightforward as it would seem. Flu is not listed as the "Cause of Death" on a death certificate all that often. It usually kills by causing secondary conditions like pneumonia or heart disease, which the enfeebled body is not able to resist. So one finds relatively few cases where the death can be directly attributed to the flu. So how is the estimation done? The answer is a simple regression, with deaths as the dependent variable and the number of flu cases as the independent variable.
One piece of data is the number of deaths in the US. This can be broken down by week or by month for the flu season (approximately October to April). The other piece of data is the number of flu cases tracked by various testing labs across the country, which is also available by week or by month. The CDC website is a ready source of such morbidity data, going back at least to the early 90s! Then it is a matter of running a simple regression to link flu cases and deaths. The regression takes the form: deaths = intercept + coeff * number of flu cases, where the intercept is the number of deaths one can expect from other baseline causes and coeff is the number of additional deaths associated with each recorded flu case.
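To make the regression concrete, here is a minimal sketch in Python using NumPy. The weekly numbers are made up purely for illustration (they are not CDC data, and they are built to sit exactly on a line), and the function name fit_simple_regression is my own. It only shows the mechanics of the fit described above, not how the published estimates were actually produced.

```python
import numpy as np


def fit_simple_regression(flu_cases, deaths):
    """Least-squares fit of: deaths = intercept + coeff * flu_cases."""
    x = np.asarray(flu_cases, dtype=float)
    y = np.asarray(deaths, dtype=float)
    # np.polyfit returns coefficients from the highest degree down,
    # so a degree-1 fit gives (slope, intercept).
    coeff, intercept = np.polyfit(x, y, 1)
    return intercept, coeff


# Hypothetical weekly figures, for illustration only (NOT real data).
flu_cases = [100, 250, 400, 800, 1200, 900, 300]
deaths = [50_500, 51_250, 52_000, 54_000, 56_000, 54_500, 51_500]

intercept, coeff = fit_simple_regression(flu_cases, deaths)

# Deaths the model attributes to flu: everything above the baseline.
flu_attributed = coeff * sum(flu_cases)

print(f"baseline deaths per week: {intercept:.0f}")
print(f"extra deaths per flu case: {coeff:.2f}")
print(f"deaths attributed to flu over the season: {flu_attributed:.0f}")
```

The flu-attributed total is simply the slope times the total number of flu cases, i.e. the part of the fitted deaths that is not explained by the intercept.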
It almost seems too simple to be true. How can you be sure that deaths in a certain month can be linked to flu cases from that same period? Or does one assume a certain lag between flu cases and the resulting deaths? How do we know we have controlled for everything else? What is the confidence interval (CI) of the estimates? Check out this paper by William W. Thompson for more details!
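Two of these questions, the lag and the CI, are easy to poke at in code. The sketch below is my own illustration, not the method from the paper. It pairs flu cases in week t with deaths in week t + lag, and reports a rough 95% interval for the slope using a normal approximation (z = 1.96). With so few points a t-distribution would be more appropriate, so treat the interval as indicative only. The data are again invented, with deaths built to trail flu cases by one week.

```python
import numpy as np


def slope_with_ci(x, y, z=1.96):
    """OLS slope and an approximate 95% CI (normal approximation)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    # Residual variance with n - 2 degrees of freedom.
    sigma2 = np.sum(residuals ** 2) / (n - 2)
    se = np.sqrt(sigma2 / sxx)
    return slope, (slope - z * se, slope + z * se)


def fit_with_lag(flu_cases, deaths, lag):
    """Pair flu cases in week t with deaths in week t + lag."""
    x = np.asarray(flu_cases, dtype=float)
    y = np.asarray(deaths, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return slope_with_ci(x, y)


# Hypothetical weekly figures (NOT real data); deaths lag cases by a week.
flu_cases = [100, 250, 400, 800, 1200, 900, 300, 150]
deaths = [50_400, 50_530, 51_230, 52_010, 53_960, 56_025, 54_485, 51_520]

for lag in (0, 1, 2):
    slope, (low, high) = fit_with_lag(flu_cases, deaths, lag)
    print(f"lag {lag}: slope {slope:.2f}, approx 95% CI ({low:.2f}, {high:.2f})")
```

On data like this, the lag that matches how the series was built should give the tightest interval, which is one crude way to compare candidate lags. It does nothing about the "everything else" question, though. That would need extra terms in the regression.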
Now with the emergence of the potentially more deadly H1N1 flu, how can one go about estimating its impact?