# Misleading death rate in viral outbreaks?

#### Whammamoosha

##### New Member
Currently, and especially in this COVID-19 outbreak, the death rate reported during an outbreak is calculated as:

total # of deaths ÷ total # of cases

What bugs me is that the 2nd term of the expression (which includes ongoing cases) follows a geometric progression over time, whereas deaths and remissions (closed cases) reportedly take 14 days on average to occur after diagnosis.

14 steps in a geometric progression mean a lot: given an optimistic increase in cases of 10% a day (in the COVID-19 outbreak it's much higher), this gives a 280% increase over that period (1.10^14 ≈ 3.8).
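To check that arithmetic, here is a minimal sketch. The 10% daily growth and the 14-day lag are the figures assumed in the post, not measured values, and the variable names are just for illustration.

```python
# Assumed inputs, taken from the post (not measured data).
daily_growth = 0.10   # cumulative cases grow 10% per day
lag_days = 14         # average delay from diagnosis to death or remission

# Compound growth over the lag period.
growth_factor = (1 + daily_growth) ** lag_days
percent_increase = (growth_factor - 1) * 100

print(f"growth factor over {lag_days} days: {growth_factor:.2f}")
print(f"percent increase: {percent_increase:.0f}%")
```

So under these assumptions the case count is almost four times what it was when the people now dying were diagnosed.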

And since the total # of deaths originates from a total # of cases that is around 14 days older, what's the meaning of the expression?

If the expression tries to reflect the chance of dying after being diagnosed positive for the virus, it is completely misleading, since the 1st term and the 2nd term reflect different points in time relative to diagnosis, and the 2nd is much higher (by 280% in the example) than it should be. Thus the correct expression should be:

total # of deaths ÷ total # of closed cases

because both terms refer semantically to the same universe of cases (deaths + remissions = closed cases).

Both expressions will inevitably converge to the same result (after the outbreak all cases will be closed cases), but the latter will get to the actual rate much faster, because the former has an implicit time delay.
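The convergence claim can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not in the original post: new cases grow geometrically for a fixed number of days and then stop, every case closes after a fixed delay, and the true fatality rate (`true_cfr`), the starting count of 100 cases, and the function name `simulate` are all hypothetical.

```python
def simulate(days=60, growth=0.10, true_cfr=0.02,
             death_delay=14, recovery_delay=14, growth_days=30):
    """Return (day, deaths/cases, deaths/closed) for a toy outbreak.

    All parameters are illustrative assumptions, not real data.
    """
    # New diagnoses per day: geometric growth, then no new cases.
    new_cases = [100.0 * (1 + growth) ** d if d < growth_days else 0.0
                 for d in range(days)]

    rows = []
    total_cases = total_deaths = total_recovered = 0.0
    for day in range(days):
        total_cases += new_cases[day]
        # Outcomes appear a fixed number of days after diagnosis.
        if day >= death_delay:
            total_deaths += true_cfr * new_cases[day - death_delay]
        if day >= recovery_delay:
            total_recovered += (1 - true_cfr) * new_cases[day - recovery_delay]

        closed = total_deaths + total_recovered
        naive_ratio = total_deaths / total_cases
        closed_ratio = total_deaths / closed if closed > 0 else None
        rows.append((day, naive_ratio, closed_ratio))
    return rows


for day, naive_ratio, closed_ratio in simulate():
    if day in (20, 40, 59):
        print(day, round(naive_ratio, 4), closed_ratio)
```

In this idealised setup the closed-case ratio matches the true rate as soon as the first cases close, while deaths ÷ cases stays far below it during the growth phase and only catches up once every case has closed. Note that if deaths and recoveries had different delays (try `death_delay=10, recovery_delay=20`), the closed-case ratio would be biased as well until the outbreak ends.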

So why on earth is the former being used?

#### noetsi

##### Fortran must die
They are expressing it as the number of deaths divided by the number of known cases. Both are wrong, because some of those now sick will die, and it is not known how many are actually sick, since most mild cases will not come to the notice of the health care community. With a new disease, especially one like this, there are few options.

#### Whammamoosha

##### New Member
The source data will always be incomplete, but applying a bad approach to the available data will always be worse.

#### noetsi

##### Fortran must die
I think making estimates at all given a lack of data is rarely a good idea.

#### fed2

##### Active Member
It occurred to me that the problems with these case fatality ratios are somewhat linked to the reason why epidemiologists tend to express findings as relative risk or other ratios among members of a cohort. It's easier to estimate relative rates of things. The denominator in the fatality ratio is inherently a population-wide measure, and so requires the census bureau to do some sort of prevalence survey. On the other hand, the case-fatality ratio follows a sort of 'them the numbers jack' approach to stats. You got #dead, you got #cases, you divide, voila. At least you have some circumspection on the issue, which is what is required these days, I guess.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It is the age-old issue of making calculations using cross-sectional data. I can run logistic regression or survival analysis. The latter will provide the hazards for time to event, but people run logistic regression for 6-month survival, missing that rates may vary.

If the above topic gets your juices flowing, read papers by Miguel Hernán on time-to-event data.

PS: epidemiologists also use compartmental models, which plot recovered and death rates.
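For anyone who hasn't seen one, a compartmental model can be sketched in a few lines. This is a minimal SIRD (susceptible, infected, recovered, dead) model stepped with simple Euler integration. The function name `sird` and every rate below (`beta`, `gamma`, `mu`) are made-up illustrative values, not estimates for COVID-19 or any real disease.

```python
def sird(population=1_000_000, infected0=10,
         beta=0.3, gamma=0.07, mu=0.002, days=200, dt=0.1):
    """Toy SIRD model; all rates are hypothetical, per day.

    beta  : transmission rate
    gamma : recovery rate
    mu    : death rate among the infected
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    d = 0.0
    steps_per_day = round(1 / dt)
    history = []
    for day in range(days):
        # Record the state at the start of each day.
        history.append((day, s, i, r, d))
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            new_deaths = mu * i * dt
            s -= new_infections
            i += new_infections - new_recoveries - new_deaths
            r += new_recoveries
            d += new_deaths
    return history


for day, s, i, r, d in sird()[::50]:
    print(day, round(i), round(r), round(d))
```

In this toy model the share of closed cases that are deaths is fixed at `mu / (gamma + mu)` by construction, so it says nothing about the reporting-lag problem discussed above; it only shows how the recovered and death curves are produced.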