How did 4,707 coronavirus cases go missing? The devil’s in the data

The state has no plans to reconcile the two numbers — at least for now.

By Wallace McKelvey/PennLive
Vials with samples taken for the new coronavirus are counted before they are prepared for RNA testing at the molecular pathology lab at Tulane University School of Medicine in New Orleans, Thursday, April 2, 2020.

 Gerald Herbert / AP Photo

Somewhere in Philadelphia, 4,707 people are going about their day — or perhaps not — having survived or succumbed to COVID-19.

Their cases were recorded by local health officials, but those patients are missing from the state Department of Health’s official tally.

“The discrepancy is not a new thing,” said James Garrow, a spokesman for the Philadelphia Department of Public Health, who noted that the gap between the local and state accounting was once even larger.

Currently, the city is reporting 29,102 positive cases versus the state’s tally of 24,395. The number of deaths is also off, albeit to a far lesser extent: The city reports 1,675 deaths compared with the state’s 1,665.
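The size of the gap follows directly from the four figures above. A minimal sketch of the arithmetic, using only the numbers quoted in this story (the variable names are illustrative, not taken from either health department's system):

```python
# Figures as quoted in the story; variable names are illustrative only.
city_cases, state_cases = 29_102, 24_395
city_deaths, state_deaths = 1_675, 1_665

case_gap = city_cases - state_cases      # cases the city counts but the state does not
death_gap = city_deaths - state_deaths   # same comparison for deaths
case_gap_share = case_gap / city_cases   # gap as a fraction of the city's own tally

print(f"Case gap: {case_gap:,}")
print(f"Death gap: {death_gap}")
print(f"Share of city cases absent from the state tally: {case_gap_share:.1%}")
```

The case gap works out to the 4,707 in the headline, and the death gap to 10.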

Nate Wardle, a spokesman for the state health agency, said there are no doubts about the veracity of Philadelphia’s data. Regardless, the state has no plans to reconcile the two numbers — at least for now.

“Some of the cases that have been reported by Philadelphia have not been reported to the department at this time,” Wardle said in a written statement.

And the error ripples outward: Johns Hopkins University and The New York Times use Philadelphia’s figure; the Centers for Disease Control and Prevention, The Washington Post and PennLive use the state’s.

How did this happen?

Philadelphia and the state use different computer health reporting systems. Medical providers should be submitting information about coronavirus testing to both, but in practice that didn’t always happen, particularly in the early days of the pandemic.

Garrow said the data the state received through its system, one that was created in partnership with the CDC, was incomplete. Not only did that result in different totals, but the state’s database is also sometimes missing key information, including addresses and demographic data, such as race.

“Within the last couple of weeks, our departments came to an agreement on how to handle the difference,” he said. “Essentially, we acknowledged that early tracking between the two systems was not perfect, and we would go forward using PDPH data.”

Now, Garrow said, Philadelphia health officials upload each day’s testing data to the state database after checking patient addresses and adding any demographic data they have available.

This means the city’s and the state’s daily figures should match going forward. But the larger discrepancy will remain for the foreseeable future.

“There is every expectation that it will be done at some point,” Garrow said, “but our epidemiologists feel that it’s more important to focus on counting new cases in the middle of a pandemic as opposed to making sure that potentially months-old cases are both represented in databases.”

For example, Philadelphia launched its contact tracing program this month. Those efforts, aimed squarely at reducing future infections, are understandably a higher priority than correcting past errors.

Matt Rourke / AP Photo

Sheila Henry, center, waits in line to receive a COVID-19 test outside the Pinn Memorial Baptist Church in Philadelphia, Wednesday, April 22, 2020. 

It’s useful to put this discrepancy in context: If the state followed the lead of New Jersey and other states by reporting the missing cases all at once, it would more than quadruple the rolling average daily statewide case count.

That may be worth doing in the interest of transparency, but it wouldn’t add much to our understanding of COVID-19’s current spread, said Krys Johnson, a Temple University professor of epidemiology and biostatistics.

“This happens every Monday when all the tests from the whole weekend have been logged,” Johnson said. “We see a peak in testing.”

Ideally, she said, both systems need to reconcile their numbers retroactively. Dumping all of the old cases into a single day as though they were just reported would be counterproductive.

“What’s more important is to look at the trend, not necessarily the numbers themselves,” she said. “What has it looked like for the last seven, 10, 14 days?”
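Johnson's point about trends can be illustrated with a small sketch. It assumes a flat, entirely hypothetical baseline of 200 new cases a day (an arbitrary round number, not Pennsylvania's actual counts) and adds the 4,707-case backlog to a single day, then compares the 7-day trailing average with and without the dump. The function name and the day chosen for the dump are also assumptions made for illustration.

```python
def rolling_mean(counts, window=7):
    """Trailing mean over `window` days, starting once a full window exists."""
    return [
        sum(counts[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(counts))
    ]

# Hypothetical flat baseline: 14 days at 200 cases/day (NOT real state data).
baseline = [200] * 14
backlog = 4_707   # the gap reported in the story
dump_day = 10     # arbitrary day on which the whole backlog is logged

with_dump = baseline.copy()
with_dump[dump_day] += backlog

before = rolling_mean(baseline)
after = rolling_mean(with_dump)

for day, (b, a) in enumerate(zip(before, after), start=6):
    print(f"day {day:2d}: without dump {b:7.1f} | with dump {a:7.1f}")
```

In this toy series, every 7-day window that includes the dump day is lifted by the same amount, the backlog divided by seven, even though nothing about the underlying daily spread has changed. Averages for windows that end before the dump day are unaffected.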

Of course, Pennsylvania is already in a difficult spot when it comes to spotting recent trends because of a growing lag in processing and returning test results.

That, Health Secretary Rachel Levine said Thursday, “is a troubling national issue that will require a national solution.”

Johnson said the longer it takes for a test to be reported, the longer it takes for contact tracers to spring into action and the less useful the data becomes for preventing the future spread of the disease.

“There’s no public health response to do seven-to-10 days later,” she said. “People are already symptomatic, infectious and they’ve already been around other people.”

While Philadelphia’s is the largest and arguably most troubling discrepancy in the state’s data, due to the size of the city’s population and its status as an early hotspot, it’s hardly the only one.

A number of counties, including neighboring Delaware and Montgomery counties, have reported different data compared to the state. And others, such as Schuylkill, have seen major retroactive corrections upward or downward.

Wardle said the most common source of error in the state’s reporting system comes when the address reported by a testing lab is not the patient’s actual home address. “This can lead to the counties for cases changing once the address of residence is determined as part of the case investigation,” he said.

The system is also subject to lags.

“Providing daily data updates in the midst of a pandemic can lead to some data challenges,” Wardle said. “Typically, this type of data is not released as frequently, but we are working diligently to provide accurate and complete data.”

On Wednesday, for example, the state did not receive data from Philadelphia in time for its daily noon update. The result: No new Philly cases that day and a glut of cases the next.

Death data has also been a source of confusion due to the patient location question as well as the state’s inclusion of probable cases that were never conclusively confirmed, said Lycoming County Coroner Chuck Kiessling, who also serves as president of the Pennsylvania Coroners Association.

“If it’s a positive, it’s a positive,” he said. “We’re not going to mix in probables. That’s like me walking onto a scene, seeing white powder and assuming it’s an overdose.”

The trouble, of course, is that some of those who exhibited all of the symptoms of COVID-19 — particularly in the pandemic’s chaotic early months — were never tested due to supply shortages.

Early on, Kiessling said, many hospitals and nursing homes failed to notify their county coroners or the state, exacerbating differences between state and local data. For the most part, he said, that’s been resolved, although problems still crop up from time to time.

“In Lycoming and smaller counties without health departments, when there are questions, EMS and law enforcement and the general public reach out to the coroner’s office,” he said. “The coroners end up sorting this stuff out.”


PennLive and The Patriot-News are partners with PA Post.
