Exploratory Data Analysis of SARS-CoV-2 in Cook County Wastewater
Daniel P. Hall Riggins, MD
November 12, 2022
I am working on a project to support the Cook County Department of Public Health with wastewater surveillance of SARS-CoV-2. We want to detect surges in viral copies in wastewater and see whether those surges can act as early warnings of surges in hospital cases of COVID-19. In this post, I perform some preliminary exploratory analysis of the data.
Dataset and Prep
The data was derived from the CDC’s National Wastewater Surveillance System. Please see this git commit for the specification of the data preparation pipeline, which I built with the {targets} package.
Laboratories report results as the concentration of viral copies recovered per liter of wastewater at each sampling site. To make samples comparable across sampling sites, the CDC recommends standardizing by:
Efficiency of viral recovery during the sampling process (variable rec_eff_percent). This is estimated by spiking wastewater with a known quantity of a different virus and seeing what proportion is recovered.
Flow rate of wastewater at the sampling site (variable flow_rate in millions of gallons per day).
Number of people supplying waste to the sewershed (variable population_served).
After standardizing by all of these variables, the measurement units become millions of viral copies per day per person (variable M_viral_copies_per_day_per_person).
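As a rough sketch of how this standardization could be written out (my reading of the three adjustments above, not necessarily the exact formula used in the pipeline), let c be the reported concentration in viral copies per liter, r the value of rec_eff_percent, Q the value of flow_rate in millions of gallons per day, and P the value of population_served. The symbols c, r, Q, P, and M are introduced here only for illustration, and the sample values in the worked line are hypothetical, not taken from the dataset. The only outside constant is the conversion of about 3.785 liters per US gallon.

```latex
% Assumed form of the standardization; symbols are illustrative, not from the pipeline.
% c: copies per liter, r: recovery efficiency in percent,
% Q: flow in millions of gallons per day, P: population served,
% M: millions of viral copies per day per person.
\begin{align*}
M &= \frac{c}{r/100} \times \frac{Q \times 10^{6} \times 3.785}{P} \times \frac{1}{10^{6}} \\
  &= \frac{100\,c \times 3.785\,Q}{r\,P}
\end{align*}

% Worked example with hypothetical values:
% c = 100{,}000, r = 50, Q = 100, P = 1{,}000{,}000
\[
M = \frac{100 \times 100{,}000 \times 3.785 \times 100}{50 \times 1{,}000{,}000} \approx 75.7
\]
```

The factor of one million from the flow rate cancels against the conversion to millions of copies. Under these assumptions, the result is therefore the efficiency-corrected concentration times 3.785 Q, divided by the population.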
SARS-CoV-2 is sampled at 7 wastewater treatment plants in Cook County. The listing below has 8 rows because Stickney appears twice, as Stickney (1) and Stickney (2):
# A tibble: 8 × 1
1 Calumet, South Suburbs and Chicago
2 Egan, Far Northwest Suburbs
3 Hanover Park, Far Northwest Suburbs
4 Kirie, Mid Northwest Suburbs
5 Lemont, Far Southwest Suburbs
6 O'Brien, Northeast Suburbs and Chicago
7 Stickney (1), West Suburbs and Chicago
8 Stickney (2), West Suburbs and Chicago