COVID-19 Data Set

The Starschema COVID-19 Data Set collates a range of important resources for assessing the impact and severity of the COVID-19 pandemic and the response to it. The data is stored in Snowflake for ease of access. Detailed information about the contents of the data set is available in the project's GitHub repository. The METADATA table in the Snowflake Data Marketplace share contains column-level information about the tables that make up the share.

A range of data sets have been published that are useful for monitoring and understanding the spread of COVID-19. Our aim is to collate, curate and unify the most valuable of these data sources so that enterprises, individuals and public health experts can assess the situation and make data-driven decisions. This single source blends easily with other data sources, so you can analyze the course of the COVID-19 pandemic over time, in any context.

To date, over a thousand organizations worldwide, from global corporations to public institutions and NGOs, have relied on the Starschema COVID-19 Data Set to plan their response to the pandemic. Snowflake CEO Frank Slootman described its impact during his keynote at the 2021 Snowflake Summit.

As of January 2021, the Starschema COVID-19 Data Set includes vaccine tracking, providing verifiable, public data such as doses allocated, shipped and administered, both worldwide and at the US state level.

The data set currently includes the following sources:

| Name | Source | Table name |
| --- | --- | --- |
| US COVID-19 testing and mortality | The COVID Tracking Project | CT_US_COVID_TESTS |
| Global data on healthcare providers | OpenStreetMap, via Healthsites.io | HS_BULK_DATA |
| Global case counts | JHU CSSE | JHU_COVID_19 |
| US healthcare capacity by state, 2018 | The Henry J. Kaiser Family Foundation | KFF_HCP_CAPACITY |
| US policy actions by state | The Henry J. Kaiser Family Foundation | KFF_US_POLICY_ACTIONS |
| US actions to mitigate spread, by state | The Henry J. Kaiser Family Foundation | KFF_US_STATE_MITIGATIONS |
| ICU beds by county, US | The Henry J. Kaiser Family Foundation | KFF_US_ICU_BEDS |
| Italy case statistics, summary | Protezione Civile | PCM_DPS_COVID19 |
| Italy case statistics, detailed | Protezione Civile | PCM_DPS_COVID19_DETAILS |
| WHO situation reports | World Health Organization | WHO_SITUATION_REPORTS |

The COVID-19 data set enables enterprises, individuals and public health authorities to make data-driven decisions. Using the data on local case counts, enterprises can monitor the integrity of their supply chains and anticipate disruptions. Public health authorities can track the spread of COVID-19 and use the data to support public health measures such as school closures and estimate the relative risk of incidence in their region. Presented in an analytics-ready format and diligently maintained, the data set provides a single source of truth to enable decision-making based on the most reliable and accurate data available.
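To make the supply-chain use case concrete, here is a minimal Python sketch of blending regional case counts with a company's own supplier list. Everything in it is an assumption made for illustration: the row layout `(region, date, new_cases)`, the supplier records, the `flag_suppliers` helper, the threshold, and the sample numbers (which are invented, not taken from the data set). A real workflow would read the rows from one of the tables in the share instead.

```python
from collections import defaultdict

# Invented sample rows, shaped as (region, date, new_cases).
# These values are placeholders, not real figures from the data set.
case_rows = [
    ("Lombardia", "2020-03-20", 300),
    ("Lombardia", "2020-03-21", 350),
    ("Lazio", "2020-03-20", 40),
    ("Lazio", "2020-03-21", 55),
]

# Hypothetical company data: each supplier is tagged with its region.
suppliers = [
    {"name": "Supplier A", "region": "Lombardia"},
    {"name": "Supplier B", "region": "Lazio"},
    {"name": "Supplier C", "region": "Sicilia"},  # no case rows for this region
]


def flag_suppliers(case_rows, suppliers, threshold):
    """Return names of suppliers whose region's summed new cases meet the threshold."""
    totals = defaultdict(int)
    for region, _date, new_cases in case_rows:
        totals[region] += new_cases
    # Regions without any case rows count as zero.
    return [s["name"] for s in suppliers if totals.get(s["region"], 0) >= threshold]


print(flag_suppliers(case_rows, suppliers, threshold=500))  # ['Supplier A']
```

Joining on a shared region key keeps the company's data separate from the public data, so either side can be refreshed independently.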

In addition, to give citizens, enterprises, and NGOs a view of progress towards meeting the quantitative gating criteria in every US state, we created the Case Trajectory Status Visualization. This publicly available interactive data visualization shows the trajectory of cases as well as the trajectory of positive tests as a percent of total tests.
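As a rough illustration of what a trajectory check involves, the Python sketch below classifies the most recent window of a daily series as trending downward, upward or flat, and computes a percent-positive series, taken here to mean positive results as a share of all tests. The 14-day window, the least-squares slope test and every function name are assumptions made for this example; they are not taken from the visualization's implementation.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values plotted against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def trajectory(values, window=14):
    """Classify the last `window` points by the sign of their fitted slope."""
    s = slope(values[-window:])
    if s < 0:
        return "downward"
    if s > 0:
        return "upward"
    return "flat"


def percent_positive(positives, totals):
    """Positive results as a percent of all tests; None on days with no tests."""
    return [100.0 * p / t if t else None for p, t in zip(positives, totals)]
```

For example, `trajectory(daily_cases)` could be applied to a state's daily case counts, and `trajectory` could be applied again to the output of `percent_positive` after dropping any `None` entries. A fitted slope is only one way to define a trend; comparing the averages of two consecutive periods would be an equally reasonable choice.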

The visualization is also available as a free-of-charge starter dashboard that can be integrated with your company's own data sets for deeper, more company-specific analysis.

Request access to the free, public Starschema COVID-19 Epidemiological Data Set on the Snowflake Data Marketplace.

Special thanks to our partners Snowflake, Tableau, Mapbox, Path, and Datablick for collaborating on this project.