Covid Experience Dataset

In March 2024 I shared a survey online to see how covid-cautious people are, and whether there’s a correlation between levels of caution and number of covid infections. The resulting dataset is rich, complex, and deeply flawed (like all surveys and most real datasets!). There’s scope both for purely statistical investigations and for deep discussions about the ways in which such a survey is flawed.

I’m now releasing the full dataset (and it’s live, so more results may come in) for student projects and any other investigations. Like all material on this site, it is shared under a Creative Commons Attribution Non-Commercial Share Alike License, and I would love it if you shared your projects, thoughts, and results back with us, so that we can share them with interested teachers and students.

The original survey is still available to fill in.

[Screenshot of the Covid Experience Questionnaire. The introduction reads: "The goal of this survey is to get a sense of how cautious you are being about covid, how many times you have had covid, and whether there's any connection. This form is anonymous, and entirely voluntary." The first question reads: "How often do you take these covid19 precautions?"]

And the dataset is available here as a Google Sheet that you can download as an Excel file or CSV, among other formats. Bear in mind that columns O, P, Q, T, and W contain free text and we cannot guarantee that people won’t write inappropriate things. You might want to delete those columns before sharing with your students.
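If you download the sheet as a CSV, removing the free-text columns can be scripted. Here is a minimal sketch using only the Python standard library. The column letters come from the list above; the function names and file paths are placeholders of my own choosing. Because the sheet is live, it's worth checking the header row first to confirm the letters still point at the free-text questions.

```python
import csv

# Column letters named in this post as containing free text.
FREE_TEXT_COLUMNS = ["O", "P", "Q", "T", "W"]


def column_index(letter):
    """Convert a spreadsheet column letter (A, B, ..., Z, AA, ...) to a 0-based index."""
    index = 0
    for char in letter.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


def drop_columns(rows, letters):
    """Return a copy of rows with the given spreadsheet columns removed."""
    to_drop = {column_index(letter) for letter in letters}
    return [
        [cell for i, cell in enumerate(row) if i not in to_drop]
        for row in rows
    ]


def strip_free_text(in_path, out_path, letters=FREE_TEXT_COLUMNS):
    """Read a downloaded CSV and write a copy without the free-text columns."""
    with open(in_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(drop_columns(rows, letters))
```

For example, `strip_free_text("covid_survey.csv", "covid_survey_classroom.csv")` would write a classroom-safe copy, assuming you saved the download under that (made-up) file name.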

Some of the obvious problems include:

  • Sampling bias – people motivated to fill in the survey are probably more covid cautious than average, plus my connections are probably even more so.
  • Many, MANY problems with the questions. Probably the worst is that the first question has no option for “I don’t do this activity at all, it’s too risky”.
  • No one knows for sure how many covid infections they’ve had, because infections can be asymptomatic, or dismissed as a cold or flu. People don’t always test, tests aren’t always accurate, etc.

What other problems do you see? What problems will your classes come up with? And what analyses will they do? It’s interesting to create a “risk aversion” score based on the first question. There’s no obviously correct way to do that, so I would love to know what your students come up with!
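To get a discussion started, here is one possible scoring scheme, sketched in Python. It is only an illustration, not a recommended method. The answer labels ("Never", "Sometimes", "Usually", "Always") and the points assigned to them are assumptions of mine, so check the sheet for the real wording. The score is the mean points over the precautions a respondent answered, scaled to run from 0 to 1.

```python
# Hypothetical answer labels and weights: replace with the wording used in the sheet.
FREQUENCY_POINTS = {"Never": 0, "Sometimes": 1, "Usually": 2, "Always": 3}


def risk_aversion_score(answers, points=FREQUENCY_POINTS):
    """Score one respondent from their answers to the first question.

    Blank or unrecognised answers are skipped rather than counted as zero.
    Returns a value between 0 and 1, or None if no answer could be scored.
    """
    scored = [points[answer] for answer in answers if answer in points]
    if not scored:
        return None
    return sum(scored) / (len(scored) * max(points.values()))
```

Every choice here is debatable, which is the point. Should all precautions count equally? Is the gap between "Never" and "Sometimes" the same size as the gap between "Usually" and "Always"? And skipping blanks does nothing to fix the missing “I don’t do this activity at all” option.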

Another challenge: how would you visualise the results of the first question?
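One option among many is to tally the answers for each precaution and show the shares side by side, which is the data you would need for a stacked bar chart. The sketch below does the tally and prints crude text bars using only the Python standard library. The precaution names and answer labels in the example are invented for illustration and will not match the real column headings.

```python
from collections import Counter


def tally_by_precaution(responses):
    """Count answers per precaution.

    responses: list of dicts mapping precaution name -> answer text.
    Returns {precaution: Counter(answer -> count)}; blank answers are ignored.
    """
    tallies = {}
    for response in responses:
        for precaution, answer in response.items():
            if answer:
                tallies.setdefault(precaution, Counter())[answer] += 1
    return tallies


def text_bars(tallies, order, width=20):
    """Return one text line per (precaution, answer) showing its share as a bar."""
    lines = []
    for precaution, counts in tallies.items():
        total = sum(counts.values())
        for answer in order:
            share = counts[answer] / total
            bar = "#" * round(share * width)
            lines.append(f"{precaution:<20} {answer:<10} {bar} {share:.0%}")
    return lines


# Made-up responses, purely to show the shape of the input.
sample = [
    {"Mask indoors": "Always", "Avoid crowds": "Sometimes"},
    {"Mask indoors": "Always", "Avoid crowds": ""},
    {"Mask indoors": "Never", "Avoid crowds": "Sometimes"},
]
for line in text_bars(tally_by_precaution(sample), ["Never", "Sometimes", "Always"]):
    print(line)
```

Showing shares rather than raw counts makes precautions with different numbers of blank answers comparable, but it also hides how many people answered each one, which students may want to display as well.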

Please share this page widely, and share your ideas and results back, either as comments here, by email to contact@adsei.org, or in the Teachers Using Data Science Facebook group.

To support work like this, and build data literacy and STEM skills in schools, you can donate at givenow.com.au/adsei/

2 thoughts on “Covid Experience Dataset”

  1. @lindamciver
    #COVID

    Thank you for sharing!

    This partly answers a question I have had. And it aligns with the experience of people I know.

    For this data set, small as it is, in four years people on average have been infected:
    5 times +/- 1.2.

    2.4% no times

    My expectation based on modeling was 1.2-1.6 times per year. This is 1.25.

    With the latent impacts on immune health, and aging, this bodes very poorly for the future. That cannot now be changed.

    ~1 person in 30 may be ok in 3 yrs
