Washington State Board of Health

Text mining and content analysis of 80,000 public comments on mandatory school COVID-19 vaccinations.

qualitative research
quantitative research
data analysis
text mining
natural language processing
sentiment analysis
Author

Elham Ali

Published

April 30, 2022

Challenge

The Washington State Board of Health (SBOH) considered adding the COVID-19 vaccine to the list of required immunizations for school entry, prompting a surge in public engagement. This proposal led to the formation of a technical advisory group in December 2021. However, the Board was overwhelmed by more than 30,000 emails and 50,000 community comments, including both legitimate inquiries and organized misinformation campaigns. With only two staff members available to manage media relations, public records requests, and customer service, SBOH lacked the capacity to analyze the feedback effectively. The challenge was to develop an approach that could handle the massive volume of data while understanding the context and sentiment of the responses.

Approach

Note

Watch my recorded workshop in partnership with Innovate(US) on “How to Apply Human-Centered Design to Government AI Projects.” Learn how I integrated human-centered design and equity principles into the design and use of AI for the Washington State Board of Health project. View code on GitHub

Amid strong public opposition to mandatory COVID-19 vaccinations in schools, the Washington State Board of Health needed data-driven insights to understand public satisfaction, dissatisfaction, and factors influencing vaccine acceptance. Collaborating with U.S. Digital Response, I worked alongside Dr. Chimobi Ucha, a computer and data scientist, and the Board’s Communications Manager and Policy Team to analyze 80,000 public comments from emails, survey responses, and public meeting transcripts.

My approach began with sentiment analysis and text mining of all feedback in R. My teammate and I built a phased pipeline, using a combination of R libraries, to process comments from town meeting transcripts, emails, and survey responses. The pipeline consisted of a scraper, a relevancy model, a sentiment model, a text analysis model, and a thematic analysis stage, which together allowed us to classify comments as positive, neutral, or negative.
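To make the relevancy and sentiment stages concrete, here is a minimal sketch. It is written in Python for illustration only; the project itself used R libraries, and this is not the project's code. The word lists and the function names (`is_relevant`, `classify_sentiment`, `run_pipeline`) are hypothetical, and a real analysis would rely on a full sentiment lexicon or a trained model rather than a handful of hand-picked words.

```python
import re

# Hypothetical keyword lists, for illustration only.
RELEVANT_TERMS = {"vaccine", "vaccines", "vaccination", "mandate", "covid", "immunization"}
POSITIVE = {"support", "safe", "protect", "thank", "good"}
NEGATIVE = {"oppose", "unsafe", "harm", "against", "bad"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_relevant(comment):
    """Relevancy stage: keep comments that mention at least one topic term."""
    return any(token in RELEVANT_TERMS for token in tokenize(comment))


def classify_sentiment(comment):
    """Sentiment stage: count lexicon hits and label the comment."""
    tokens = tokenize(comment)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def run_pipeline(comments):
    """Filter out off-topic comments, then label the remaining ones."""
    return [(c, classify_sentiment(c)) for c in comments if is_relevant(c)]
```

Calling `run_pipeline` on a list of comment strings returns only the on-topic comments, each paired with its label. Keeping the stages as separate functions mirrors the phased design described above: each stage can be checked and improved on its own.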

We focused on identifying key names, organizations, and keywords associated with misinformation campaigns and used frequency analysis and topic modeling to identify common themes. A random sample of comments was further analyzed thematically using ATLAS.ti and Miro. This systematic approach raised our precision-recall score from 34% to 91-92%, allowing us to uncover the underlying beliefs, values, and attitudes driving public opinion on the vaccine mandate.
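As a rough illustration of two ideas in this paragraph, the sketch below counts the most frequent words across comments and computes precision and recall for one label against hand-coded labels. It is written in Python for illustration only (the project used R), the stopword list and function names are hypothetical, and it does not reproduce how the scores reported above were calculated.

```python
import re
from collections import Counter

# Small hypothetical stopword list; real analyses use a fuller one.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "for", "my", "our", "in", "on", "i", "this"}


def top_words(comments, n=3):
    """Frequency analysis: return the n most common non-stopword tokens."""
    counts = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


def precision_recall(predicted, actual, target="negative"):
    """Compare model labels with hand-coded labels for one target class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == target and a == target for p, a in pairs)  # correct hits
    fp = sum(p == target and a != target for p, a in pairs)  # false alarms
    fn = sum(p != target and a == target for p, a in pairs)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision is the share of comments the model flagged with a label that truly carry it, and recall is the share of comments truly carrying the label that the model found. Checking both against a hand-coded sample is one way to see whether changes to a pipeline actually help.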

Results

Our findings helped the Board better understand public sentiment and shaped more effective communication strategies, centered on open and transparent discussions in public meetings. The insights we provided also informed improvements to the data collection process, so that the Board can gauge resident opinions and respond proactively to concerns. The results and our process, including the analysis code and repository, were shared with the Board for future use.

Analysis of the most frequent words in survey responses


Most common words found in public comments


Mapping keywords and influential entities associated with COVID-19 in Washington State


Examples of public comments and key highlights


Wordcloud of top keywords in clusters of themes


Visual representation of initial analysis from dominant words to more nuanced themes


Testimonial

“Oh my gosh this is amazing. I wish our agency had funding to hire people to do this work. These are the insights that our agency needs to inform policy, approach, communication, how we are engaging, etc. This is fascinating. We were aware of [reciprocity concerns / opportunities], but didn’t know the extent or have the language. Deepest gratitude & appreciation for this team. You were amazing & so helpful. The work & presentation was very informative & extremely helpful. I will be using it & see great value to the rest of the team.”

--Kellie Kahler, Communications Manager, WA State Board of Health


Citation

BibTeX citation:
@online{ali2022,
  author = {Ali, Elham},
  title = {Washington {State} {Board} of {Health}},
  date = {2022-04-30},
  url = {https://www.elhamyali.com/},
  langid = {en}
}
For attribution, please cite this work as:
Ali, Elham. 2022. “Washington State Board of Health.” April 30, 2022. https://www.elhamyali.com/.