At the start of the COVID-19 pandemic, we launched a year-long survey of healthcare workers and their families within the Nurses’ Health Study (NHS) and Growing Up Today Study (GUTS) cohorts. Surveys invited participants to ‘add anything else you would like to tell us’ in text responses of unlimited length. We analyzed 18,197 comments (37% of all participants chose to comment) from the final survey, launched in March 2021, using the open-source deep learning topic modeling algorithm BERTopic, which leverages transformer neural networks and term frequency-inverse document frequency (TF-IDF) to group comments with similar meaning into clusters. We found 85 distinct topics, each identified by its top four associated words and an example comment; upon review, we categorized the topics into 13 themes. The results reveal pandemic impacts on the social, political, and physical well-being of participants, including worker burnout, feelings regarding vaccination, and the double burden faced by essential workers caring for children. While this automated technique requires human supervision, it greatly reduces the work of hand-coding large numbers of data entries. The comment topics can be used in mixed-methods research to describe study participants’ experiences or to predict factors such as loneliness, job quitting, or vaccine acceptance.
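To illustrate how each topic can be labeled by its top four associated words, the following is a minimal sketch of a class-based TF-IDF weighting of the kind BERTopic is documented to use. It is not the study's code and does not call the BERTopic library: it assumes the comments have already been grouped into clusters (the transformer embedding and clustering steps are not shown), the function names `tokenize` and `top_words_per_topic` are hypothetical, and the toy comments are invented for illustration rather than taken from the survey.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words_per_topic(comments_by_topic, n_words=4, stop_words=frozenset()):
    """Rank words per topic with a simplified class-based TF-IDF.

    Assumption: all comments in a topic are pooled into one 'class document',
    and each word is scored as (frequency within the topic) multiplied by
    log(1 + average words per topic / frequency across all topics).
    """
    if not comments_by_topic:
        return {}

    # Word counts for each topic, pooling all of that topic's comments.
    term_counts = {
        topic: Counter(
            word
            for comment in comments
            for word in tokenize(comment)
            if word not in stop_words
        )
        for topic, comments in comments_by_topic.items()
    }

    # Word counts across every topic, and the average topic length.
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)
    avg_words = sum(total_counts.values()) / len(term_counts)

    top = {}
    for topic, counts in term_counts.items():
        topic_size = sum(counts.values())
        scores = {
            word: (tf / topic_size) * math.log(1 + avg_words / total_counts[word])
            for word, tf in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        top[topic] = ranked[:n_words]
    return top


# Invented toy clusters, standing in for comments already grouped by meaning.
toy_clusters = {
    "vaccine": [
        "Got my vaccine dose today",
        "Second vaccine dose made my arm sore",
    ],
    "burnout": [
        "Exhausted after every shift",
        "Every shift leaves me exhausted and burned out",
    ],
}

for topic, words in top_words_per_topic(toy_clusters).items():
    print(topic, words)
```

Words that are frequent within one cluster but rare across the others receive the highest scores, which is why such words can serve as short topic labels. In practice a stop-word list would be passed in so that common function words do not dominate the labels.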
Lay Abstract
The COVID-19 pandemic affected people’s lives in ways that were hard to predict at its outset. To understand the impact of the pandemic on healthcare workers and their families, we sent a year-long series of surveys to participants in the Nurses’ Health Study (NHS) and Growing Up Today Study (GUTS). We knew that standard check-box survey questions wouldn’t capture the breadth of people’s experiences, so we invited participants to add their comments in an open box labeled ‘Please add anything else you would like to tell us.’ In the last (March 2021) survey, we received 18,197 comments. We used artificial intelligence to sort through and categorize similar responses. The algorithm found 85 different topics, which fall under 13 general themes. The results reveal widespread pandemic impacts on the social, political, and physical well-being of participants, including worker burnout, feelings regarding vaccination rollout, and the double burden faced by essential workers caring for children. This automated method is far faster than the traditional method of reading and hand-coding each comment, enabling us to use all the comments to detect relevant topics. We hope that policymakers and public health officials can use this method to make informed decisions and predictions about pandemics.
Clinical Implications
Providing patients and participants with an open-field comment box yields free-text data from which artificial intelligence can efficiently uncover relevant topics in clinical studies. Combining this qualitative information with quantitative responses enables a mixed-methods research approach that can lend context and unexpected insight to standard quantitative methods.