Keywords
Measles, Pertussis, Whooping cough, epidemiology, health surveillance, vaccination
Vaccine coverage for common infectious diseases such as Measles and Pertussis (also known as whooping cough) has been declining in England and Wales since 2014. Consequently, significant increases in Measles and Pertussis cases are being observed in the community.
To explore whether Google Trends offers predictive utility as a health surveillance tool for Measles and Pertussis in England and Wales.
Google search data related to Measles and Pertussis, including common associated symptoms, were downloaded for 52 weeks from 07/01/2023 – 07/01/2024. Measles and Pertussis case data were retrieved from the weekly Notification of Infectious Disease (NOID) reports.
The associations between searching and case data were explored using time-series analyses, including cross-correlations, Prais-Winsten regression and joinpoint analysis.
Significant cross-correlations were found for Measles cases and “measles” searching (r=.41) at a lag of -1 week. For Pertussis cases, searching for “whooping cough” (r=.31), “cough” (r=.39), “100 day cough” (r=.41) and “vomiting” (r=.42) were significantly correlated at a lag of -3 to -2 weeks. In multivariable regression, “measles” remained significantly associated with Measles cases (β=.24, SE=.33, p=.02) as did “whooping cough” (β=.71, SE=.27, p=.01) and “cough” (β=1.99, SE=.54, p=.001) for Pertussis.
Increases in Measles and Pertussis cases follow increases in online searches for both diseases and selected respective symptoms. Further work is required to explore how GT can be used in conjunction with other health surveillance systems to monitor or even predict disease outbreaks, to better target public health interventions.
In England and Wales, declining vaccine coverage for Measles and Pertussis since 2014 has led to a resurgence of these diseases. This study explores the potential of Google Trends (GT) as a predictive health surveillance tool for monitoring Measles and Pertussis (whooping cough) outbreaks. Google search data related to these diseases and their associated symptoms were analysed alongside disease case data. Significant correlations were found between online search trends and disease cases, with Measles cases correlating with searches for "measles" and Pertussis cases with terms like "whooping cough" and "cough." These correlations suggest that increases in online searches precede rises in the number of new diagnoses of Measles and Pertussis, indicating GT could be useful in predicting outbreaks. Integrating GT with traditional surveillance systems could enhance early detection and response to disease outbreaks, enabling more targeted public health interventions. Further research is needed to explore the full potential and limitations of GT in disease surveillance and prediction.
Vaccinations are one of the most effective public health interventions1. The UK’s childhood vaccination programme is offered up to the age of 5, protecting against 13 infectious diseases2. Since 2014, there has been a decreasing trend in vaccine uptake, which has been exacerbated by the COVID-19 pandemic3. In addition to vaccine uptake, the timeliness of administration is also critical to prevent outbreaks, particularly for Measles and Pertussis, where evidence suggests delays in administration of the first dose are the biggest predictors of non-completion of the vaccination schedules4. The proportion of children at 24 months who have completed their first dose of the Measles, Mumps and Rubella (MMR) vaccine has recently fallen to 89.2%, down from 90.3% the previous year (with coverage falling below 90% in 61 of the 149 Local Authorities), and the proportion completing the second dose by age 5 is down nearly 1% from the previous year5. Between 2022–2023 the prenatal vaccination rate for Pertussis was 60.7%, representing a 4% drop from the year ending 2022 and a 7.1% decrease from 20216. Vaccine rates for Measles and Pertussis in the UK are below the 95% target set by the World Health Organisation7.
As a consequence of decreasing vaccination rates, waning vaccine protection and the high transmissibility of both Measles and Pertussis, cases have been steadily rising in England and Wales8. In the last 6 months of 2023 there were 1459 cases of Pertussis and 999 cases of Measles, compared to 428 and 658 in the first 6 months, respectively. These data are also substantially higher compared to the same period in 2022 (337 and 397) and in 2021 (296 and 249)9.
Early identification of outbreaks is critical. Firstly, it allows targeted interventions to be implemented, preventing further transmission, protecting vulnerable populations, and reducing mortality and morbidity (including reducing secondary infection risk)10. Secondly, it provides a platform for communication and public awareness efforts. Rapid dissemination of information to healthcare providers, communities, and the public is critical for promoting preventative measures, encouraging vaccination and dispelling misinformation, contributing to the broader goal of achieving and maintaining population immunity11. There is a critical need to explore mechanisms that complement traditional health surveillance techniques, particularly those operating in real time, to monitor cases or even predict outbreaks.
There is a growing body of research evidencing the utility of online health information searching (OHIS) for health research across several domains, exploring the impact of major disease awareness programmes12–14, the relationship between search data and disease burden measures15 and how outbreak notifications impact OHIS16. OHIS has also been used to predict and monitor disease outbreaks in combination with traditional health surveillance systems and has been shown to accurately predict cases of COVID-1917, HIV18, Zika19, self-harm20 and influenza21. However, OHIS has not been used to predict outbreaks of childhood infectious diseases, such as Measles and Pertussis, in the UK.
In this exploratory time-series study, we explored relationships between OHIS for Measles and Pertussis and cases in the context of the current outbreaks in England and Wales, to investigate whether Google Trends (GT) offers a predictive utility as a health surveillance tool.
No patients were involved in the study. Public members contributed to the conception of the study as part of wider Patient and Public Involvement group discussion around vaccine preventable, childhood infectious diseases. Members discussed the common use of internet searching for signs and symptoms in their children prior to seeking healthcare. Members further stressed the importance of this facility as a key trigger for subsequent healthcare seeking and wider recognition of childhood infectious diseases.
Anonymised search data can be retrieved using the GT platform which is free and publicly accessible. Data is provided as a Relative Search Volume (RSV), whereby searches are scaled between 0–100, where 100 represents its maximum popularity in a specified geography and time period22. Duplicate searches, those including special characters or with very low search volumes are omitted. To ensure transparency and the reproducibility of GT-based research, reporting guidelines recommended by Nuti et al. were adhered to23.
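As a rough illustration of the 0–100 scaling described above, the following minimal sketch rescales a series of raw weekly counts so that the busiest week scores 100. The function name `relative_search_volume` and the example counts are hypothetical. The sketch covers only the final rescaling step; Google does not publish the full calculation behind RSVs, so it should not be read as GT's actual method.

```python
def relative_search_volume(counts):
    """Scale raw counts to 0-100 relative to the series maximum.

    Illustrative only: the real GT normalisation is not public.
    """
    peak = max(counts)
    if peak == 0:
        # No searches at all in the period: every week scores 0.
        return [0 for _ in counts]
    return [round(100 * c / peak) for c in counts]


# Hypothetical weekly search counts (not study data).
weekly_counts = [120, 300, 600, 450]
rsv = relative_search_volume(weekly_counts)
print(rsv)
```

Because every value is expressed relative to the peak within the chosen geography and time window, RSVs downloaded for different windows are not directly comparable.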
In England and Wales, Measles and Pertussis are both notifiable diseases; clinical testing laboratories and medical practitioners are lawfully required to inform the UK Health Security Agency (UKHSA) of microbiologically or symptomatically suspected cases24. Measles and Pertussis case data were retrieved from the weekly Notification of Infectious Disease (NOID) reports published on the government website on 22 January 202425. Cases data were extracted from 01 January 2023 (week 1) to the week of 07 January 2024 (week 52).
Search terms were common synonyms and symptoms of Measles and Pertussis as detailed on the UK National Health Service (NHS) website26,27. Search terms for Measles were “measles”, “fever”, “runny nose”, “blocked nose”, “cough”, “sneezing”, “watery eyes”, “sore eyes”, “rash” and “spots”. For Pertussis, search terms were “pertussis”, “whooping cough”, “cough”, “100 day cough” and “vomiting”. Search terms were entered as ‘topics’. GT describes a topic as ‘a group of terms that share the same concept in any language’. The topic feature encompasses searches for relevant subthemes. For instance, “Measles” includes Google Trends data for the search input “Measles symptoms”22. Additionally, the topic feature encompasses linguistic variations, incorporating searches conducted by non-English speakers or in an alternative language. The category filter was set to ‘all’. Data were downloaded from GT on 22 January 2024. No ethical approvals were required for this study as all data are publicly available and anonymised.
Time-series cross-correlations were used to explore the relationship between search terms and case data over time. The advantage of using cross-correlations is that they account for the time dependence (lag) in the relationship between two time-series variables. In this case, a significant cross-correlation at a negative lag (in weeks) would suggest that searching might predict case data in the subsequent weeks (the number of weeks being given by the lag value). Conversely, significant correlations at positive lags would indicate that cases may predict OHIS that many weeks later.
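The lag convention described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the SPSS routine used in the study: it computes a plain Pearson correlation on the overlapping part of the two series at each lag, and the function name `cross_correlation` and the synthetic data are hypothetical. A negative lag pairs cases in week t with searches from |lag| weeks earlier.

```python
import numpy as np


def cross_correlation(cases, searches, max_lag=4):
    """Return {lag: r}, where r correlates cases[t] with searches[t + lag].

    lag < 0: searches lead cases; lag > 0: cases lead searches.
    """
    cases = np.asarray(cases, dtype=float)
    searches = np.asarray(searches, dtype=float)
    n = len(cases)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            c, s = cases[-lag:], searches[:n + lag]
        elif lag > 0:
            c, s = cases[:n - lag], searches[lag:]
        else:
            c, s = cases, searches
        result[lag] = float(np.corrcoef(c, s)[0, 1])
    return result


# Synthetic example (not study data): cases copy searches two weeks later.
rng = np.random.default_rng(0)
searches = rng.normal(size=52)
cases = np.roll(searches, 2)
ccf = cross_correlation(cases, searches)
best_lag = max(ccf, key=ccf.get)
print(best_lag, round(ccf[best_lag], 2))
```

In this synthetic example the strongest correlation falls at a lag of -2, which is the pattern that would indicate searching rising two weeks before cases.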
To further explore the association between searching and cases, a Prais-Winsten regression analysis was conducted due to the likely presence of serial correlation in the time series data. Variables were selected for the regression analysis if significantly correlated, as determined by Spearman’s Rank test. Univariable and multivariable models were fitted to better understand the predictive utility of respective search terms. Regression results are presented as coefficients and their standard error (SE). A p value of less than .05 was considered significant.
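For readers unfamiliar with Prais-Winsten regression, the sketch below shows the general idea: estimate the first-order autocorrelation (rho) of the residuals, transform the data to remove it while retaining the first observation, refit, and repeat until rho stabilises. It is a minimal NumPy illustration on synthetic data, not the SPSS procedure used in the study; the function name `prais_winsten`, the tolerance and the simulated series are assumptions made for the example, and standard errors are omitted.

```python
import numpy as np


def prais_winsten(y, x, tol=1e-6, max_iter=50):
    """Iterated Prais-Winsten estimate for y = b0 + b1*x + AR(1) error.

    Returns (beta, rho); beta[0] is the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary OLS
    rho = 0.0
    for _ in range(max_iter):
        resid = y - X @ beta
        new_rho = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        new_rho = float(np.clip(new_rho, -0.999, 0.999))
        w = np.sqrt(1.0 - new_rho ** 2)
        # Quasi-difference rows 2..n; rescale row 1 instead of dropping it.
        y_star = np.concatenate([[w * y[0]], y[1:] - new_rho * y[:-1]])
        X_star = np.vstack([w * X[0], X[1:] - new_rho * X[:-1]])
        beta = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return beta, rho


# Synthetic 52-week example (not study data): true slope 0.5, AR(1) errors.
rng = np.random.default_rng(1)
n = 52
x = rng.normal(size=n)
err = np.zeros(n)
for t in range(1, n):
    err[t] = 0.7 * err[t - 1] + rng.normal(scale=0.1)
y = 2.0 + 0.5 * x + err
beta, rho = prais_winsten(y, x)
print(beta, rho)
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from the related Cochrane-Orcutt procedure, which matters with short series such as 52 weekly points.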
Temporal trends of cases and significantly associated search terms, as determined by the Prais-Winsten regression, were also explored using a Joinpoint regression analysis. A Weighted Bayesian Information Criterion (WBIC) was used to apply a single joinpoint to case and search term data to identify the week with the most significant trend change. The identification of a joinpoint in searching data in a week that precedes the joinpoint in case data would also be indicative of predictive utility. Analyses were conducted using Statistical Package for the Social Sciences (version 26.0)28 and the Joinpoint Analysis Program (version 5.0.2)29.
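The idea of locating a single joinpoint can be illustrated with a simple grid search: for each candidate week, fit two connected straight-line segments by least squares and keep the week with the smallest residual error. This is a simplified stand-in, assuming a linear scale and a sum-of-squares criterion; it does not reproduce the Joinpoint program's WBIC-based model selection, and the function name `single_joinpoint` and the synthetic series are hypothetical.

```python
import numpy as np


def single_joinpoint(y, min_segment=3):
    """Find the week where a continuous two-segment linear fit is best.

    Returns (week, beta), where beta = [intercept, slope before the
    joinpoint, change in slope after the joinpoint].
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1, dtype=float)
    best = None
    for k in range(min_segment, len(y) - min_segment + 1):
        # Hinge term is zero up to week k, then rises linearly.
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - k, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ beta) ** 2))
        if best is None or sse < best[1]:
            best = (k, sse, beta)
    return best[0], best[2]


# Synthetic example (not study data): flat until week 40, then rising.
weeks = np.arange(1, 53)
series = 10.0 + 5.0 * np.maximum(weeks - 40, 0)
week, beta = single_joinpoint(series)
print(week, beta)
```

Running the same search on a search-term series and on the matching case series, then comparing the two weeks returned, mirrors the comparison of joinpoints described above.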
The total number of cases for Measles and Pertussis over the 52-week study period (01/01/2023 – 07/01/2024) and the average number of cases per week are presented in Table 1. Measles and Pertussis cases with the RSV for “Measles” and “Pertussis” are illustrated in Figure 1. The RSV for other search terms across the study period are presented in Figure 2.
Disease | Number of cases | Mean cases per week (SD) | Range (min-max) |
---|---|---|---|
Measles | 1657 | 31.87 (15.14) | (9-75) |
Pertussis | 1887 | 36.29 (37.63) | (6-168) |
Linear associations between case data and searching were explored using time-series cross-correlations, as shown in Table 2. For Measles cases, searching for “measles” was significantly positively correlated from a lag of -1 weeks through to +3 weeks. “Cough” searching was positively correlated with cases from a lag of -4 weeks. “Runny nose”, “blocked nose” and “sore eyes” showed weak but positive correlations at lag 0. For Pertussis cases, “Pertussis” searching was significantly and positively correlated from -1 weeks onwards. Searching for “whooping cough”, “cough” and “100 day cough” were all significantly positively correlated with cases from -3 weeks to +3 weeks, and “vomiting” from -2 weeks to +2 weeks.
Spearman’s Rank correlations between cases and search terms are provided in Table 3 and Table 4. Search terms significantly correlated with respective cases were included in the regression analyses. Univariable and multivariable regression analyses (see Table 5) identified the individual and combined effects of search terms on Measles and Pertussis case numbers. In the univariable analysis, the search term “measles” was significantly positively associated with Measles cases (β=.22, SE=.07, p=.002). Search terms “whooping cough” (β=.85, SE=.12, p<.001), “cough” (β=.62, SE=.21, p=.004) and “100 day cough” (β=1.05, SE=.11, p<.001) all showed significant positive associations with Pertussis cases. Due to a high degree of correlation between “whooping cough” and “100 day cough” (r=.97), only “whooping cough” was included in the multivariable analysis on the basis of its greater average RSV across the study period (13 vs 2, respectively). The multivariable analysis revealed a significant association between the search term “measles” and Measles cases (β=.22, SE=.10, p=.03). Pertussis cases were significantly positively associated with “whooping cough” (β=.89, SE=.21, p<.001) and “cough” (β=2.56, SE=.45, p<.001).
Joinpoint analysis (see Table 6) revealed the most significant increase in trend for Measles cases at week 42, whereas the significant joinpoint for “measles” searching was identified earlier, at week 39. For Pertussis cases, the joinpoint was found at week 45, as was the joinpoint for “whooping cough”, but for “cough” the significant joinpoint was identified at week 35.
We explored the relationship between OHIS and case data for Measles and Pertussis to assess whether the use of key search terms could support public health surveillance or even predict outbreaks of both diseases. Our analyses strongly suggest that GT provides predictive utility for increases in Measles and Pertussis cases in advance of diagnosis, based on disease and symptom searching. For Measles cases, the search term “measles” significantly predicted subsequent cases 1 week later, and searching for “spots” offered weak predictive utility for cases 3 weeks later. For Pertussis cases, the search term “Pertussis” showed a less robust association with Pertussis cases; this is likely attributable to it being a less well-known term in the community. The more informal terms “whooping cough” and “100 day cough” showed a much stronger predictive relationship, but the strongest association was found for “cough”. Increased searching for “cough” was significantly predictive of Pertussis cases 3 weeks later and remained the most significant predictor of cases in multivariable analysis.
There are limitations that need to be considered. Firstly, other events, such as news reports, may have impacted searching trends, particularly in the context of Measles and Pertussis outbreaks; however, time series analysis in this study suggests that OHIS was occurring before the diagnostic and disease notification process. Secondly, this study only used Google searching, missing out on searching using other platforms; however, Google remains the most frequently used search engine in the UK (93.69% of devices)30. Thirdly, calculations applied to obtain RSVs are not publicly available and applied algorithms may obscure the observed trends; however, previous evidence has shown the data to accurately predict other infectious disease outbreaks18 and be comparable to the United States Centers for Disease Control and Prevention in predicting the timing of influenza outbreaks in the US21. Currently, the provision of location-specific searching provided by GT is inadequate for the UK, so data pertain to the national level only. Being able to divide search data geographically, perhaps by county or Integrated Care Board, would enhance the predictive utility of GT and elucidate the locations of cluster outbreaks that require urgent response. This would also allow for the inclusion of vernacular and terminology specific to different geographies. Finally, this study is retrospective; protocols using GT for health surveillance and outbreak monitoring in real time need to be developed and tested.
This is the first study to explore the predictive utility of GT for Measles and Pertussis in England and Wales; however, findings are consistent with previous evidence supporting the health surveillance utility of OHIS data. Samaras et al. employed a gamma distribution model showing how OHIS predicts outbreaks of Scarlet Fever 5 weeks before the peak using symptoms and scarlet fever associated search terms31. Furthermore, GT data were utilised by Prasanth et al. alongside modelling techniques to accurately forecast COVID-19 incidence, cumulative cases, and deaths at the national level during the pandemic32. GT data have also been used to predict Norovirus cases in the UK, and did so more accurately than other real-time search data sources, including Wikipedia33. Internationally, GT data have been used to predict infectious disease outbreaks at more local levels, forecasting malaria, dengue fever, chikungunya, and enteric fever outbreaks in two districts in India34. In the US, GT has been shown to accurately predict HIV case numbers18 and mpox outbreaks by state35 as well as influenza cases and hospitalisations36.
In addition to changes in trends for disease cases reflecting earlier changes observed in OHIS, the converse was also shown in our time series analysis, where OHIS for Measles and Pertussis was subsequently predicted by increases in cases. This is likely attributable firstly to increasing cases over time and secondly to those receiving diagnoses beginning to search for disease-related or symptom management information37. This is an important finding illustrating the need for the provision of high-quality, evidence-based public health information to support communities with transmission prevention techniques and healthcare seeking guidance, and to prevent the spread and impact of misinformation16. Additionally, it is expected that subsequent engagement with Measles and Pertussis topics takes place via alternative digital platforms. Mahroum et al. proposed a stimulus-awareness-activism framework, in which surges in OHIS are followed by further online and offline discourse; this has been evidenced by increased Twitter/X discussion of chikungunya following outbreaks38.
Not all symptom searching offered the same predictive utility; for example, “fever” searches offered little predictive utility for Measles cases. This is an important finding, as it may reflect how different symptoms present at different times across the course of the disease and possibly how more troubling symptoms lead to higher levels of searching39. Future work exploring GT as a health surveillance tool must identify the symptoms for which searching most accurately associates with case data.
Alongside increasing vaccine uptake in the community for Measles and Pertussis, additional real-time surveillance tools that increase the likelihood of early identification of rising Measles and Pertussis cases are of critical importance. This study suggests OHIS data, as provided by GT, can work as an alternative or adjunct to more traditional health surveillance systems (Royal College of General Practitioners Weekly Returns, NOIDs etc.). Early detection allows for targeted interventions, preventing further transmission, protecting vulnerable populations in the community, and reducing mortality and morbidity10.
Our findings suggest that trends in online health information seeking for Measles and Pertussis precede trends in cases in England and Wales, indicating that GT may provide an additional, real time data source with predictive utility in detecting outbreaks of Measles and Pertussis. Further work is required to explore how GT data can be used in conjunction with other, traditional health surveillance systems to monitor or even predict disease outbreaks, to better target public health interventions.
No ethical approval or consent was required for this study.
Figshare.com. Measles and Pertussis case and searching data.
https://doi.org/10.6084/m9.figshare.27014038.v1 (ref. 40).
The project contains the following underlying data:
“MeaslesPertussisDataset”. (Measles and Pertussis case and Google Trends symptom search data).
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
TS and CM devised the study. TS retrieved all data and ran all analyses. TS led the manuscript writing with contributions from CM. Both authors approved the final version.
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Evaluative studies, Vaccines, Vaccination strategies, Health Economics, Preventive Medicine, Public Health, Epidemiology, Influenza Surveillance, Syndromic Surveillance methods
Version 1: 07 Oct 24 (1 invited reviewer)