Keywords
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome prevalence; Population bioresource; Electronic health records; Post-viral fatigue syndrome
Progress in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) research is slowed by reliance on relatively small-scale studies whose results are often not replicated. Progress could be accelerated by analysing large population-scale resources, such as UK Biobank (UKB), which provide extensive phenotype and genotype data for both ME/CFS cases and controls.
Here, we analysed the overlap and discordance among four UKB-defined ME/CFS cohorts, and additional questionnaire data when available.
A total of 5,354 UKB individuals were linked to at least one piece of evidence of ME/CFS, a higher proportion (1.1%) than most prevalence estimates. Only a third (36%; n=1,922) had 2 or more pieces of evidence for ME/CFS, in part due to data missingness. For the same UKB participant, ME/CFS status defined by ICD-10 (International Classification of Diseases, Tenth Revision) code G93.3 (Post-viral fatigue syndrome) was most likely to be supported by another data type (72%); ME/CFS status defined by Pain Questionnaire responses was least likely to be supported (43%), in part due to data missingness.
We conclude that ME/CFS status in UKB, and potentially other biobanks, is best supported by multiple, and not single, lines of evidence. Finally, we raise the estimated ME/CFS prevalence in the UK to 410,000 using the most consistent evidence for ME/CFS status, and accounting for those who had no opportunity to participate in UKB due to being bed- or house-bound.
Myalgic Encephalomyelitis / Chronic Fatigue Syndrome (ME/CFS) is a highly debilitating and relatively common disease whose causes are unknown. Most of its research compares features (symptoms, molecules, cells or genes, for example) from small numbers of people with ME/CFS, against these features from others without the disease. Studies using small sample sizes have not yielded replicated discoveries that would quicken the pace of ME/CFS research towards effective therapies.
Alternatively, ME/CFS research could take advantage of population-scale bioresources, such as the UK Biobank, which holds diverse genetic, molecular, cellular, imaging and questionnaire data on nearly 500,000 individuals. This has the added advantage of being cheaper than recruiting a new cohort, and generating new data.
For ME/CFS, this approach raises the difficulty of how to best categorise people with this disease, for example by questionnaire response or through linked electronic health records. In the UK Biobank, there are four ways to categorise people with ME/CFS.
This study’s results show that just over 1% of the UK Biobank participants could be categorised as having a ME/CFS medical diagnosis. However, not all evidence for their ME/CFS diagnosis is consistent.
By cross-referencing different data in UK Biobank, we show that a participant’s ME/CFS diagnosis is best supported by two or more lines of evidence. Using the most consistent evidence, and accounting for those who – due to their illness – could not participate in the UK Biobank, we estimate that the UK’s prevalence of ME/CFS is 410,000.
Pathomechanisms of myalgic encephalomyelitis (ME; also known as chronic fatigue syndrome, CFS) are unknown. ME/CFS is a highly disabling, multi-system disorder affecting about 65 million people worldwide1, although this number is rising steeply because the symptoms of many people with Long Covid meet ME/CFS diagnostic criteria2. Approximately two-thirds of ME/CFS cases report an infection at onset of their symptoms3–5. The hallmark symptom of ME/CFS is post-exertional malaise, in which new symptoms arise, or previous ones worsen, following even minimal exertion; these symptoms are disproportionate to the level of exertion, prolonged in recovery and not immediately alleviated by rest6. Additional symptoms are fatigue that does not go away with rest, pain including headache, cognitive dysfunction and multiple sensitivities7. ME/CFS is a female-biased disease with about five times more women affected than men7. The annual economic impact of ME/CFS is $36–51 billion in the USA alone8.
Among patients’ and clinicians’ priorities for future ME/CFS research is the identification of ME/CFS diagnostic biomarkers and risk factors9. Little progress on this has been made, with biomarker and genetic studies yielding results that are mostly not replicated subsequently10,11. One recent strategy for investigating ME/CFS pathomechanisms involved an in-depth analysis of 17 ME/CFS cases, a project costing $8 million12. An alternative approach is to develop national cohorts of well-phenotyped and genotyped ME/CFS cases, such as DecodeME in the UK, a study costing £3.2 million13. A quicker and cheaper approach, however, is to analyse pre-existing data from national biobanks that have recruited hundreds of thousands of participants14. For this, data access costs can be modest, for example £9,000 for 3-year access to UK Biobank (UKB) data on approximately 500,000 people.
UKB’s open access strategy for health-related data has already yielded an academic dividend of 10,000 scientific papers, with many new discoveries being made annually15,16. Its diverse data include whole-genome sequences, brain, heart and full-body magnetic resonance imaging, blood biochemistry markers, activity monitors and online questionnaires17. UKB data was acquired using standard protocols, and has been subjected to detailed quality control procedures. Disadvantages of UKB are its healthy participation and sociodemographic biases18.
UKB has been used to identify ME/CFS genetic risk factors, using data from approximately 2,000–2,500 ME/CFS participants. However, these studies revealed few genome-wide significant findings and these failed to replicate across analyses11,19–22. This lack of discovery could be explained by low predictive power from too few cases, and/or phenotypic misclassification arising from an overly permissive case definition.
As we show below, 5,354 individuals in the UKB are linked to 1 or more pieces of evidence of ME/CFS. Future UKB studies may thus wish to draw upon data from these additional individuals, and to refine ME/CFS case definition, in order to identify genetic, molecular, imaging, or activity data separating ME/CFS cases from population controls.
Here, our aim was to aid the definition of ME/CFS caseness in UKB analyses by investigating the consistency, or otherwise, of UKB participants’ ME/CFS status, defined in four different ways. We evaluate each of the 4 cohorts in turn, in relation to their concordance or discordance with the 3 other cohorts, before interpreting which of their cohort combinations might provide most scientific value in future studies.
Public consultations on biobanking began in 2000, before funding for UK Biobank was obtained, and have continued since. UK Biobank actively engages with its participants through questionnaires, annual newsletters, follow-up studies and participation in other research activities. UK Biobank employed focus groups and a lay Advisory Panel/Community Advisory Group to advise them on study design and recruitment. No patients/participants – other than those in UK Biobank – were directly involved in this study.
Data was downloaded onto the UK Biobank’s Research Analyst Platform (UKB-RAP) on 23rd March 2023. This project was refreshed on 31st August 2023 and all analyses were performed on 20th February 2024, except for the analysis of the Experience of Pain questionnaire symptom/comorbidity responses which was conducted between 22nd–29th April 2024 when cohort numbers remained unaltered from the previous analyses. UK Biobank field identifiers are provided in Extended Data.
On the same day, at their visit to the UKB assessment centre, yet prior to the verbal interview, participants completed a touchscreen questionnaire featuring a question about their overall health: “In general how would you rate your overall health?”. Participants were able to choose the following response options: ‘Poor’, ‘Fair’, ‘Good’, ‘Excellent’, ‘Do not know’, and ‘Prefer not to Answer’.
In a touchscreen questionnaire, participants were asked “Have you been told by a doctor that you have other serious illnesses or disabilities?”23. Responses to these questions were provided to the interviewer, a nurse. Participants were then asked “You selected that you have been told by a doctor that you have other serious illnesses or disabilities, could you tell me what they are?”24. One such response option was ‘Chronic fatigue syndrome’; neither ME nor ME/CFS responses were available.
In 2019–20, 167,184 UKB participants (94,988 females, 72,196 males) completed the ‘Experience of Pain’ questionnaire (PQ) and answered the question “Have you ever been told by a doctor that you have ME/CFS?”25 Here, we chose 15 additional PQ questions that are relevant to ME/CFS symptomatology. Questions 1–12 had no missing data. The first 6 questions had possible responses ‘yes’ (considered as ‘affected’), or ‘no’, ‘do not know’, ‘prefer not to answer’ (assumed to be ‘unaffected’ for this analysis): (1) “Have you ever been told by a doctor you have had Fibromyalgia syndrome?” (or “Ever had fibromyalgia”); (2) “Have you ever been told by a doctor that you have had migraine?” (“Ever had migraine”); (3) “Are you troubled by pain or discomfort, either all the time or on and off, that has been present for more than 3 months?” (“Troubled by pain or discomfort for >3 months?”); (4) “During the past 6 months have you had headaches?” (“Headache in past six months”); (5) “Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?” (“Persistent fatigue for at least 6 months?”); and, (6) “Have you suffered from fatigue or exhaustion in the last week?” (“Fatigue or exhaustion in last week?”).
The next 6 questions were on symptom severity with possible responses ‘mild’, ‘moderate’ or ‘severe’ (considered as ‘affected’), or ‘no problem’ or ‘prefer not to answer’ (considered as ‘unaffected’); or on symptom frequency with possible responses ‘several days’, ‘more than half the days’ and ‘nearly every day’ (considered ‘affected’), and ‘not at all’ or ‘prefer not to say’ (considered ‘unaffected’): (7) “Indicate the level of severity over the past week of waking unrefreshed” (“Waking unrefreshed severity over the past week”); (8) “Indicate the level of severity over the past week of cognitive symptoms. For example, problems with memory, thinking, skills and/or concentration” (“Cognitive symptoms severity over the past week”); (9) “Please click the ONE box that best described your health TODAY: Mobility” (“Mobility problems today”); (10) “Please click the ONE box that best describes your health TODAY: Self-care” (“Self-care problems today”); (11) “Please click the ONE box that best describes your health TODAY: Usual activities” (“Problems doing usual activities”); and, (12) “Over the last 2 weeks, how often have you been bothered by feeling tired or having little energy?” (“Recent feelings of tiredness or low energy”).
The final 3 questions are related to post-exertional malaise, with possible responses: ‘yes’ (considered ‘affected’), ‘no’, ‘do not know’ or ‘prefer not to answer’ (all considered ‘unaffected’). These questions were only asked of participants who answered ‘yes’ to question (5) above: ‘Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?’, including 1,876 (69%) of those in Cohort 2. These 3 questions were: (13) “Do you get tired after minimal physical or mental exertion?” (“Tired after minimal physical or mental exertion”); (14) “Does this tiredness, weariness or fatigue go away when you rest?” (“Fatigue goes away when resting”); and, (15) “Is this tiredness, weariness or fatigue happening only because you have been exercising and/or working too much?” (“Fatigue only because of exercising or working too much”).
UKB was given access to hospital inpatient records26 for 446,974 individuals in their cohort (88%; 244,747 females, 202,227 males). Of 1,407 with a G93.3 ICD-10 code, few (6%; 87) had G93.3 listed as the main reason for admission, whereas nearly all (97%; 1,361) had G93.3 listed under a secondary code; 41 (3%) had G93.3 listed as both a main and secondary code. Also, the UKB was granted access to primary care records27 for 230,055 of its 502,364 participants (46%). Primary care codes are split into CTV3 codes and ReadV2 codes depending on the GP computer system supplier (UK Biobank 2019). In total, 1,818 individuals in the UKB have been linked to both CTV3 and ReadV2 codes (1,021 females, 797 males); these duplicate individuals are only counted once. ME/CFS referral codes were not included because referrals may not necessarily lead to a diagnosis.
The 4 ME/CFS cohorts investigated were:
Cohort 1 (C1): ‘Self-reported CFS’ – those who self-reported a clinical diagnosis of CFS in the verbal interview at a visit to the UKB assessment centre.
Cohort 2 (C2): ‘ME/CFS (PQ)’ – those who self-reported clinical diagnosis of ME/CFS in the ‘Experience of Pain’ questionnaire (PQ25; 2019–20).
Cohort 3 (C3): ‘ICD-10:G93.3’ – those with this International Classification of Diseases tenth revision (ICD-10) ‘Post-viral fatigue syndrome’ (PVFS) code in hospital inpatient records. Notably, post-exertional malaise is not required for individuals to receive this code. Further, people with ME/CFS without an infection prior to disease onset will not be given this code. Consequently, PVFS is similar to but not equivalent to ME.
Cohort 4 (C4): ‘GP code for ME/CFS’ – those with a ME/CFS code in their primary care (General Practice) records.
For UKB data fields used in these analyses, see Extended Data.
Lack of overlap between cohorts could be due to inconsistencies in evidence of an ME/CFS diagnosis. It may also be due to missing data: only subsets of UKB participants (i) completed the PQ, (ii) have undergone a hospital admission (and consequently have hospital inpatient data) and/or (iii) have available primary care records. Inconsistencies may also be due to asynchrony between data sources: (i) an ME/CFS diagnosis might not have been captured in ICD-10 codes as these date back to 1981–1998 depending on health service provider, and diagnosis may have preceded this date; (ii) recent ME/CFS diagnoses might not be present in primary care codes due to these being extracted for UKB in 2016–2017; and, (iii) a participant’s EHR may record a ME/CFS diagnosis despite them not previously self-reporting ME/CFS, when their diagnosis postdates their last self-reporting opportunity.
We also investigated apparent inconsistencies between evidence of self-reported CFS diagnosis and each participant’s self-reported overall health at entry into UKB. Our reasoning is that those with evidence of a diagnosis and yet who report ‘good’ or ‘excellent’ overall health might have misreported or been miscoded as self-reported CFS, or were recovered, misdiagnosed or in remission.
A total of 2,312 (0.46%) of 502,364 UKB participants (273,297 female, 229,067 male) self-reported a clinical diagnosis of CFS in a verbal interview at one or more visits to the UKB assessment centre. This cohort (C1) has 1,685 females (0.62%), and 627 males (0.27%), a female bias of 2.29:1. Surprisingly, 44%–69% of these failed to re-report CFS at subsequent visits to the UKB. As expected, those who self-reported a CFS diagnosis are significantly less likely to report ‘excellent’ or ‘good’ overall health, and significantly more likely to report ‘poor’ or ‘fair’ overall health, when compared to those who did not self-report a diagnosis of CFS (p<10⁻⁵; Table 1). Most of C1 reported ‘poor’ or ‘fair’ overall health. Unexpectedly, however, 28% reported ‘good’ or ‘excellent’ overall health when they first volunteered their CFS diagnosis (Table 1).
Numbers and percentages of those who self-reported CFS at visits to the UKB assessment centre, stratified by gender (top). Percentages of those who did not self-report CFS at their first visit to the UK Biobank centre are also shown (bottom). Overall health ratings are as reported at the time an individual first reported CFS. Ambiguous responses ‘Do not know’ and ‘Prefer not to answer’ (n=22) are excluded. A small number of UKB participants failed to complete this question.
C1 has variable evidence of ME/CFS from other sources, when available. Among those who completed the PQ, most (89%, n=850) provided a concordant response, answering ‘yes’ to having a clinical diagnosis of ME/CFS. However, only one-third of C1 with hospital inpatient records linked in UKB had an ICD-10:G93.3 code: 714 of 2,148 (33%; 531 females, 183 males). Further, only half (55%; n=617) of C1 have a ME/CFS code in their UKB primary care records: 431 females and 186 males. In all, over one-third (36%) of C1 have no further evidence of ME/CFS (602 females, 220 males), although this is in part due to data missingness. Thirty-nine percent of C1 (894 people) have 1 further piece of evidence, 22% (501) have 2, and 4% (95) have all 3.
Among 167,184 participants who completed the PQ, 2,720 (1.63%) are in cohort 2 (C2), having answered ‘Yes’ to “Have you ever been told by a doctor that you have ME/CFS?” Of these, 1,929 of 94,988 were female (2.0%) and 791 of 72,196 (1.1%) were male, a female bias of 1.85:1. Among those saying ‘Yes’ are 850 who are also present in C1 (see above), leaving 1,870 cases newly reported in C2 (1,299 female, 571 male).
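The sex-specific prevalences and the female bias quoted for C2 follow directly from the counts above. The short sketch below reproduces that arithmetic; the helper name `prevalence_pct` is illustrative only, and the ratio is taken between the two sex-specific prevalences rather than between raw case counts.

```python
def prevalence_pct(cases: int, total: int) -> float:
    """Return prevalence as a percentage of the denominator."""
    return 100.0 * cases / total

# Counts reported above for the PQ-defined cohort (C2).
female_cases, female_total = 1_929, 94_988
male_cases, male_total = 791, 72_196

overall = prevalence_pct(female_cases + male_cases, female_total + male_total)
female = prevalence_pct(female_cases, female_total)
male = prevalence_pct(male_cases, male_total)

# Female bias: ratio of sex-specific prevalences, not of raw case counts.
sex_ratio = female / male

print(f"overall {overall:.2f}%, female {female:.1f}%, male {male:.1f}%, "
      f"ratio {sex_ratio:.2f}:1")
# prints: overall 1.63%, female 2.0%, male 1.1%, ratio 1.85:1
```

Taking the ratio of prevalences rather than of counts matters here because more females than males completed the PQ.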
Among those in C2, UKB has access to hospital inpatient data for 2,527 (92%) and primary care code records for 1,315 (48%). Of those with available data, only 459 (18%) have an ICD-10:G93.3 code and 441 (34%) have a ME/CFS code in their primary care records. Just over half (57%, n=1,556) of C2 have no further evidence of ME/CFS (i.e., are not in C1, C3 or C4), in part due to missingness in the data. A quarter (673; 25%) have 1 further piece of evidence, 15% (396) have 2 further pieces of evidence, and 3% (95) have all 3.
Cohort C2 were significantly more likely than those in neither C1 nor C2 (‘controls’; C1ᶜ ∩ C2ᶜ) to have: been diagnosed with fibromyalgia (16-fold higher), been diagnosed with migraine (2.0-fold higher), been troubled by pain or discomfort for >3 months (1.4-fold), experienced headaches in the past 6 months (1.7-fold), experienced persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months (3.5-fold), and experienced fatigue or exhaustion in the past week (2.3-fold); all p<10⁻⁵ (Figure 1A). ME/CFS cohort C2 was also significantly more likely than controls to have experienced mild, moderate or severe problems with: unrefreshing sleep in the past week (1.6-fold higher), cognitive difficulties in the past week (1.9-fold), mobility problems today (1.9-fold), self-care problems today (3.0-fold), problems with usual activities today (2.0-fold), and frequent problems with tiredness or little energy in the last 2 weeks (1.7-fold; all p<10⁻⁵) (Figure 1B).
Higher proportions of 6 comorbidities (A) and 6 symptoms (B) in a ME/CFS cohort than among controls. Cohort 2 (C2, ME/CFS, PQ; blue) versus those answering ‘No’ in the PQ and who never self-reported CFS in the verbal interviews (‘controls’, C1ᶜ ∩ C2ᶜ; orange). Note that 2,066 individuals who answered 'I don't know' and 19 who answered 'Prefer not to answer' to the ME/CFS question in the PQ are not included among controls due to their ambiguous response. (A) C2 individuals were significantly more likely than controls to have been diagnosed with fibromyalgia (21.8% versus 1.3%); migraine (37.5% versus 18.9%); to have been troubled by pain or discomfort for >3 months (77.2% versus 55.5%); experienced headaches in the past 6 months (56.4% versus 33.2%); experienced persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months (68.9% versus 19.9%); and experienced fatigue or exhaustion in the past week (73.9% versus 32.0%); all p<10⁻⁵ (χ²-test). (B) C2 individuals were significantly more likely than controls to have experienced mild, moderate or severe problems with: unrefreshing sleep in the past week (86.0% versus 54.3%), cognitive difficulties in the past week (69.2% versus 36.4%), mobility problems today (52.2% versus 28.2%), self-care problems today (25.4% versus 8.5%), problems with usual activities today (61.3% versus 31.2%), and were more likely to experience feeling tired or having little energy in the past 2 weeks (86.1% versus 50.5%) (all p<10⁻⁵; χ²-test).
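The fold-changes quoted for Figure 1 are ratios of the C2 percentage to the control percentage. As a minimal sketch, the snippet below recomputes them from the rounded percentages given in the caption; the short labels and the `fold_change` helper are illustrative names, not part of the original analysis. Because the inputs are already rounded, a recomputed value can differ slightly from the figure quoted in the text (for fibromyalgia, the rounded percentages give a ratio somewhat above 16).

```python
# (C2 %, control %) pairs as reported in the Figure 1 caption.
reported = {
    "fibromyalgia": (21.8, 1.3),
    "migraine": (37.5, 18.9),
    "pain >3 months": (77.2, 55.5),
    "headache, past 6 months": (56.4, 33.2),
    "persistent fatigue": (68.9, 19.9),
    "fatigue, past week": (73.9, 32.0),
    "waking unrefreshed": (86.0, 54.3),
    "cognitive symptoms": (69.2, 36.4),
    "mobility": (52.2, 28.2),
    "self-care": (25.4, 8.5),
    "usual activities": (61.3, 31.2),
    "tired or low energy": (86.1, 50.5),
}

def fold_change(case_pct: float, control_pct: float) -> float:
    """Ratio of the proportion affected in cases to that in controls."""
    return case_pct / control_pct

folds = {name: fold_change(*pcts) for name, pcts in reported.items()}

for name, fold in folds.items():
    print(f"{name:26s} {fold:.1f}-fold")
```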
Among cohort C2 were 1,876 (69.0%) who answered ‘yes’ to ‘Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?’ These UKB participants then answered 3 questions that are relevant to post-exertional malaise. They were, as expected, significantly more likely than those in (C1ᶜ ∩ C2ᶜ) controls to report getting tired after minimal physical or mental exertion (1.6-fold higher), and significantly less likely to report that their tiredness goes away when resting (0.53-fold lower), or that their tiredness was only due to over-exertion (0.42-fold lower; all p<10⁻⁵; Figure 2). Just under two-thirds (394 of 617; 64%) of those in both C1 and C2 answered these 3 questions in a manner consistent with post-exertional malaise (i.e., ‘Yes’, ‘No’, ‘No’; Figure 2), higher than the 57% (1,073 of 1,876) in C2 who answered these in the same way. These questions are thus valid for further refining ME/CFS caseness.
Questions were asked in the UK Biobank Experience of Pain Questionnaire (PQ). The ME/CFS (PQ) cohort C2 (dark blue) was significantly more likely than C1ᶜ ∩ C2ᶜ controls (orange) to report becoming tired after minimal physical or mental exertion (76.1% versus 48.2%, respectively), significantly less likely to report that their tiredness goes away when resting (25.2% versus 47.6%, respectively), and significantly less likely to report that their tiredness is only due to over-exertion (6.9% versus 16.6%, respectively); all p<10⁻⁵. Individuals in both cohorts 1+2 (concordant cohort: C1 and C2; C1 ∩ C2; grey) more often reported feeling tired after minimal physical or mental exertion, than those in cohort 2 but not 1 (C2 ∩ C1ᶜ; yellow; 80.6% versus 74.0%, respectively) or in cohort 1 but not 2 (C1 ∩ C2ᶜ; light blue; 80.6% versus 65.0%, respectively). Similarly, those in C1 ∩ C2 (grey) were less likely than those in (C2 ∩ C1ᶜ), or (C1 ∩ C2ᶜ) (20.4% versus 27.5% or 31.8%, respectively) to report that their tiredness went away when resting. The discordant cohort (C1 not C2: C1 ∩ C2ᶜ; light blue; i.e., self-reported CFS in verbal interview but answered ‘No’ or 'do not know' or 'prefer not to answer' in the PQ) was significantly more likely to report that their tiredness was due only to over-exertion than the concordant cohort (C1 ∩ C2; grey): 12.7% versus 5.8%, respectively. Asterisks denote significant differences between cohorts (χ² test; p<10⁻⁵ ***; p<0.01 **; p<0.05 *; n/s, not significant at p<0.05).
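Comparisons of this kind rest on a χ² test of a 2×2 table (cohort by response). The sketch below is one way such a test could be computed with the standard library only. It assumes a Pearson χ² without continuity correction, which may not match the exact procedure used for Figure 2, and the function name `chi2_2x2` is illustrative. The usage example compares post-exertional-malaise-consistent answers in C1 ∩ C2 (394 of 617) with those in the remainder of C2 (1,073 − 394 of 1,876 − 617), counts derived here by subtraction from the figures in the text; this particular comparison is an illustration and is not a result reported by the study.

```python
import math

def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: C1 and C2 / C2 only. Columns: PEM-consistent answers / any other pattern.
both_yes, both_total = 394, 617
c2_only_yes = 1_073 - both_yes          # derived by subtraction (assumption)
c2_only_total = 1_876 - both_total      # derived by subtraction (assumption)

stat, p = chi2_2x2(both_yes, both_total - both_yes,
                   c2_only_yes, c2_only_total - c2_only_yes)
print(f"chi2 = {stat:.2f}, p = {p:.2g}")
```

The two rows are disjoint groups, which is what the test requires; comparing C1 ∩ C2 directly against all of C2 would count the same participants twice.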
In total, 446,974 individuals (244,747 female, 202,227 male) in the UKB underwent 1 or more hospital admission and so have linked hospital inpatient codes. Of these, 1,407 (0.31%) have an ICD-10 code for ‘Post-viral fatigue syndrome’ (G93.3); 1,048 are female (0.43%) and 359 male (0.18%), giving a female-to-male ratio of 2.4:1.
Among those in cohort 3 (C3), 51% (714) are also in C1 (‘self-reported CFS’); 88% (459) are in C2 (Pain Questionnaire), and 60% (412) have a Primary Care code for ME/CFS (i.e., are in cohort C4; see below), when data is available. Altogether, among 1,407 in C3 (with an ICD-10:G93.3 code), 398 (28%) lack further evidence of ME/CFS, 528 (38%) have 1 further piece of evidence, 386 (27%) have 2 further pieces of evidence and 95 (7%) have 3 further pieces of evidence.
Among all UKB participants, 230,055 have linked primary care records, of whom 1,575 (0.68%; cohort C4) have a ME/CFS code: 1,071 have a CTV3 code, 505 have a ReadV2 code and 1 has both types of code. C4 contains 1,109 females (0.88%) and 466 males (0.45%), giving a female-to-male ratio of 1.96:1.
The clinical diagnoses of people in C4 allowed us to assess the diagnostic validity of C1–3. We defined concordances between cohorts using only those individuals for whom all relevant data was available. Concordance with C4 was highest for C2 (71%), followed by C1 (39%), and lowest for C3 (28%). Thus, despite their known clinical diagnosis in primary care, 61%, 29% and 72% of C4 participants failed to self-report CFS in the verbal interview, failed to self-report ME/CFS in the PQ, or lacked an ICD-10:G93.3 code in linked hospital inpatient data, respectively. Among 958 individuals in C4 who did not self-report CFS in the verbal interview, 32% (302) have additional evidence of ME/CFS from the PQ or ICD-10:G93.3 code.
In all, 656 (42%) of those in C4 have no further evidence of ME/CFS, 463 (29%) have 1 further piece of evidence, 361 (23%) have 2 further pieces of evidence and 95 (6%) have 3 further pieces of evidence.
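Concordance between two cohorts, as used throughout these results, is computed only among individuals for whom the second data source exists, so that missing data is not counted as disagreement. A minimal sketch of that calculation with Python sets follows; the function name, the toy participant IDs and the availability set are all hypothetical and do not correspond to UKB identifiers.

```python
def concordance(cohort_a: set, cohort_b: set, has_data_for_b: set) -> float:
    """Percentage of cohort A also in cohort B, restricted to A members whose
    data source for B is available (missingness is not treated as discordance)."""
    assessable = cohort_a & has_data_for_b
    if not assessable:
        return float("nan")
    return 100.0 * len(assessable & cohort_b) / len(assessable)

# Toy example with hypothetical participant IDs.
c1 = {1, 2, 3, 4, 5, 6}            # self-reported CFS
c3 = {2, 3, 9}                     # ICD-10:G93.3 in inpatient records
has_inpatient = {1, 2, 3, 4, 9, 10}

# 4 C1 members have inpatient data; 2 of them carry the code.
print(concordance(c1, c3, has_inpatient))  # 50.0

# The same ratio applied to counts reported for C1 versus C3.
print(round(100 * 714 / 2_148))  # 33
```

Dividing by all of `cohort_a` instead of `assessable` would understate concordance whenever linkage is incomplete, which is why the "no further evidence" percentages in the text are flagged as partly reflecting missingness.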
The prevalence of ME/CFS in C1-4 varied between 0.31% and 1.63% for both sexes combined, 0.18%-1.1% for males, and 0.43%-2.0% for females (Table 2). In the UKB, there are 5,354 individuals with 1 or more pieces of evidence for ME/CFS: 3,432 (64%) have 1 piece of evidence, 1,279 (24%) have 2 pieces of evidence, 548 (10%) have 3 pieces of evidence and 95 (2%) have all 4 pieces of evidence (Table 3).
Among 5,354 individuals in the UKB with at least 1 piece of ME/CFS evidence: 3,432 have 1 piece of evidence, 1,279 have 2 pieces of evidence, 548 have 3 pieces of evidence and 95 have all 4 pieces of evidence. Each of the 5,354 participants is counted once only in this table.
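The tally of participants by number of lines of evidence can be sketched as a count of how many cohorts each participant appears in. The example below uses toy cohorts with hypothetical IDs, then applies the same percentage arithmetic to the totals reported above; the function names are illustrative.

```python
from collections import Counter

def evidence_counts(*cohorts: set) -> Counter:
    """Map each participant ID to the number of cohorts it appears in."""
    counts = Counter()
    for cohort in cohorts:
        counts.update(set(cohort))
    return counts

def distribution(counts: Counter) -> dict:
    """Return {lines of evidence: (participants, rounded % of all with >= 1)}."""
    tally = Counter(counts.values())
    total = sum(tally.values())
    return {k: (n, round(100 * n / total)) for k, n in sorted(tally.items())}

# Toy cohorts C1..C4 with hypothetical participant IDs.
toy = evidence_counts({1, 2, 3}, {2, 3, 4}, {3}, {3, 5})
print(distribution(toy))  # {1: (3, 60), 2: (1, 20), 4: (1, 20)}

# The reported tallies, checked with the same arithmetic.
reported = {1: 3_432, 2: 1_279, 3: 548, 4: 95}
total = sum(reported.values())
two_or_more = total - reported[1]
print(total, two_or_more, round(100 * two_or_more / total))  # 5354 1922 36
```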
Of 5,354 individuals in the UKB with evidence of a ME/CFS diagnosis, two-thirds (3,432; 64%) have only 1 piece of evidence. Also, each of the four pieces of evidence for ME/CFS is not strongly supported by any other (Figure 3). The exceptions are the ME/CFS-PQ cohort (C2), whose evidence supports membership of each of the 3 other cohorts at >70%, when complete data is available, and the ICD10:G93.3 cohort (C3) of whom 72% have supporting evidence from other data sources. Conversely, membership of each of C1, C3 or C4 poorly supports C2 membership. Here we speculate why there are concordances and discordances among the 4 cohorts including with respect to likely biases in self-reported and electronic health record data.
This considers only individuals with available data (i.e., no missingness). Each bar indicates the percentage of the cohort (e.g. C1 – left) who are also members of a second cohort (e.g. C2 – light blue), accounting for missing data (i.e., only considering those with linked data required for specifying C1 and C2 membership). A total of 850 are members of both C1 and C2 cohorts. Among those for whom data was available, this is 89% of C1 members, and 31% of all C2 members, a discrepancy that could reflect participation bias in completion of the Pain Questionnaire and/or that its completion was approximately 10 years after the baseline questionnaire. Unlike other values, the ‘No further data’ values (dark blue) do not account for data missingness. Significance testing was not employed due to data overlap and missingness between cohorts.
Membership of cohorts C1–4 will be inflated when ‘Chronic fatigue’ has been wrongly conflated with ‘Chronic fatigue syndrome’ by clinicians or UKB participants and nurses. ‘Chronic Fatigue’ occurs at much higher prevalence than ‘Chronic Fatigue Syndrome’28,29 yet, importantly, does not require hallmark ME/CFS symptoms such as post-exertional malaise. This conflation may explain why, in one study, only about one-third of GP-diagnosed ME/CFS cases met a more stringent ME/CFS case definition30. Membership will also be inflated, relative to the general UK population, due to UKB’s sociodemographic biases including age, ethnicity and gender16,18 which all favour increased rates of ME/CFS diagnosis31. Cohort 2 membership was also inflated due to participation bias: those in C1 were significantly more likely to complete the subsequent optional online PQ than others (41% versus 33%, respectively; p<10⁻⁵).
Cohort C1 (n=2,312), without further filtering, does not have strong validity for ME/CFS research: it has unexpectedly high levels of individuals reporting ‘good’ or ‘excellent’ overall health (28%), many (44%–69%) fail to re-report CFS at subsequent visits to the UKB, 11% (n=105) who also completed the PQ provided a discordant response when asked about a clinical ME/CFS diagnosis, and there were high levels of discordance with hospital inpatient and primary care data. It is thus plausible that some within this cohort have mis-reported a diagnosis of CFS, been self-diagnosed (rather than receiving a clinical diagnosis), been diagnosed with ‘Chronic Fatigue’ rather than ME/CFS, been misdiagnosed with ME/CFS previously and since received an alternative appropriate diagnosis, or CFS has been erroneously recorded during the verbal interview. Previous analyses that used C1 are thus called into question. For example, genetic studies of ME/CFS, which to be well-powered require a homogeneous population, have not yielded consistent findings using this cohort20,32–36. These studies also used up to 3,042 UKB participants as controls who nevertheless have evidence of ME/CFS (i.e., those in C2–4, but not C1). These individuals would have been included as controls in previous ME/CFS analyses, likely weakening their power to detect true associations. We recommend filtering C1 cases by removing (i) those with ‘good’ or ‘excellent’ overall health ratings, and (ii) those who provided a discordant response in the PQ. This strategy was employed in a separate analysis37 which resulted in a larger yield of ME/CFS-associated blood traits.
Cohort C2 (n=2,720) might be considered to have strong validity for ME/CFS research because it is defined by the explicit question “Have you ever been told by a doctor that you have ME/CFS?” Also, those in C2 reported levels of fatigue, muscle pain, headaches, unrefreshed sleep and cognitive difficulties, and a comorbid fibromyalgia diagnosis, similar to those reported in a separate large-scale UK study covering adults of all ages4. Nevertheless, for two reasons we caution against C2 being used for ME/CFS research without further refinement. First, nearly one-third (844; 31.0%) of C2 did not report ‘persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months’ (p120114, Extended Data), which is a core symptom for ME/CFS diagnosis6. Second, not everyone in C2 appears to experience post-exertional malaise: 448 (24%) of this cohort do not report ‘feeling tired after minimal physical or mental exertion’ (p120117, Extended Data), 472 (25%) report that their ‘tiredness goes away when resting’ (p120115, Extended Data) and 130 (6.9%) report their ‘tiredness is only due to over-exertion’ (p120116, Extended Data). Overall, less than half (43%) of those in C2 are also in one or more of the other 3 cohorts. Thus, one possible strategy for filtering C2 is to only consider those who provided responses consistent with post-exertional malaise: namely the 1,073 individuals who answered ‘yes’, ‘no’, ‘no’ to p120117, p120115, p120116, respectively. Nevertheless, these three questions may not effectively define post-exertional malaise because they only capture ‘tiredness’, as opposed to a flare up of ‘all symptoms’, they overlook ‘new symptoms’ which are characteristic of post-exertional malaise6, and they do not account for people with ME/CFS pacing (e.g., ‘resting’ and not ‘over-exerting’). Otherwise, a more permissive approach would be to include only those who answered ‘yes’ to p120117 (n=1,428).
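The two candidate filters for C2 described above can be expressed as simple predicates over the three post-exertional malaise fields. The sketch below assumes each participant is a dictionary keyed by the field identifiers quoted in the text, with answers stored as the strings 'yes' and 'no'; that coding, the record layout and the function names are assumptions made for illustration, since UKB stores coded values.

```python
# Answers expected for a profile consistent with post-exertional malaise (PEM),
# keyed by the UKB field identifiers quoted in the text.
PEM_PATTERN = {
    "p120117": "yes",  # tired after minimal physical or mental exertion
    "p120115": "no",   # tiredness goes away when resting
    "p120116": "no",   # tiredness only due to over-exertion
}

def is_pem_consistent(record: dict) -> bool:
    """Strict filter: all three answers match the PEM-consistent pattern."""
    return all(record.get(field) == answer
               for field, answer in PEM_PATTERN.items())

def is_permissive_case(record: dict) -> bool:
    """Permissive filter: requires only 'yes' to tiredness after minimal exertion."""
    return record.get("p120117") == "yes"

# Hypothetical C2 records; the string coding of answers is an assumption.
records = [
    {"eid": "A", "p120117": "yes", "p120115": "no", "p120116": "no"},
    {"eid": "B", "p120117": "yes", "p120115": "yes", "p120116": "no"},
    {"eid": "C", "p120117": "no", "p120115": "no", "p120116": "no"},
    {"eid": "D"},  # no persistent fatigue reported, so the questions were not asked
]

strict = [r["eid"] for r in records if is_pem_consistent(r)]
permissive = [r["eid"] for r in records if is_permissive_case(r)]
print(strict, permissive)  # ['A'] ['A', 'B']
```

Participants who were never asked the three questions fail both filters, which mirrors the fact that these questions were put only to those reporting persistent fatigue.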
The largest intersection between cohorts is between C1 and C2 (n=850; Table 3). These individuals are slightly more likely to report ME/CFS symptoms than those in C2 who did not self-report CFS at baseline (i.e., not in C1): they are significantly more likely to be tired after minimal physical or mental exertion (81% vs 74%), and less likely to report that their tiredness goes away after rest (20% vs 27%; Figure 2). Further requiring participants to have an ICD-10:G93.3 or ME/CFS primary care code reduces their number considerably, to 187 or 162, respectively. Whilst this yields cohorts supported by health record data, case numbers dwindle (to approximately 20–40) when further intersected with UKB participants with plasma protein, brain imaging or accelerometry data, for example38–40, which would substantially weaken statistical power in such studies. Requiring participants to have at least 1 piece of self-report data (C1 or C2) and at least 1 piece of EHR data (C3 or C4) boosts the case number to 1,398, a quarter (26%) of all UKB ME/CFS (C1–4) cases.
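The cohort intersections discussed above are plain set operations on participant identifiers. The following Python sketch shows the logic under stated assumptions: the cohorts are held as sets of integer IDs, the function name is hypothetical, and the toy IDs are invented for illustration.

```python
from typing import Set


def self_report_and_ehr(c1: Set[int], c2: Set[int], c3: Set[int], c4: Set[int]) -> Set[int]:
    """Participants in at least one self-report cohort (C1 or C2)
    and at least one EHR cohort (C3 or C4)."""
    return (c1 | c2) & (c3 | c4)


# Toy participant IDs (not UKB data).
c1 = {1, 2, 3}
c2 = {2, 3, 4}
c3 = {3, 5}
c4 = {1, 6}

multi_evidence = self_report_and_ehr(c1, c2, c3, c4)  # {1, 3}
c1_c2_with_icd = c1 & c2 & c3                         # stricter: C1, C2 and C3 together -> {3}
```

The union-then-intersect form is more permissive than a three-way intersection such as `c1 & c2 & c3`, which is why it retains more cases.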
Cohort C3 has the most support from other cohorts (72%). Nevertheless, using C3 without further filtering is not advised because some of its members may have ‘Post viral Fatigue Syndrome’, which does not require post-exertional malaise, rather than ME/CFS, which does. Thus, we recommend restricting C3 to the 1,009 individuals who also fall within one of the other cohorts: C1, C2 or C4. A further caveat is that C3 will likely not include the one-third of people with ME/CFS who do not have an infection prior to the onset of their symptoms7. It is also the smallest cohort: 26% of all individuals in C1–4 (Table 2); this fraction reduces to 19% when the recommended filters are applied.
As a clinically-defined cohort, C4 has strong validity and numbers are relatively high (n=1,575). Nevertheless, we recommend removing those with discordant PQ data, specifically the ~29% of participants (n=177) who completed the PQ yet failed to answer ‘Yes’ to the question: “Have you ever been told by a doctor you have ME/CFS?” Doing so reduces this cohort to about 1,120 (of 230,055 with GP records), a UKB prevalence of 0.49%. Extrapolating this percentage to the UK population41 predicts 330,000 people with a life-time risk of an ME/CFS diagnosis. This is lower than an estimate of 404,000 that assumed equal access to diagnosis, which is far from being achieved31.
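The arithmetic behind these figures can be checked with a short Python sketch. Two inputs are assumptions rather than values stated in the text: that the ~29% discordance rate among PQ completers is applied to the whole of C4 (which is consistent with the quoted "about 1,120"), and a UK population of roughly 67.6 million, since the text cites a population reference without quoting the number.

```python
def prevalence_percent(cases: int, population: int) -> float:
    """Prevalence expressed as a percentage."""
    return 100.0 * cases / population


c4_total = 1_575
discordant_rate = 0.29  # ~29% of C4 members who completed the PQ (n=177)

# ASSUMPTION: the completers' discordance rate is applied to the whole cohort.
c4_filtered = round(c4_total * (1 - discordant_rate))  # close to the "about 1,120" in the text

# Figures quoted in the text: about 1,120 cases among 230,055 with GP records.
ukb_prevalence = prevalence_percent(1_120, 230_055)  # roughly 0.49%

# ASSUMPTION: illustrative UK population; the source's own reference may differ.
ASSUMED_UK_POPULATION = 67_600_000
extrapolated_cases = ukb_prevalence / 100 * ASSUMED_UK_POPULATION  # roughly 330,000
```

Changing `ASSUMED_UK_POPULATION` shifts the extrapolated count proportionally, so the result should be read as an order-of-magnitude check on the reported 330,000.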
We recommend that control cohorts exclude C1–4 members, as well as those who provided an ambiguous response in the PQ or who have ME/CFS referral codes in their primary care records.
This analysis benefits from the diversity of data types being compared for the same population-scale cohort, UKB. Other biobanks, such as FinnGen and the Estonian Biobank, link to ME via ICD-10 codes only42,43. The UKB ME cohort is the largest globally. The All of Us CFS cohort44 might appear to be larger (at 8,600 participants) but 93% of this cohort are linked to chronic fatigue, unspecified (ICD-10-CM code R53.82) rather than to ME or PVFS. This study’s weaknesses are that missing UKB data is assumed to be missing completely at random, and that the healthy participation bias inherent in the UKB18 will result in disproportionately few UKB participants who were severely or even moderately affected by ME/CFS. Accounting for the estimated one-quarter of people with ME/CFS who had no opportunity to participate in UKB due to being bed- or house-bound45, our estimated ME/CFS prevalence in the UK then rises from 330,000 to 410,000 (0.6%).
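The text does not spell out how the one-quarter adjustment is applied. One reading that reproduces the reported figure, assumed here for illustration only, is to scale the unadjusted estimate up by one-quarter. The function name and the rounding to the nearest 10,000 are likewise assumptions.

```python
def adjust_for_non_participation(estimate: float, uplift_fraction: float) -> float:
    """Scale an estimate up by a fraction to account for people who could not take part.

    ASSUMPTION: the adjustment is multiplicative (estimate * (1 + fraction));
    the source does not state the formula it used.
    """
    return estimate * (1 + uplift_fraction)


unadjusted = 330_000
adjusted = adjust_for_non_participation(unadjusted, 0.25)  # 412500.0
reported_scale = round(adjusted, -4)                       # 410000.0, matching the text
```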
UK Biobank has approval from the North West Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB) (2011, renewed 2016 and 2021). This approval means that researchers do not require separate ethical clearance and can operate under the RTB approval.
The basis for consent in UK Biobank rests on participants' explicit and informed consent. UK Biobank uses “legitimate interests” as the primary lawful basis on which to process personal data under the UK GDPR.
UK Biobank data is available upon application via their website46 https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. UK Biobank research fields are provided in the Main Text and in Extended Data Tables 1–2, which are freely available from the Open Science Framework47: https://osf.io/rp5ca.
Gemma Louise Samms: Conceptualization, Data curation, Investigation, Writing – original draft, Writing – review & editing. Chris P. Ponting: Conceptualization, Funding acquisition, Supervision, Writing – original draft, Writing – review & editing.
This research has been conducted using the UK Biobank Resource under Application Number 76173. This work uses data provided by patients and collected by the NHS as part of their care and support. Our sincere thanks to the participants and researchers of the UK Biobank who made this effort possible. GLS was funded by a PhD studentship from ME Research UK. For this project, CPP was jointly funded by NIHR and MRC (grant number MC_PC_20005).
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genetic Epidemiology
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: My research explores the role of mitochondrial DNA (mtDNA) variation in complex traits as well as the impact of mtDNA haplogroup background in the diagnosis of primary mitochondrial diseases (PMD). My group's findings have shown that ME/CFS is not a PMD, as ME/CFS patients do not exhibit the pathogenic mtDNA mutations typically seen in those disorders. However, our work also indicates that common mtDNA variants present in the general population may influence ME/CFS.
Invited Reviewers | 1 | 2 |
---|---|---|
Version 1 (28 Apr 25) | read | read |