Research Article

Defining a High-Quality Myalgic Encephalomyelitis/Chronic Fatigue Syndrome cohort in UK Biobank

[version 1; peer review: 2 approved]
PUBLISHED 28 Apr 2025

Abstract

Background

Progress in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) research is being slowed by the relatively small-scale studies being performed whose results are often not replicated. Progress could be accelerated by analyses of large population-scale projects, such as UK Biobank (UKB), which provide extensive phenotype and genotype data linked to both ME/CFS cases and controls.

Methods

Here, we analysed the overlap and discordance among four UKB-defined ME/CFS cohorts, and additional questionnaire data when available.

Results

A total of 5,354 UKB individuals were linked to at least one piece of evidence of ME/CFS, a higher proportion (1.1%) than most prevalence estimates. Only a third (36%; n=1,922) had 2 or more pieces of evidence for ME/CFS, in part due to data missingness. For the same UKB participant, ME/CFS status defined by ICD-10 (International Classification of Diseases, Tenth Revision) code G93.3 (Post-viral fatigue syndrome) was most likely to be supported by another data type (72%); ME/CFS status defined by Pain Questionnaire responses was least likely to be supported (43%), in part due to data missingness.

Conclusions

We conclude that ME/CFS status in UKB, and potentially other biobanks, is best supported by multiple, and not single, lines of evidence. Finally, we raise the estimated ME/CFS prevalence in the UK to 410,000 using the most consistent evidence for ME/CFS status, and accounting for those who had no opportunity to participate in UKB due to being bed- or house-bound.

Plain Language Summary


Myalgic Encephalomyelitis / Chronic Fatigue Syndrome (ME/CFS) is a highly debilitating and relatively common disease whose causes are unknown. Most research on it compares features (symptoms, molecules, cells or genes, for example) from small numbers of people with ME/CFS against the same features from others without the disease. Studies using small sample sizes have not yielded replicated discoveries that would quicken the pace of ME/CFS research towards effective therapies.

Alternatively, ME/CFS research could take advantage of population-scale bioresources, such as the UK Biobank, which holds diverse genetic, molecular, cellular, imaging and questionnaire data on nearly 500,000 individuals. This has the added advantage of being cheaper than recruiting a new cohort, and generating new data.

For ME/CFS, this approach raises the difficulty of how to best categorise people with this disease, for example by questionnaire response or through linked electronic health records. In the UK Biobank, there are four ways to categorise people with ME/CFS.

This study’s results show that just over 1% of the UK Biobank participants could be categorised as having a ME/CFS medical diagnosis. However, not all evidence for their ME/CFS diagnosis is consistent.

By cross-referencing different data in UK Biobank, we show that a participant’s ME/CFS diagnosis is best supported by two or more lines of evidence. Using the most consistent evidence, and accounting for those who – due to their illness – could not participate in the UK Biobank, we estimate that the UK’s prevalence of ME/CFS is 410,000.

Keywords

Myalgic Encephalomyelitis/Chronic Fatigue Syndrome prevalence; Population bioresource; Electronic health records; Post-viral fatigue syndrome

Introduction

Pathomechanisms of myalgic encephalomyelitis (ME; also known as chronic fatigue syndrome, CFS) are unknown. ME/CFS is a highly disabling, multi-system disorder affecting about 65 million people worldwide1, although this number is rising steeply because the symptoms of many people with Long Covid meet ME/CFS diagnostic criteria2. Approximately two-thirds of ME/CFS cases report an infection at onset of their symptoms3–5. The hallmark symptom of ME/CFS is post-exertional malaise, in which new symptoms arise, or previous ones worsen, following even minimal exertion; these symptoms are disproportionate to the level of exertion, prolonged in recovery and not immediately alleviated by rest6. Additional symptoms are fatigue that does not go away with rest, pain including headache, cognitive dysfunction and multiple sensitivities7. ME/CFS is a female-biased disease with about five times more women affected than men7. The annual economic impact of ME/CFS is $36–51 billion in the USA alone8.

Among patients’ and clinicians’ priorities for future ME/CFS research is the identification of ME/CFS diagnostic biomarkers and risk factors9. Little progress on this has been made, with biomarker and genetic studies yielding results that are mostly not replicated subsequently10,11. One recent strategy for investigating ME/CFS pathomechanisms involved an in-depth analysis of 17 ME/CFS cases, a project costing $8 million12. An alternative approach is to develop national cohorts of well-phenotyped and genotyped ME/CFS cases, such as DecodeME in the UK, a study costing £3.2 million13. A quicker and cheaper approach, however, is to analyse pre-existing data from national biobanks that have recruited hundreds of thousands of participants14. For this, data access costs can be modest, for example £9,000 for 3-year access to UK Biobank (UKB) data on approximately 500,000 people.

UKB’s open access strategy for health-related data has already yielded an academic dividend of 10,000 scientific papers, with many new discoveries being made annually15,16. Its diverse data include whole-genome sequences, brain, heart and full-body magnetic resonance imaging, blood biochemistry markers, activity monitors and online questionnaires17. UKB data was acquired using standard protocols, and has been subjected to detailed quality control procedures. Disadvantages of UKB are its healthy participation and sociodemographic biases18.

UKB has been used to identify ME/CFS genetic risk factors, using data from approximately 2,000–2,500 ME/CFS participants. However, these studies revealed few genome-wide significant findings and these failed to replicate across analyses11,19–22. This lack of discovery could be explained by low statistical power from too few cases, and/or phenotypic misclassification arising from an overly permissive case definition.

As we show below, 5,354 individuals in the UKB are linked to 1 or more pieces of evidence of ME/CFS. Future UKB studies may thus wish to draw upon data from these additional individuals, and to refine ME/CFS case definition, in order to identify genetic, molecular, imaging, or activity data separating ME/CFS cases from population controls.

Here, our aim was to aid the definition of ME/CFS caseness in UKB analyses by investigating the consistency, or otherwise, of UKB participants’ ME/CFS status, defined in four different ways. We evaluate each of the 4 cohorts in turn, in relation to their concordance or discordance with the 3 other cohorts, before interpreting which of their cohort combinations might provide most scientific value in future studies.

Methods

Public and Patient Involvement

Public consultations on biobanking began in 2000, before funding for UK Biobank was obtained, and have continued since. UK Biobank actively engages with its participants through questionnaires, annual newsletters, follow-up studies and participation in other research activities. UK Biobank employed focus groups and a lay Advisory Panel/Community Advisory Group to advise them on study design and recruitment. No patients/participants – other than those in UK Biobank – were directly involved in this study.

Data acquisition

Data was downloaded onto the UK Biobank’s Research Analysis Platform (UKB-RAP) on 23rd March 2023. This project was refreshed on 31st August 2023 and all analyses were performed on 20th February 2024, except for the analysis of the Experience of Pain questionnaire symptom/comorbidity responses, which was conducted between 22nd–29th April 2024 when cohort numbers remained unaltered from the previous analyses. UK Biobank field identifiers are provided in Extended Data.

Baseline measurements

On the same day, at their visit to the UKB assessment centre, yet prior to the verbal interview, participants completed a touchscreen questionnaire featuring a question about their overall health: “In general how would you rate your overall health?”. Participants were able to choose the following response options: ‘Poor’, ‘Fair’, ‘Good’, ‘Excellent’, ‘Do not know’, and ‘Prefer not to Answer’.

In a touchscreen questionnaire, participants were asked “Have you been told by a doctor that you have other serious illnesses or disabilities?”23. Responses to these questions were provided to the interviewer, a nurse. Participants were then asked “You selected that you have been told by a doctor that you have other serious illnesses or disabilities, could you tell me what they are?”24. One such response option was ‘Chronic fatigue syndrome’; neither ME nor ME/CFS responses were available.

Pain Questionnaire (PQ)

In 2019–20, 167,184 UKB participants (94,988 females, 72,196 males) completed the PQ and answered the question “Have you ever been told by a doctor that you have ME/CFS?”25. Here, we chose 15 additional PQ questions that are relevant to ME/CFS symptomatology. Questions 1–12 had no missing data. The first 6 questions had possible responses ‘yes’ (considered as ‘affected’), or ‘no’, ‘do not know’, ‘prefer not to answer’ (assumed to be ‘unaffected’ for this analysis): (1) “Have you ever been told by a doctor you have had Fibromyalgia syndrome?” (“Ever had fibromyalgia”); (2) “Have you ever been told by a doctor that you have had migraine?” (“Ever had migraine”); (3) “Are you troubled by pain or discomfort, either all the time or on and off, that has been present for more than 3 months?” (“Troubled by pain or discomfort for >3 months?”); (4) “During the past 6 months have you had headaches?” (“Headache in past six months”); (5) “Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?” (“Persistent fatigue for at least 6 months?”); and, (6) “Have you suffered from fatigue or exhaustion in the last week?” (“Fatigue or exhaustion in last week?”).

The next 6 questions were on symptom severity with possible responses ‘mild’, ‘moderate’ or ‘severe’ (considered as ‘affected’), or ‘no problem’ or ‘prefer not to answer’ (considered as ‘unaffected’); or on symptom frequency with possible responses ‘several days’, ‘more than half the days’ and ‘nearly every day’ (considered ‘affected’), and ‘not at all’ or ‘prefer not to say’ (considered ‘unaffected’): (7) “Indicate the level of severity over the past week of waking unrefreshed” (“Waking unrefreshed severity over the past week”); (8) “Indicate the level of severity over the past week of cognitive symptoms. For example, problems with memory, thinking, skills and/or concentration” (“Cognitive symptoms severity over the past week”); (9) “Please click the ONE box that best describes your health TODAY: Mobility” (“Mobility problems today”); (10) “Please click the ONE box that best describes your health TODAY: Self-care” (“Self-care problems today”); (11) “Please click the ONE box that best describes your health TODAY: Usual activities” (“Problems doing usual activities”); and, (12) “Over the last 2 weeks, how often have you been bothered by feeling tired or having little energy?” (“Recent feelings of tiredness or low energy”).

The final 3 questions are related to post-exertional malaise, with possible responses: ‘yes’ (considered ‘affected’), ‘no’, ‘do not know’ or ‘prefer not to answer’ (all considered ‘unaffected’). These questions were only asked of participants who answered ‘yes’ to question (5) above (“Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?”), including 1,876 (69%) of those in Cohort 2. These 3 questions were: (13) “Do you get tired after minimal physical or mental exertion?” (“Tired after minimal physical or mental exertion”); (14) “Does this tiredness, weariness or fatigue go away when you rest?” (“Fatigue goes away when resting”); and, (15) “Is this tiredness, weariness or fatigue happening only because you have been exercising and/or working too much?” (“Fatigue only because of exercising or working too much”).
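The binary ‘affected’/‘unaffected’ coding described above can be sketched as a simple lookup. This is an illustrative sketch only, not the code used in the study: the function name `is_affected` and the lower-cased response strings are assumptions, and UKB itself stores these answers as coded field values rather than free text.

```python
# Responses the text treats as 'affected'. Any other response (e.g. 'no',
# 'do not know', 'prefer not to answer', 'no problem', 'not at all') is
# treated as 'unaffected'.
AFFECTED_RESPONSES = {
    "yes",
    "mild", "moderate", "severe",
    "several days", "more than half the days", "nearly every day",
}


def is_affected(response: str) -> bool:
    """Return True if a questionnaire response counts as 'affected'."""
    return response.strip().lower() in AFFECTED_RESPONSES


print(is_affected("Moderate"))              # True
print(is_affected("Prefer not to answer"))  # False
```

Treating every non-listed response as ‘unaffected’ mirrors the assumption stated in the text; a stricter analysis might instead set ambiguous responses to missing.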

UKB was given access to hospital inpatient records26 for 446,974 individuals in their cohort (88%; 244,747 females, 202,227 males). Of 1,407 with a G93.3 ICD-10 code, few (6%; 87) had G93.3 listed as the main reason for admission, with nearly all (97%; 1,361) having G93.3 listed under a secondary code; 41 (3%) had G93.3 listed as both a main and secondary code. Also, the UKB was granted access to primary care records27 for 230,055 of its 502,364 participants (46%). Primary care codes are split into CTV3 codes and ReadV2 codes depending on the GP computer system supplier (UK Biobank 2019). In total, 1,818 individuals in the UKB have been linked to both CTV3 and ReadV2 codes (1,021 females, 797 males); these duplicate individuals were counted only once. ME/CFS referral codes were not included because referrals do not necessarily lead to a diagnosis.
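Counting each individual once when they appear under two code types is an application of inclusion-exclusion. The short sketch below checks the G93.3 figures quoted above (main code, secondary code, both); the helper name `union_size` is hypothetical.

```python
def union_size(n_a: int, n_b: int, n_both: int) -> int:
    """Size of A ∪ B by inclusion-exclusion: |A| + |B| - |A ∩ B|."""
    return n_a + n_b - n_both


# 87 with G93.3 as a main code, 1,361 as a secondary code, 41 as both.
g933_total = union_size(87, 1361, 41)
print(g933_total)  # 1407
```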

Results

The 4 ME/CFS cohorts investigated were:

Cohort 1 (C1): ‘Self-reported CFS’ – those who self-reported a clinical diagnosis of CFS in the verbal interview at a visit to the UKB assessment centre.

Cohort 2 (C2): ‘ME/CFS (PQ)’ – those who self-reported clinical diagnosis of ME/CFS in the ‘Experience of Pain’ questionnaire (PQ25; 2019–20).

Cohort 3 (C3): ‘ICD-10:G93.3’ – those with this International Classification of Diseases tenth revision (ICD-10) ‘Post-viral fatigue syndrome’ (PVFS) code in hospital inpatient records. Notably, post-exertional malaise is not required for individuals to receive this code. Further, people with ME/CFS without an infection prior to disease onset will not be given this code. Consequently, PVFS is similar to but not equivalent to ME.

Cohort 4 (C4): ‘GP code for ME/CFS’ – those with a ME/CFS code in their primary care (General Practice) records.

For UKB data fields used in these analyses, see Extended Data.

Lack of overlap between cohorts could be due to inconsistencies in evidence of an ME/CFS diagnosis. It may also be due to missing data: only subsets of UKB participants (i) completed the PQ, (ii) have undergone a hospital admission (and consequently have hospital inpatient data) and/or (iii) have available primary care records. Inconsistencies may also be due to asynchrony between data sources: (i) an ME/CFS diagnosis might not have been captured in ICD-10 codes as these date back to 1981–1998 depending on health service provider, and diagnosis may have preceded this date; (ii) recent ME/CFS diagnoses might not be present in primary care codes due to these being extracted for UKB in 2016–2017; and, (iii) a participant’s EHR may record a ME/CFS diagnosis despite them not previously self-reporting ME/CFS, when their diagnosis postdates their last self-reporting opportunity.

We also investigated apparent inconsistencies between evidence of self-reported CFS diagnosis and each participant’s self-reported overall health at entry into UKB. Our reasoning is that those with evidence of a diagnosis and yet who report ‘good’ or ‘excellent’ overall health might have misreported or been miscoded as self-reported CFS, or were recovered, misdiagnosed or in remission.

Cohort 1: Self-reported CFS diagnosis

A total of 2,312 (0.46%) of 502,364 UKB participants (273,297 female, 229,067 male) self-reported a clinical diagnosis of CFS in a verbal interview at one or more visits to the UKB assessment centre. This cohort (C1) has 1,685 females (0.62%), and 627 males (0.27%), a female bias of 2.29:1. Surprisingly, 44%–69% of these failed to re-report CFS at subsequent visits to the UKB. As expected, those who self-reported a CFS diagnosis are significantly less likely to report ‘excellent’ or ‘good’ overall health, and significantly more likely to report ‘poor’ or ‘fair’ overall health, when compared to those who did not self-report a diagnosis of CFS (p<10⁻⁵; Table 1). Most of C1 reported ‘poor’ or ‘fair’ overall health. Unexpectedly, however, 28% reported ‘good’ or ‘excellent’ overall health when they first volunteered their CFS diagnosis (Table 1).

Table 1. Self-reported overall health ratings in C1.

Numbers and percentages of those who self-reported CFS at visits to the UKB assessment centre, stratified by gender (top). Percentages of those who did not self-report CFS at their first visit to the UK Biobank centre are also shown (bottom). Overall health ratings are as reported at the time an individual first reported CFS. Ambiguous responses ‘Do not know’ and ‘Prefer not to answer’ (n=22) are excluded. A small number of UKB participants failed to complete this question.

| Group | Overall Health Rating | Both sexes (n=2,289) | Female (n=1,667) | Male (n=622) |
| --- | --- | --- | --- | --- |
| Self-Reported CFS (at any visit) | Poor | 771 (33%) | 540 (32%) | 231 (36%) |
| | Fair | 857 (37%) | 646 (38%) | 211 (33%) |
| | Good | 590 (25%) | 432 (25%) | 158 (25%) |
| | Excellent | 71 (3%) | 49 (3%) | 22 (4%) |
| Did not Self-Report CFS at Visit 1 | Poor | 22,040 (4%) | 10,249 (4%) | 11,791 (5%) |
| | Fair | 104,524 (21%) | 52,286 (19%) | 52,238 (23%) |
| | Good | 288,402 (58%) | 161,111 (59%) | 127,291 (56%) |
| | Excellent | 81,770 (16%) | 46,266 (17%) | 35,504 (16%) |

C1 has variable evidence of ME/CFS from other sources, when available. Among those who completed the PQ, most (89%, n=850) provided a concordant response, answering ‘yes’ to having a clinical diagnosis of ME/CFS. However, only one-third of C1 with hospital inpatient records linked in UKB had an ICD-10:G93.3 code: 714 of 2,148 (33%; 531 females, 183 males). Further, only half (55%; n=617) of C1 have a ME/CFS code in their UKB primary care records: 431 females and 186 males. In all, over one-third (36%) of C1 have no further evidence of ME/CFS (602 females, 220 males), although this is in part due to data missingness. Thirty-nine percent of C1 (894 people) have 1 further piece of evidence, 22% (501) have 2, and 4% (95) have all 3.

Cohort 2: ME/CFS defined in the Pain Questionnaire

Among 167,184 participants who completed the PQ, 2,720 (1.63%) are in cohort 2 (C2), having answered ‘Yes’ to “Have you ever been told by a doctor that you have ME/CFS?” Of these, 1,929 were among the 94,988 females (2.0%) and 791 were among the 72,196 males (1.1%), a female bias of 1.85:1. Among those saying ‘Yes’ are 850 who are also present in C1 (see above), leaving 1,870 cases newly reported in C2 (1,299 female, 571 male).
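The prevalence figures and female bias quoted for C2 follow directly from the counts in this paragraph. The sketch below reproduces that arithmetic; the helper name `prevalence` is hypothetical, and the female bias is taken here as the ratio of the female to the male prevalence.

```python
def prevalence(cases: int, population: int) -> float:
    """Proportion of a population classified as cases."""
    return cases / population


overall = prevalence(2720, 167184)   # all PQ respondents
female = prevalence(1929, 94988)     # female PQ respondents
male = prevalence(791, 72196)        # male PQ respondents
ratio = female / male                # female-to-male prevalence ratio

print(f"{overall:.2%} overall, {female:.2%} female, {male:.2%} male")
print(f"female bias {ratio:.2f}:1")  # female bias 1.85:1
```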

Among those in C2, UKB has access to hospital inpatient data for 2,527 (92%) and primary care code records for 1,315 (48%). Of those with available data, only 459 (18%) have an ICD-10:G93.3 code and 441 (34%) have a ME/CFS code in their primary care records. Just over half (57%, n=1,556) of C2 have no further evidence of ME/CFS (i.e., are not in C1, C3 or C4), in part due to missingness in the data. A quarter (673; 25%) have 1 further piece of evidence, 15% (396) have 2 further pieces of evidence, and 3% (95) have all 3.

Cohort C2 were significantly more likely than those in neither C1 nor C2 (‘controls’; C1ᶜC2ᶜ) to have: been diagnosed with fibromyalgia (16-fold higher), been diagnosed with migraine (2.0-fold higher), been troubled by pain or discomfort for >3 months (1.4-fold), experienced headaches in the past 6 months (1.7-fold), experienced persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months (3.5-fold), and experienced fatigue or exhaustion in the past week (2.3-fold); all p<10⁻⁵ (Figure 1A). ME/CFS cohort C2 was also significantly more likely than controls to have experienced mild, moderate or severe problems with: unrefreshing sleep in the past week (1.6-fold higher), cognitive difficulties in the past week (1.9-fold), mobility problems today (1.9-fold), self-care problems today (3.0-fold), problems with usual activities today (2.0-fold), and frequent problems with tiredness or little energy in the last 2 weeks (1.7-fold; all p<10⁻⁵) (Figure 1B).


Figure 1. ‘Yes’ responses to ME/CFS-relevant symptom and comorbidity questions as asked in the PQ.

Higher proportions of 6 comorbidities (A) and 6 symptoms (B) in a ME/CFS cohort than among controls. Cohort 2 (C2, ME/CFS, PQ; blue) versus those answering ‘No’ in the PQ and who never self-reported CFS in the verbal interviews (‘controls’, C1ᶜC2ᶜ; orange). Note that 2,066 individuals who answered ‘I don’t know’ and 19 who answered ‘Prefer not to answer’ to the ME/CFS question in the PQ are not included among controls due to their ambiguous response. (A) C2 individuals were significantly more likely than controls to have been diagnosed with fibromyalgia (21.8% versus 1.3%); migraine (37.5% versus 18.9%); to have been troubled by pain or discomfort for >3 months (77.2% versus 55.5%); experienced headaches in the past 6 months (56.4% versus 33.2%); experienced persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months (68.9% versus 19.9%); and experienced fatigue or exhaustion in the past week (73.9% versus 32.0%); all p<10⁻⁵ (χ²-test). (B) C2 individuals were significantly more likely than controls to have experienced mild, moderate or severe problems with: unrefreshing sleep in the past week (86.0% versus 54.3%), cognitive difficulties in the past week (69.2% versus 36.4%), mobility problems today (52.2% versus 28.2%), self-care problems today (25.4% versus 8.5%), problems with usual activities today (61.3% versus 31.2%), and were more likely to experience feeling tired or having little energy in the past 2 weeks (86.1% versus 50.5%) (all p<10⁻⁵; χ²-test).
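The comparisons above combine a fold difference between two percentages with a χ²-test on a 2×2 table of ‘yes’ versus other responses. The following is a minimal sketch of both calculations under stated assumptions: the function names are hypothetical, the test is Pearson's χ² without continuity correction (the text does not say which variant was used), and the 2×2 counts in the example are invented for illustration because the control group's denominators are not given here.

```python
import math


def fold_difference(pct_cohort: float, pct_controls: float) -> float:
    """Ratio of the cohort percentage to the control percentage."""
    return pct_cohort / pct_controls


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Rows are groups (cohort, controls); columns are 'yes' and 'not yes'.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-squared upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Fold difference for persistent fatigue (68.9% versus 19.9%).
fatigue_fold = fold_difference(68.9, 19.9)
print(round(fatigue_fold, 1))  # 3.5

# Hypothetical counts, for illustration only.
stat, p_value = chi2_2x2(10, 20, 20, 10)
print(round(stat, 2), p_value < 0.05)
```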

Among cohort C2 were 1,876 (69.0%) who answered ‘yes’ to “Do you have persistent or recurrent tiredness, weariness or fatigue that has lasted for at least 6 months?” These UKB participants then answered 3 questions that are relevant to post-exertional malaise. They were, as expected, significantly more likely than those in C1ᶜC2ᶜ controls to report getting tired after minimal physical or mental exertion (1.6-fold), and significantly less likely to report that their tiredness goes away when resting (0.53-fold), or that their tiredness was only due to over-exertion (0.42-fold; all p<10⁻⁵; Figure 2). Just under two-thirds (394 of 617; 64%) of those in both C1 and C2 answered these 3 questions in a manner consistent with post-exertional malaise (i.e., ‘Yes’, ‘No’, ‘No’; Figure 2), higher than the 57% (1,073 of 1,876) in C2 who answered these in the same way. These questions are thus valid for further refining ME/CFS caseness.
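The ‘Yes’, ‘No’, ‘No’ response pattern can be expressed as a simple predicate over the three binary (affected/unaffected) answers. This is a sketch only; the function and argument names are hypothetical and do not correspond to UKB field names.

```python
def pem_consistent(tired_after_minimal_exertion: bool,
                   fatigue_goes_away_with_rest: bool,
                   fatigue_only_from_overexertion: bool) -> bool:
    """True for the 'Yes', 'No', 'No' pattern described in the text."""
    return (tired_after_minimal_exertion
            and not fatigue_goes_away_with_rest
            and not fatigue_only_from_overexertion)


print(pem_consistent(True, False, False))  # True
print(pem_consistent(True, True, False))   # False

# Shares quoted in the text, recomputed from the counts.
share_c2 = 1073 / 1876       # C2 members asked the three questions
share_c1_and_c2 = 394 / 617  # members of both C1 and C2
print(f"{share_c2:.0%} {share_c1_and_c2:.0%}")  # 57% 64%
```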


Figure 2. ‘Yes’ responses as a percentage of each cohort for post-exertional malaise-related questions.

Questions were asked in the UK Biobank Experience of Pain Questionnaire (PQ). The ME/CFS (PQ) cohort C2 (dark blue) was significantly more likely than C1ᶜC2ᶜ controls (orange) to report becoming tired after minimal physical or mental exertion (76.1% versus 48.2%, respectively), significantly less likely to report that their tiredness goes away when resting (25.2% versus 47.6%, respectively), and significantly less likely to report that their tiredness is only due to over-exertion (6.9% versus 16.6%, respectively); all p<10⁻⁵. Individuals in both cohorts 1+2 (concordant cohort: C1 and C2; C1C2; grey) more often reported feeling tired after minimal physical or mental exertion, than those in cohort 2 but not 1 (C2C1ᶜ; yellow; 80.6% versus 74.0%, respectively) or in cohort 1 but not 2 (C1C2ᶜ; light blue; 80.6% versus 65.0%, respectively). Similarly, those in C1C2 (grey) were less likely than those in C2C1ᶜ or C1C2ᶜ (20.4% versus 27.5% or 31.8%, respectively) to report that their tiredness went away when resting. The discordant cohort (C1 not C2: C1C2ᶜ; light blue; i.e., self-reported CFS in verbal interview but answered ‘No’ or ‘do not know’ or ‘prefer not to answer’ in the PQ) was significantly more likely to report that their tiredness was due only to over-exertion than the concordant cohort (C1C2; grey): 12.7% versus 5.8%, respectively. Asterisks denote significant differences between cohorts (χ² test; p<10⁻⁵ ***; p<0.01 **; p<0.05 *; n/s, not significant at p<0.05).

Cohort 3: Hospital Inpatient Data: ICD-10 G93.3

In total, 446,974 individuals (244,747 female, 202,227 male) in the UKB underwent 1 or more hospital admission and so have linked hospital inpatient codes. Of these, 1,407 (0.31%) have an ICD-10 code for ‘Post-viral fatigue syndrome’ (G93.3); 1,048 are female (0.43%) and 359 male (0.18%), giving a female-to-male ratio of 2.4:1.

Among those in cohort 3 (C3), 51% (714) are also in C1 (‘self-reported CFS’); 88% (459) are in C2 (Pain Questionnaire), and 60% (412) have a Primary Care code for ME/CFS (i.e., are in cohort C4; see below), when data is available. Altogether, among 1,407 in C3 (with an ICD-10:G93.3 code), 398 (28%) lack further evidence of ME/CFS, 528 (38%) have 1 further piece of evidence, 386 (27%) have 2 further pieces of evidence and 95 (7%) have 3 further pieces of evidence.

Cohort 4: Primary care records

Among all UKB participants, 230,055 have linked primary care records, of whom 1,575 (0.68%; cohort C4) have a ME/CFS code: 1,071 have a CTV3 code, 505 have a ReadV2 code and 1 has both types of code. C4 contains 1,109 females (0.88%) and 466 males (0.45%), giving a female-to-male ratio of 1.96:1.

The clinical diagnoses of people in C4 allowed us to assess the diagnostic validity of C1–3. We defined concordances between cohorts using only those individuals for whom all relevant data was available. Concordance with C4 was highest for C2 (71%), followed by C1 (39%), and lowest for C3 (28%). Despite their known clinical diagnosis in primary care, 61%, 29% and 72% of UKB participants in C4 thus failed to self-report CFS in the verbal interview (C1), failed to self-report ME/CFS in the PQ (C2) or lacked an ICD-10:G93.3 code in their linked hospital inpatient data (C3), respectively. Among 958 individuals in C4 who did not self-report CFS in the verbal interview, 32% (302) have additional evidence of ME/CFS from the PQ or ICD-10:G93.3 code.
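Restricting concordance to individuals with all relevant data available can be sketched with set operations: the denominator is limited to members of one cohort who also have data from the second source. The function name, argument names and participant IDs below are hypothetical toy values, not UKB data.

```python
def concordance(cohort_a: set, cohort_b: set, has_data_b: set) -> float:
    """Share of cohort A also in cohort B, among A members with source-B data."""
    eligible = cohort_a & has_data_b
    if not eligible:
        raise ValueError("no members of cohort A have data for source B")
    return len(eligible & cohort_b) / len(eligible)


# Toy participant IDs, for illustration only.
c_a = {1, 2, 3, 4, 5}           # members of cohort A
c_b = {1, 2, 9}                 # members of cohort B
with_b_data = {1, 2, 3, 9}      # everyone with source-B data available

toy_concordance = concordance(c_a, c_b, with_b_data)
print(f"{toy_concordance:.0%}")  # 67%
```

Participants 4 and 5 are excluded from the denominator because they have no source-B data, which is why missingness lowers the ‘no further evidence’ counts less than a naive proportion would suggest.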

In all, 656 (42%) of those in C4 have no further evidence of ME/CFS, 463 (29%) have 1 further piece of evidence, 361 (23%) have 2 further pieces of evidence and 95 (6%) have 3 further pieces of evidence.

ME/CFS status in 4 cohorts

The prevalence of ME/CFS in C1-4 varied between 0.31% and 1.63% for both sexes combined, 0.18%-1.1% for males, and 0.43%-2.0% for females (Table 2). In the UKB, there are 5,354 individuals with 1 or more pieces of evidence for ME/CFS: 3,432 (64%) have 1 piece of evidence, 1,279 (24%) have 2 pieces of evidence, 548 (10%) have 3 pieces of evidence and 95 (2%) have all 4 pieces of evidence (Table 3).

Table 2. Prevalence and female-to-male (F/M) ratio of ME/CFS across different sources of ME/CFS data in UKB.

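| UKB Cohort | Subset | Prevalence | F/M ratio |
| --- | --- | --- | --- |
| C1: Self-report CFS | Both sexes | 0.46% | 2.29:1 |
| | Female | 0.62% | |
| | Male | 0.27% | |
| C2: ME/CFS Pain Questionnaire | Both sexes | 1.63% | 1.85:1 |
| | Female | 2.03% | |
| | Male | 1.10% | |
| C3: ICD-10: G93.3 code | Both sexes | 0.31% | 2.38:1 |
| | Female | 0.43% | |
| | Male | 0.18% | |
| C4: GP code | Both sexes | 0.68% | 1.96:1 |
| | Female | 0.88% | |
| | Male | 0.45% | |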

Table 3. Numbers of UKB participants with 1, 2, 3 or 4 pieces of evidence for ME/CFS.

Among 5,354 individuals in the UKB with at least 1 piece of ME/CFS evidence: 3,432 have 1 piece of evidence, 1,279 have 2 pieces of evidence, 548 have 3 pieces of evidence and 95 have all 4 pieces of evidence. Each of the 5,354 participants is counted once only in this table.

| Source of Data | n |
| --- | --- |
| 1 piece of ME/CFS evidence: | |
| Self-reported CFS (C1) only | 822 |
| ME/CFS in PQ (C2) only | 1,556 |
| ICD-10:G93.3 (C3) only | 398 |
| Primary Care code for ME/CFS (C4) only | 656 |
| 2 pieces of ME/CFS evidence: | |
| Self-reported CFS & ME/CFS in PQ only (C1 & C2 only) | 406 |
| Self-reported CFS & ICD-10:G93.3 only (C1 & C3 only) | 280 |
| Self-reported CFS & Primary Care code for ME/CFS only (C1 & C4 only) | 208 |
| ME/CFS in PQ & ICD-10:G93.3 only (C2 & C3 only) | 130 |
| ME/CFS in PQ & Primary Care code for ME/CFS only (C2 & C4 only) | 137 |
| ICD-10:G93.3 & Primary Care code for ME/CFS only (C3 & C4 only) | 118 |
| 3 pieces of ME/CFS evidence: | |
| Self-reported CFS & ME/CFS in PQ & ICD-10:G93.3 only (C1 & C2 & C3 only) | 187 |
| Self-reported CFS & ICD-10:G93.3 & Primary Care code for ME/CFS only (C1 & C3 & C4 only) | 152 |
| ME/CFS in PQ & ICD-10:G93.3 & Primary Care code for ME/CFS only (C2 & C3 & C4 only) | 47 |
| Self-reported CFS & Primary Care code for ME/CFS & ME/CFS in PQ only (C1 & C2 & C4 only) | 162 |
| 4 pieces of ME/CFS evidence: | |
| Self-reported CFS & ME/CFS in PQ & ICD-10:G93.3 & Primary Care code for ME/CFS (C1 & C2 & C3 & C4) | 95 |
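Because each participant is counted exactly once in Table 3, the cohort sizes and the totals by number of pieces of evidence can be recovered by summing its cells. The sketch below does this as a consistency check; the variable names are hypothetical and the counts are copied from the table.

```python
from collections import Counter

# Table 3 cells: key = exact set of cohorts a participant belongs to.
TABLE3 = {
    ("C1",): 822, ("C2",): 1556, ("C3",): 398, ("C4",): 656,
    ("C1", "C2"): 406, ("C1", "C3"): 280, ("C1", "C4"): 208,
    ("C2", "C3"): 130, ("C2", "C4"): 137, ("C3", "C4"): 118,
    ("C1", "C2", "C3"): 187, ("C1", "C3", "C4"): 152,
    ("C2", "C3", "C4"): 47, ("C1", "C2", "C4"): 162,
    ("C1", "C2", "C3", "C4"): 95,
}

by_evidence = Counter()  # participants by number of pieces of evidence
by_cohort = Counter()    # total membership of each cohort
for cohorts, n in TABLE3.items():
    by_evidence[len(cohorts)] += n
    for cohort in cohorts:
        by_cohort[cohort] += n

total = sum(TABLE3.values())
print(total)              # 5354
print(dict(by_evidence))  # {1: 3432, 2: 1279, 3: 548, 4: 95}
print(dict(by_cohort))    # {'C1': 2312, 'C2': 2720, 'C3': 1407, 'C4': 1575}
```

The recovered cohort sizes match those reported for C1 to C4 in the Results.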

Discussion

Of 5,354 individuals in the UKB with evidence of a ME/CFS diagnosis, two-thirds (3,432; 64%) have only 1 piece of evidence. Also, none of the four pieces of evidence for ME/CFS is strongly supported by any other (Figure 3). The exceptions are the ME/CFS-PQ cohort (C2), whose evidence supports membership of each of the 3 other cohorts at >70%, when complete data is available, and the ICD-10:G93.3 cohort (C3), of whom 72% have supporting evidence from other data sources. Conversely, membership of each of C1, C3 or C4 poorly supports C2 membership. Here we speculate why there are concordances and discordances among the 4 cohorts, including with respect to likely biases in self-reported and electronic health record data.


Figure 3. Concordance of ME/CFS evidence across data sources for each of the 4 UKB cohorts.

This considers only individuals with available data (i.e., no missingness). Each bar indicates the percentage of the cohort (e.g. C1 – left) who are also members of a second cohort (e.g. C2 – light blue), accounting for missing data (i.e., only considering those with linked data required for specifying C1 and C2 membership). A total of 850 are members of both C1 and C2 cohorts. Among those for whom data was available, this is 89% of C1 members, and 31% of all C2 members, a discrepancy that could reflect participation bias in completion of the Pain Questionnaire and/or that its completion was approximately 10 years after the baseline questionnaire. Unlike other values, the ‘No further data’ values (dark blue) do not account for data missingness. Significance testing was not employed due to data overlap and missingness between cohorts.

Membership of cohorts C1–4 will be inflated when ‘Chronic fatigue’ has been wrongly conflated with ‘Chronic fatigue syndrome’ by clinicians or UKB participants and nurses. ‘Chronic Fatigue’ occurs at much higher prevalence than ‘Chronic Fatigue Syndrome’28,29 yet, importantly, does not require hallmark ME/CFS symptoms such as post-exertional malaise. This conflation may explain why, in one study, only about one-third of GP-diagnosed ME/CFS cases met a more stringent ME/CFS case definition30. Membership will also be inflated, relative to the general UK population, due to UKB’s sociodemographic biases including age, ethnicity and gender16,18, which all favour increased rates of ME/CFS diagnosis31. Cohort 2 membership was also inflated due to participation bias: those in C1 were significantly more likely to complete the subsequent optional online PQ than others (41% versus 33%, respectively; p<10⁻⁵).

Cohort C1 (n=2,312), without further filtering, does not have strong validity for ME/CFS research: it has unexpectedly high levels of individuals reporting ‘good’ or ‘excellent’ overall health (28%), many (44%–69%) fail to re-report CFS at subsequent visits to the UKB, 11% (n=105) of those who also completed the PQ provided a discordant response when asked about a clinical ME/CFS diagnosis, and there were high levels of discordance with hospital inpatient and primary care data. It is thus plausible that some within this cohort have mis-reported a diagnosis of CFS, been self-diagnosed (rather than receiving a clinical diagnosis), been diagnosed with ‘Chronic Fatigue’ rather than ME/CFS, been misdiagnosed with ME/CFS previously and since received an alternative appropriate diagnosis, or had CFS erroneously recorded during the verbal interview. Previous analyses that used C1 are thus called into question. For example, genetic studies of ME/CFS, which to be well-powered require a homogeneous population, have not yielded consistent findings using this cohort20,32–36. These studies also used up to 3,042 UKB participants as controls who nevertheless have evidence of ME/CFS (i.e., those in C2–4, but not C1). These individuals would have been included as controls in previous ME/CFS analyses, likely weakening their power to detect true associations. We recommend filtering C1 cases by removing (i) those with ‘good’ or ‘excellent’ overall health ratings, and (ii) those who provided a discordant response in the PQ. This strategy was employed in a separate analysis37, which resulted in a larger yield of ME/CFS-associated blood traits.
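The two recommended C1 filters can be sketched as a predicate over per-participant records. Everything below is illustrative: the `Participant` class, its field names and the toy records are assumptions, and a ‘discordant’ PQ response is taken to be any answer other than ‘yes’ from someone who completed the PQ. Participants who did not complete the PQ, or who gave an ambiguous overall health rating, are retained in this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    pid: int
    overall_health: str              # 'poor', 'fair', 'good' or 'excellent'
    pq_mecfs: Optional[str] = None   # PQ answer; None if PQ not completed


def keep_in_c1(p: Participant) -> bool:
    """Apply the two filters recommended in the text to a C1 member."""
    if p.overall_health.lower() in {"good", "excellent"}:
        return False  # (i) 'good' or 'excellent' overall health
    if p.pq_mecfs is not None and p.pq_mecfs.lower() != "yes":
        return False  # (ii) discordant PQ response
    return True


# Toy records, for illustration only.
c1 = [
    Participant(1, "poor", "yes"),
    Participant(2, "good", "yes"),
    Participant(3, "fair", "no"),
    Participant(4, "fair", None),
]
kept_ids = [p.pid for p in c1 if keep_in_c1(p)]
print(kept_ids)  # [1, 4]
```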

Cohort C2 (n=2,720) might be considered to have strong validity for ME/CFS research because it is defined by the explicit question “Have you ever been told by a doctor that you have ME/CFS?” Also, those in C2 reported levels of fatigue, muscle pain, headaches, unrefreshed sleep and cognitive difficulties, and a comorbid fibromyalgia diagnosis, similar to those reported in a separate large-scale UK study covering adults of all ages4. Nevertheless, for two reasons we caution against C2 being used for ME/CFS research without further refinement. First, nearly one-third (844; 31.0%) of C2 did not report ‘persistent or recurrent tiredness, weariness or fatigue that has lasted over 6 months’ (p120114, Extended Data), which is a core symptom for ME/CFS diagnosis6. Second, not everyone in C2 appears to experience post-exertional malaise: 448 (24%) of this cohort do not report ‘feeling tired after minimal physical or mental exertion’ (p120117, Extended Data), 472 (25%) report that their ‘tiredness goes away when resting’ (p120115, Extended Data) and 130 (6.9%) report their ‘tiredness is only due to over-exertion’ (p120116, Extended Data). Overall, fewer than half (43%) of those in C2 are also in one or more of the other 3 cohorts. Thus, one possible strategy for filtering C2 is to consider only those who provided responses consistent with post-exertional malaise: namely the 1,073 individuals who answered ‘yes’, ‘no’, ‘no’ to p120117, p120115, p120116, respectively. Nevertheless, these three questions may not effectively define post-exertional malaise because they only capture ‘tiredness’, as opposed to a flare-up of ‘all symptoms’, they overlook ‘new symptoms’ which are characteristic of post-exertional malaise6, and they do not account for people with ME/CFS pacing (e.g., ‘resting’ and not ‘over-exerting’). Alternatively, a more permissive approach would be to include only those who answered ‘yes’ to p120117 (n=1,428).
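The two filtering strategies for C2 described above can be expressed as simple predicates over the three PQ fields. The following Python sketch is illustrative only: the record layout, the lower-case 'yes'/'no' strings and the toy participants are assumptions made for this example, and actual UKB fields use their own coded values that would need decoding first.

```python
from dataclasses import dataclass


@dataclass
class PQResponse:
    """One participant's PQ answers (hypothetical record layout)."""
    eid: int        # participant identifier (toy values below)
    p120117: str    # tired after minimal physical or mental exertion
    p120115: str    # tiredness goes away when resting
    p120116: str    # tiredness is only due to over-exertion


def strict_pem(r: PQResponse) -> bool:
    """Strict filter: 'yes', 'no', 'no' to p120117, p120115, p120116."""
    return (r.p120117, r.p120115, r.p120116) == ("yes", "no", "no")


def permissive_pem(r: PQResponse) -> bool:
    """Permissive filter: only requires 'yes' to p120117."""
    return r.p120117 == "yes"


# Toy data, not UKB records.
toy_c2 = [
    PQResponse(1, "yes", "no", "no"),
    PQResponse(2, "yes", "yes", "no"),
    PQResponse(3, "no", "no", "no"),
    PQResponse(4, "yes", "no", "yes"),
]

strict_ids = [r.eid for r in toy_c2 if strict_pem(r)]
permissive_ids = [r.eid for r in toy_c2 if permissive_pem(r)]
print(strict_ids, permissive_ids)
```

By construction, every participant who passes the strict filter also passes the permissive one, which is why the permissive cohort (n=1,428) is larger than the strict cohort (n=1,073).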

The largest intersection between cohorts is between C1 and C2 (n=850; Table 3). These individuals are slightly more likely to report ME/CFS symptoms than those in C2 who did not self-report CFS at baseline (i.e., not in C1): they are significantly more likely to be tired after minimal physical or mental exertion (81% vs 74%), and less likely to report that their tiredness goes away after rest (20% vs 27%; Figure 2). Further requiring participants to have an ICD-10:G93.3 or ME/CFS primary care code reduces their number considerably, to 187 or 162, respectively. Whilst this provides cohorts with support from health record data, case numbers dwindle to approximately 20–40 when further intersected with UKB participants with plasma protein, brain image or accelerometry data, for example38–40, which would substantially weaken statistical power in such studies. Mandating participants to have at least 1 piece of self-report data (C1 or C2) and at least 1 piece of EHR data (C3 or C4) boosts the case number to 1,398, a quarter (26%) of all UKB ME/CFS (C1–4) cases.
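The requirement for at least one piece of self-report evidence and at least one piece of EHR evidence amounts to intersecting two unions of cohort membership. A minimal Python sketch follows, assuming each cohort is held as a set of participant identifiers; the function name and the toy identifiers are hypothetical.

```python
def multi_evidence_cases(c1: set, c2: set, c3: set, c4: set) -> set:
    """Participants with >=1 self-report record and >=1 EHR record."""
    self_report = c1 | c2   # baseline interview or PQ
    ehr = c3 | c4           # hospital inpatient or primary care
    return self_report & ehr


# Toy identifiers, not UKB participants.
c1 = {1, 2, 3}
c2 = {2, 4}
c3 = {3, 5}
c4 = {4, 6}

cases = multi_evidence_cases(c1, c2, c3, c4)
any_evidence = c1 | c2 | c3 | c4
print(sorted(cases), len(cases) / len(any_evidence))
```

In this toy example, participant 2 appears in both C1 and C2 but is excluded because both records are self-report, whereas participants 3 and 4 each combine one self-report record with one EHR record.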

Cohort C3 has the most support from other cohorts (72%). Nevertheless, using C3 without further filtering is not advised as some of its members may have ‘Post-viral Fatigue Syndrome’, which does not require post-exertional malaise, rather than ME/CFS, which does. Thus, we recommend restricting C3 to the 1,009 individuals who also fall within one of the other cohorts: C1, C2 or C4. A further caveat to consider is that C3 will likely not include the one-third of people with ME/CFS who do not have an infection prior to the onset of their symptoms7. It is also the smallest cohort: 26% of all individuals in C1–4 (Table 2); this fraction reduces to 19% when the recommended filters are applied.

As a clinically-defined cohort, C4 has strong validity and numbers are relatively high (n=1,575). Nevertheless, we recommend removing those with discordant PQ data, specifically the ~29% of participants (n=177) who completed the PQ yet failed to answer ‘Yes’ to the question: “Have you ever been told by a doctor you have ME/CFS?” Doing so reduces this cohort to about 1,120 (of 230,055 with GP records), or a UKB prevalence of 0.49%. Extrapolating this percentage to the UK population41 predicts 330,000 people with a lifetime risk of an ME/CFS diagnosis. This is lower than an estimate of 404,000 that assumed equal access to diagnosis, which is far from being achieved31.
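The extrapolation above can be reproduced approximately with the following Python sketch. The case count and denominator are taken from the text; the UK population value is not stated in this section, so the figure of 67.6 million used here is an assumption for illustration.

```python
FILTERED_C4_CASES = 1_120        # from the text ("about 1,120")
GP_RECORD_PARTICIPANTS = 230_055  # from the text
UK_POPULATION = 67_600_000       # ASSUMED value, not given in this section

# Prevalence among UKB participants with linked GP records.
prevalence = FILTERED_C4_CASES / GP_RECORD_PARTICIPANTS
prevalence_pct = round(prevalence * 100, 2)

# Extrapolate to the whole UK and round to the nearest 10,000.
uk_estimate = round(prevalence * UK_POPULATION, -4)

print(f"UKB prevalence: {prevalence_pct}%")
print(f"Extrapolated UK estimate: {uk_estimate:,.0f}")
```

Under this assumed population size, the result rounds to the 330,000 quoted above; a different population figure would shift the estimate proportionally.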

We recommend that control cohorts exclude C1–4 members, and also exclude those who provided an ambiguous response in the PQ or who have ME/CFS referral codes in their primary care records.
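This recommendation for control selection is a sequence of set exclusions. The short Python sketch below assumes each group is available as a set of participant identifiers; the function name, argument names and toy data are hypothetical.

```python
def select_controls(all_participants: set, cases: set,
                    ambiguous_pq: set, referral_coded: set) -> set:
    """Remove C1-4 members, ambiguous PQ responders and referral-coded participants."""
    return all_participants - cases - ambiguous_pq - referral_coded


# Toy identifiers, not UKB participants.
everyone = set(range(1, 11))
c1_to_c4 = {1, 2, 3}
ambiguous = {4}
referred = {2, 5}   # may overlap with the case cohorts

controls = select_controls(everyone, c1_to_c4, ambiguous, referred)
print(sorted(controls))
```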

This analysis benefits from the diversity of data types being compared for the same population-scale cohort, UKB, rather than for other biobanks, such as FinnGen and the Estonian Biobank, which link to ME via ICD-10 codes only42,43. The UKB ME cohort is the largest globally. The All of Us CFS cohort44 might appear to be larger (at 8,600 participants) but 93% of this cohort are linked to chronic fatigue, unspecified (R53.82 ICD-10CM) rather than to ME or PVFS. This study’s weaknesses are that missing UKB data are assumed to be missing completely at random, and that the healthy participation bias inherent in the UKB18 will result in disproportionately few UKB participants who were severely or even moderately affected by ME/CFS. Accounting for the estimated one-quarter of people with ME/CFS who had no opportunity to participate in UKB due to being bed- or house-bound45, our estimated ME/CFS prevalence in the UK then rises from 330,000 to 410,000 (0.6%).
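The adjustment for bed- or house-bound individuals can be checked arithmetically. The calculation used is not spelled out in this section, so the Python sketch below compares two possible readings; both readings and the UK population figure of 67.6 million are assumptions for illustration.

```python
BASE_ESTIMATE = 330_000      # from the text
UNABLE_FRACTION = 0.25       # "one-quarter", from the text
UK_POPULATION = 67_600_000   # ASSUMED value, not given in this section

# Reading A: add one-quarter of the base estimate.
added_quarter = BASE_ESTIMATE * (1 + UNABLE_FRACTION)

# Reading B: treat the base estimate as the observed three-quarters.
rescaled = BASE_ESTIMATE / (1 - UNABLE_FRACTION)

prevalence_pct_a = round(added_quarter / UK_POPULATION * 100, 1)

print(f"Reading A: {added_quarter:,.0f} ({prevalence_pct_a}%)")
print(f"Reading B: {rescaled:,.0f}")
```

Reading A gives 412,500, which is numerically consistent with the reported 410,000 (0.6%) after rounding; Reading B would give 440,000.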

Ethical approval

UK Biobank has approval from the North West Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB) (2011, renewed 2016 and 2021). This approval means that researchers do not require separate ethical clearance and can operate under the RTB approval.

Consent

The basis for consent in UK Biobank rests on participants' explicit and informed consent. UK Biobank uses “legitimate interests” as the primary lawful basis on which to process personal data under the UK GDPR.

how to cite this article
Samms GL and Ponting CP. Defining a High-Quality Myalgic Encephalomyelitis/Chronic Fatigue Syndrome cohort in UK Biobank [version 1; peer review: 2 approved]. NIHR Open Res 2025, 5:39 (https://doi.org/10.3310/nihropenres.13956.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 29 May 2025
Matthias Wielscher, Medical University of Vienna, Vienna, Austria 
Leonardo Vincenzi, Department of Dermatology, Medical University of Vienna, Vienna, Vienna, Austria 
Approved
The study by Samms and Ponting presents a comprehensive approach to defining ME/CFS cohorts within the UK Biobank using a combination of self-reported questionnaire responses and linked healthcare records. They delineate four cohorts: C1, based on self-reported ME/CFS at baseline; ...
HOW TO CITE THIS REPORT
Wielscher M and Vincenzi L. Reviewer Report For: Defining a High-Quality Myalgic Encephalomyelitis/Chronic Fatigue Syndrome cohort in UK Biobank [version 1; peer review: 2 approved]. NIHR Open Res 2025, 5:39 (https://doi.org/10.3310/nihropenres.15168.r35434)
Reviewer Report 28 May 2025
Joanna Elson, Newcastle University, Newcastle upon Tyne, England, UK 
Approved
The paper considers the progress in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) research which has been frustratingly slow. This is due to a number of factors one being relatively small-scale studies are common in the field and the results of these ...
HOW TO CITE THIS REPORT
Elson J. Reviewer Report For: Defining a High-Quality Myalgic Encephalomyelitis/Chronic Fatigue Syndrome cohort in UK Biobank [version 1; peer review: 2 approved]. NIHR Open Res 2025, 5:39 (https://doi.org/10.3310/nihropenres.15168.r35428)