Detecting and responding to fake responses to an online infant care survey in the UK

Alice-Amber Keegan; Becky Lambert; Jenny Ingram; Peter S. Blair; Peter J. Fleming; Anna Pease

doi:10.3310/nihropenres.14281.1

Method Article

[version 1; peer review: awaiting peer review]

Alice-Amber Keegan¹, Becky Lambert¹, Jenny Ingram¹, Peter S. Blair¹, Peter J. Fleming¹, Anna Pease ¹

PUBLISHED 29 May 2026

¹ University of Bristol Medical School, Bristol, England, UK

Alice-Amber Keegan
Roles: Conceptualization, Formal Analysis, Methodology, Writing – Original Draft Preparation

Becky Lambert
Roles: Data Curation, Formal Analysis, Investigation, Methodology, Writing – Review & Editing

Jenny Ingram
Roles: Conceptualization, Investigation, Methodology, Supervision, Writing – Review & Editing

Peter S. Blair
Roles: Conceptualization, Methodology, Supervision, Writing – Review & Editing

Peter J. Fleming
Roles: Conceptualization, Methodology, Supervision, Writing – Review & Editing

Anna Pease
Roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Writing – Original Draft Preparation

Abstract

Background

This paper outlines the steps taken to identify fraudulent responses to an online survey of infant care practices and account for this in the subsequent analysis. A survey of infant care practices in England was conducted in 2022, offering a prize draw as an incentive, and promoted via social media with a public link. During the data cleaning process, it became clear that the survey contained fake responses and a process to identify and remove suspicious submissions was developed.

Methods

This method involved a 5-step process: 1) verifying genuine responses, 2) removing duplicate responses, 3) assessment of red and amber flags, 4) adding back in validated responses, 5) analysis of included and excluded responses.

Results

Overall, 209/3409 (6.1%) of responses were identified as suspicious and removed.

Conclusion

We present our reasoning at each stage of the process and suggest some principles which may be helpful for other research teams faced with a similar predicament.

Plain Language Summary

This article describes how a research team discovered and dealt with fake responses in an online survey about infant sleep practices in England. The survey, part of the Baby Sleep Project, was launched in 2022 to understand how parents care for their babies during sleep. It was shared widely on social media and offered a small prize draw, which made it easy for people to take part—but also made it vulnerable to fraud.

Keywords

Research Methods and Statistics; Community Pediatrics; Epidemiology

Corresponding author: Anna Pease

Competing interests: No competing interests were disclosed.

Grant information: Dr Anna Pease, Research Fellow, NIHR300820, is funded by the NIHR for this research project. The views expressed in this publication are those of the author(s) and not necessarily those of the NIHR, NHS or the UK Department of Health and Social Care.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © Crown copyright, 2026 Keegan AA et al. This open access work is licensed under the Open Government Licence v3.0

How to cite: Keegan AA, Lambert B, Ingram J et al. Detecting and responding to fake responses to an online infant care survey in the UK [version 1; peer review: awaiting peer review]. NIHR Open Res 2026, 6:52 (https://doi.org/10.3310/nihropenres.14281.1) First published: 29 May 2026, 6:52 (https://doi.org/10.3310/nihropenres.14281.1) Latest published: 29 May 2026, 6:52 (https://doi.org/10.3310/nihropenres.14281.1)

Introduction

Conducting online surveys can feel ‘too good to be true’; they are quick to create and disseminate. Using online surveys allows for rapid collection of large-scale data from geographically dispersed participants at low cost.¹ The use of email and social media to reach respondents can make engaging in research more inclusive and help include underserved populations in research.² Completing surveys online preserves anonymity, supporting accurate data collection on sensitive topics.²,³,⁴ However, ensuring anonymity makes it difficult to verify that responses are genuine.

The extent of survey fraud and its impact on findings are mostly unknown. Even small levels of fraud can distort results.⁴ Credé⁵ demonstrated that as little as 5% random response can significantly inflate observed correlations. Konstan⁶ excluded 11% of their responses as invalid; the fraudulent data falsely indicated a strong demand for Spanish-language HIV-prevention materials—findings not supported by the validated responses. Without proper validation, such distortions can lead to inappropriate recommendations. Online researchers are described as in an ‘arms race’ with online fraudsters; as new protections are introduced, they are quickly bypassed.⁷ Reports of fraudulent qualitative participants have also emerged.⁸ Despite these risks, survey fraud is seldom addressed in academic methods training, leaving many new researchers unaware of the issues when conducting online surveys.

Online surveys are particularly valuable in research involving families with young infants, many of whom seek reassurance in online communities.⁹ However, topics like infant care can be polarising, and families may fear judgement or social services involvement, especially if discussing potentially risky behaviours. This can lead to social desirability bias, or responding in ways perceived to be acceptable rather than truthful.¹⁰ Anonymity was expected to mitigate this in our study, encouraging honest disclosure. Additionally, the study was designed during the COVID-19 pandemic when face-to-face data collection was uncertain.

Background

The Baby Sleep Project is a series of studies to support vulnerable families and prevent Sudden Unexpected Death in Infancy (SUDI).¹¹ We launched a national survey on 14th June 2022 to understand infant care practices related to safer sleep. The study formed part of an NIHR Fellowship and was conducted via a respected academic online survey platform. The survey collected data on infant care practices relating to safer sleep advice given nationwide in England. As recommended by our public involvement group, an incentive for completing the survey was offered, where respondents could opt-in to be entered into a prize draw to win one of three £50 vouchers. The survey was intended to be conducted face-to-face but due to restrictions imposed by the COVID-19 pandemic at the time, the survey was distributed online.

Survey distribution

The survey was promoted through printed postcards in children’s centres, health professionals in contact with families, and via two well-known children’s charities on Instagram, Facebook and X (formerly Twitter). Posts were public and shareable, allowing the survey link to be widely distributed.

To reduce the risk of survey fraud, participants were informed that prize winners would be contacted by email and asked to verify some data before receiving vouchers. Although multiple fake entries were possible, the risk was considered low due to the topic’s low controversy and lack of other incentives to interfere. The survey platform reported regular security testing, but lacked CAPTCHA protection, support for preventing multiple entries, or the ability to provide IP addresses.

Target respondents

The survey aimed to recruit parents or carers of babies under 1 year of age who lived in England. Survey completion was anonymous; however, respondents could leave their email address or phone number at the end of the survey if they wanted to be contacted for an interview, receive the results of the survey, or be entered into the prize draw.

Patient and public involvement

A group of 14 parent advisors (The Baby Sleep Project Family Advisors) met online to guide the study team on the project as a whole. They contributed to question design, recruitment approaches, and how to interpret the study findings. Members of the group also provided advice about spotting fake responses, and sense-checked our ideas about inconsistencies in answers. Their recommendations shaped how we spotted implausible clothing combinations and discrepancies in co-sleeping answers. They also pointed out that completing a survey between midnight and 6 am should be an amber flag, not a red flag, as many parents are up in the night with infants and may have completed the survey genuinely at this time.

Once the survey closed, the researchers noticed unusual patterns in the data. Some answers didn’t make sense, some text responses were just random characters, and some entries came in within seconds of each other. This raised concerns that automated “bots” or individuals trying to win the prize might have submitted fake responses.

To deal with this, the team developed a five-step process:

1. Identify definitely genuine responses (such as people who later took part in an interview or verified their details when claiming a prize).
2. Find and remove duplicates, unless the respondent had twins.
3. Assess ‘red’ and ‘amber’ flags: Apply “red flags” for clear signs of fraud—like nonsense text, invalid emails, or contradictory answers. Apply “amber flags” for suspicious patterns—like unusual completion times, strange email formats, or inconsistent answers. Entries with two or more amber flags were treated as questionable.
4. Add back in validated responses if they had responded to an email to confirm they were genuine.
5. Analyse the included and excluded responses to see how much impact the fraud has had on the results.

In total, about 6% of responses were removed as likely fake. The team found that fake entries tended to give random or highly unusual answers that did not match real-world patterns.

Online survey fraud is increasingly common but rarely discussed. The article recommends that researchers build checks into their study design, use tools like CAPTCHA, and analyse data carefully to spot suspicious patterns.

Methods

Strategies for detecting fraudulent responses

We defined survey fraud using Lawlor and colleagues’ framework,⁴ including unique participant fraud, alias fraud and suspicious submissions. Unique participant fraud involves one individual submitting multiple responses, either intentionally (e.g., to claim incentives) or unintentionally (e.g., forgetting they had already participated). Alias fraud involves more sophisticated attempts to mask identity. Suspicious submissions refer to any entries potentially falling into either category.

Our identification process was iterative: we checked all responses for expected and potentially deviating patterns, and reviewed the impact of each criterion on the overall dataset. The identification of genuine responses allowed us to test whether we were omitting potentially genuine responses. It was important that we retained as much of the genuine data as possible, whilst still ensuring that suspicious submissions were identified.

We were guided by the Reflect, Expect, Analyse, Label (REAL) framework, developed by Lawlor and colleagues (2021), which asks: (1) Based on your recruitment and distribution practices, how might your survey be vulnerable?; (2) What are the patterns you would expect to see in survey data?; (3) How do expected patterns relate to patterns in reality?; (4) What level of suspicion is sufficient to exclude data from your survey? Table 1 shows how we adapted this framework for our purposes.

Table 1. How we applied the REAL Framework to identify suspicious submissions, adapted from Lawlor et al. (2021).

Step	Question	What did we do?
Reflect	Based on your recruitment and distribution practices, how might your survey be vulnerable?	- Assessed the survey topic as ‘low risk’ for fraud - Considered the possibility of unique participant fraud and established a verification process for prize winners to ensure authenticity - The collection of sensitive information (i.e., illicit drug use) meant that little identifying information was collected
Expect	What are the patterns you would expect to see in survey data?	- Expected survey completion time ~10 mins, although parents may take more time to respond due to childcare responsibilities - Respondents mostly mothers or fathers; high numbers of extended caregivers not expected - Consistency in responses, e.g. ‘last night’s sleep location’ listed under ‘any sleep location’
Analyze	How do expected patterns relate to patterns in reality?	- Random responses that do not relate to the question being asked - Descriptive statistics were used to identify average response time and develop a cut-off - Time clustering of responses: responses submitted <2 s apart
Label	What level of suspicion is sufficient to exclude data from your survey?	Five-step process consisting of identifying: - ‘genuine responses’ - duplicate responses - core red flags - amber flags (≥2 amber flags present)

Key ‘red flags’ spotted

Red flags were developed based on literature, expert advice, and technical support from the University of Bristol and the survey platform. Some flags, such as identical nonsense text entries, strongly indicate fraud, while others (e.g., long completion time) may have benign explanations. We also accounted for legitimate use of anonymous email addresses by some parents participating in paid surveys.

Our approach balanced the need to exclude clearly fraudulent responses (core red flags) while identifying combinations of amber flags that could indicate fraud, aiming to preserve genuine data wherever possible.

Results

The survey closed after 5 months on November 11th, 2022, with 3,409 responses. During data cleaning, we identified potential fraud, initially flagged by repeated or irrelevant free text answers. We developed a five-stage process (Fig 1) to identify and remove fraudulent submissions using objective, replicable criteria, aiming for transparency and data integrity.

Figure 1. Flowchart demonstrating the process for identifying suspicious submissions used by the authors.

Step 1 - Verifying genuine responses

Some responses were verified as genuine and excluded from further checks:

1. Participated in a telephone interview (N = 34; one had twins)
2. Won a £50 prize (verified via email) N = 3

Total verified = 37; remaining unverified = 3,372.

Step 2 – Removing duplicate responses

89% (3049/3409) of responses included email addresses, which were used to identify duplicates. Duplicate responses from parents of twins were retained. For others, only the first response was kept. Most duplicates were not considered suspicious, with 100 non-twin duplicate responses removed.
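The duplicate rule described above (keep the first response per email address, retain repeats from parents of twins) can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the record fields `id`, `email`, `submitted` and `twins` are hypothetical names, and treating email addresses as case-insensitive is an assumption.

```python
def remove_duplicates(responses):
    """Keep the first response per email; retain repeats flagged as twins.

    Each response is a dict with hypothetical keys:
    'id', 'email' (may be None), 'submitted' (sortable), 'twins' (bool).
    """
    seen = set()
    kept, removed = [], []
    for r in sorted(responses, key=lambda r: r["submitted"]):
        # Assumption: emails are compared case-insensitively, ignoring spaces.
        email = (r["email"] or "").strip().lower()
        if not email:
            kept.append(r)  # no email given, so it cannot be matched
            continue
        if email in seen and not r["twins"]:
            removed.append(r)  # later non-twin response from the same email
        else:
            kept.append(r)
        seen.add(email)
    return kept, removed


responses = [
    {"id": 1, "email": "a@example.com", "submitted": 1, "twins": False},
    {"id": 2, "email": "A@example.com ", "submitted": 5, "twins": False},
    {"id": 3, "email": "b@example.com", "submitted": 2, "twins": True},
    {"id": 4, "email": "b@example.com", "submitted": 3, "twins": True},
    {"id": 5, "email": None, "submitted": 4, "twins": False},
]
kept, removed = remove_duplicates(responses)
print([r["id"] for r in kept], [r["id"] for r in removed])
```

Responses without an email address are retained here because they cannot be matched; whether that suits a given study is a judgement call.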

Step 3 – Assessing red and amber flags

We excluded all responses with any core red flags and those with 2 or more amber flags, based on their frequency and association with suspicious patterns.

Core red flags

Three core red flags identified 100 responses within our dataset as suspicious, and these responses were removed:

1. Irrelevant/nonsense free text (e.g., strings of random characters)
2. Invalid email addresses (confirmed via an online validation tool)
3. Contradictory responses (e.g., reporting a multiple birth but only one child in total)
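As a rough sketch of how the three core red flags might be screened automatically, the example below uses simple stand-ins. All of it is assumed for illustration: the field names are hypothetical, the nonsense-text heuristic (a long single token with few vowels) is a crude proxy for manual review, and the email check only tests the shape of the address, whereas the study used an online validation tool.

```python
import re

# Assumption: a purely syntactic check; it cannot confirm that an address exists.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_nonsense(text):
    """Crude heuristic: one long token whose letters are almost all consonants."""
    text = text.strip()
    if " " in text:
        return False
    letters = [c for c in text.lower() if c.isalpha()]
    if len(letters) < 8:
        return False
    vowels = sum(c in "aeiou" for c in letters)
    return vowels / len(letters) < 0.2


def core_red_flags(r):
    """Return the names of any core red flags raised by one response."""
    flags = []
    if looks_like_nonsense(r["free_text"]):
        flags.append("nonsense_text")
    if r["email"] and not EMAIL_SHAPE.match(r["email"]):
        flags.append("invalid_email")
    # Contradiction from the text: a multiple birth but only one child in total.
    if r["multiple_birth"] and r["n_children"] < 2:
        flags.append("contradiction")
    return flags


random_text = {"free_text": "xkcdvbnmqwrt", "email": "someone@example.com",
               "multiple_birth": False, "n_children": 1}
contradictory = {"free_text": "He settles better in his cot", "email": "not-an-email",
                 "multiple_birth": True, "n_children": 1}
plausible = {"free_text": "She sleeps in a cot", "email": "parent@example.com",
             "multiple_birth": False, "n_children": 2}
for r in (random_text, contradictory, plausible):
    print(core_red_flags(r))
```

Any heuristic of this kind will misclassify some genuine free text, so flagged entries would still need to be read by a person before exclusion.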

Amber flags

Individually, these flags were not enough for exclusion, but 2+ suggested higher suspicion. We derived them by analysing patterns in the 100 red-flagged responses. Table 2 demonstrates the number of responses highlighted as suspicious with the number of red and amber flags present.

1. Unusually long completion time (>15 mins). No responses were under 2 minutes. Descriptive statistics for completion times are shown in Table 3.
2. Emails with 5+ consecutive numbers (e.g., someone865798@gmail.com)
3. Suspicious email usernames (e.g., long strings of letters with no numbers/symbols from gmail.com)
4. Survey completed between midnight and 6 am
5. Completed within 20 seconds of a red-flagged submission
6. Inconsistent answers: reasons for co-sleeping away from home
7. Inconsistent answers: implausible clothing combinations (e.g., baby sleeping in a hat only)
8. Inconsistent answers: discrepancies between “ever” and “last night” sleep surfaces
9. Inconsistent answers: reported co-sleeping not matched to selected sleep locations AND submitted within 20 seconds of another suspicious response
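The amber flags that depend only on metadata (flags 1, 2, 4 and 5 in the list above) can be counted with a few lines of code. The sketch below assumes hypothetical field names (`email`, `minutes`, `submitted`) and applies the stated thresholds: more than 15 minutes, five or more consecutive digits in the email username, submission between midnight and 6 am, and submission within 20 seconds of a red-flagged response. The answer-consistency flags are omitted because they depend on the survey's own questions.

```python
import re
from datetime import datetime, timedelta

FIVE_DIGITS = re.compile(r"\d{5,}")  # five or more consecutive digits


def amber_flags(r, red_times):
    """Return the metadata-based amber flags raised by one response."""
    flags = []
    if r["minutes"] > 15:
        flags.append("long_completion")
    username = (r["email"] or "").split("@")[0]
    if FIVE_DIGITS.search(username):
        flags.append("digits_in_email")
    if r["submitted"].hour < 6:  # midnight up to, but not including, 6 am
        flags.append("night_submission")
    if any(abs(r["submitted"] - t) <= timedelta(seconds=20) for t in red_times):
        flags.append("near_red_flag")
    return flags


def is_questionable(r, red_times):
    """Two or more amber flags mark a response as questionable."""
    return len(amber_flags(r, red_times)) >= 2


red_times = [datetime(2022, 7, 1, 2, 30, 0)]  # timestamps of red-flagged responses
suspect = {"email": "someone865798@gmail.com", "minutes": 9,
           "submitted": datetime(2022, 7, 1, 2, 30, 12)}
night_owl = {"email": "parent@example.com", "minutes": 8,
             "submitted": datetime(2022, 7, 1, 3, 0, 0)}
print(amber_flags(suspect, red_times), is_questionable(suspect, red_times))
print(amber_flags(night_owl, red_times), is_questionable(night_owl, red_times))
```

The second example shows why a single amber flag is not enough for exclusion: a parent completing the survey at 3 am raises only the night-time flag and is kept.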

Table 2. Number of responses highlighted as suspicious with each amber flag cut off.

Number of amber flags	Number of responses identified (%)
No amber flags	2523/3171 (79.6%)
1 amber flag	557/3171 (17.6%)
2 amber flags	81/3171 (2.6%)
3 amber flags	28/3171 (0.9%)
4 amber flags	13/3171 (0.4%)
5 amber flags	7/3171 (0.2%)

Note: Denominator was 3171 as 238 responses were removed in stages 1 and 2

Table 3. Descriptive statistics for survey response times, n = 3409.

Mean	SD	Minimum	25% (1st quartile)	50% (2nd quartile)	75% (3rd quartile)	Maximum
10 mins	24 mins	2 mins	5 mins	7 mins	9 mins	1146 mins

In total, we identified 230/3272 (7.0%) of the survey submissions as suspicious, 101 with any of the core red flags, 129 with two or more amber flags.

Step 4 – Adding back in validated responses

We contacted 223/230 participants flagged as suspicious but with valid emails, offering the chance to confirm eligibility. Of these, 21 (9%) were verified and reinstated. Seven could not be contacted (no email), leaving 209 permanently excluded and a final dataset of 3,100 responses for analysis.
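The flow of responses through steps 1 to 4 can be checked with simple arithmetic. The sketch below uses only counts reported in the text; the variable names are illustrative. The final figure of 3,100 is consistent with the 37 verified responses remaining in the analysis dataset.

```python
# Counts as reported in the text; variable names are illustrative.
total = 3409
verified = 34 + 3                             # interviewees + verified prize winners
unverified = total - verified                 # responses still to be checked
duplicates_removed = 100
screened = unverified - duplicates_removed    # denominator for the flag assessment
flagged = 230
contacted = 223
reinstated = 21
excluded = flagged - reinstated
final_dataset = total - duplicates_removed - excluded


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


print(unverified, screened, excluded, final_dataset)
print(pct(flagged, screened), pct(reinstated, contacted))
print(pct(excluded, total), pct(excluded, screened))
```

The last line shows that the two exclusion percentages quoted in the article (6.1% and 6.4%) differ only in the denominator used: all 3,409 responses or the 3,272 that were screened for flags.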

Step 5 – Analysis of included and excluded responses

We compared the ‘kept’ and ‘removed’ datasets to assess the impact of fraudulent submissions. Table 4 shows differences across three groups: responses included in the final analysis, those excluded for having two or more amber flags, and those excluded for any core red flag. As responses became more suspicious, answer distributions became more uniform—suggesting random selection.

Table 4. Included and excluded responses across key survey variables.

Variable	Category	Clean (N = 3100)	2+ Amber Flags (N = 109)	Red Flags (N = 100)
Maternal Age
	25 or over	2784/3100 (89.8%)	60/109 (55.1%)	53/100 (53.0%)
	21–24 years	252/3100 (8.1%)	45/109 (41.3%)	23/100 (23.0%)
	Under 21 years	64/3100 (2.1%)	4/109 (3.7%)	24/100 (24.0%)
NICU admission
	No	2780/3094 (89.9%)	80/109 (73.4%)	50/100 (50.0%)
	Yes	314/3094 (10.2%)	29/109 (26.6%)	50/100 (50.0%)
Multiple birth
	No, just one baby	3047/3099 (98.3%)	95/109 (87.2%)	53/99 (53.5%)
	Yes, twin	51/3099 (1.6%)	14/109 (12.8%)	39/99 (39.4%)
	Yes, triplet	1/3099 (0.03%)	0/109 (0.0%)	7/99 (7.1%)
Social work involved
	No	3038/3098 (98.1%)	88/108 (81.5%)	61/100 (61.0%)
	Yes	60/3098 (1.9%)	20/108 (18.5%)	39/100 (39.0%)
Usual night position
	Back	2707/3093 (87.5%)	43/107 (40.2%)	50/99 (50.5%)
	Side	288/3093 (9.3%)	41/107 (38.3%)	18/99 (18.2%)
	Front	98/3093 (3.2%)	23/107 (21.5%)	31/99 (31.3%)
Change in routine
	No	2645/3087 (85.7%)	77/103 (74.8%)	63/98 (64.3%)
	Yes	442/3087 (14.3%)	26/103 (25.2%)	35/98 (35.7%)

For example, among responses excluded for core red flags, 50% reported neonatal unit admission, compared to 10–15% in population estimates and among included responses. Excluded responses also more often listed someone other than the mother as the respondent and reported multiple births, with 46.5% of core red flag responses selecting twin or triplet births. A similar pattern appeared for social work involvement: 39% of red-flagged responses reported this (39/100 in Table 4), compared to an expected national rate of ~3%. Differences also emerged in sleep environment questions, with excluded responses more frequently reporting non-supine infant sleep positions. Reports of changes to the infant’s usual sleep routine “last night” were also more common in excluded groups, again suggesting more random or inconsistent answers.
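One way to put a number on how uniform an answer distribution is uses normalised Shannon entropy, which is 0 when every response falls in one category and 1 when all categories are equally likely. This measure is an illustrative choice and not part of the authors' analysis; the counts are taken from Table 4.

```python
import math


def normalised_entropy(counts):
    """Shannon entropy of a count vector divided by its maximum, log(k)."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


# Counts from Table 4, in the order clean, 2+ amber flags, red flags.
nicu = {"clean": [2780, 314], "amber": [80, 29], "red": [50, 50]}
position = {"clean": [2707, 288, 98], "amber": [43, 41, 23], "red": [50, 18, 31]}

for name, table in (("NICU admission", nicu), ("Usual night position", position)):
    scores = {group: round(normalised_entropy(c), 2) for group, c in table.items()}
    print(name, scores)
```

For both variables the clean responses score well below either excluded group, and the red-flag NICU answers reach the maximum because they split exactly 50/50.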

Discussion

We have presented a description of the strategies used to identify and exclude suspicious submissions to an online survey. We applied the REAL framework to develop a strategy for identification of potentially fraudulent responses that could be objective, transparent and replicable (see Table 1). Overall, 209/3272 responses (6.4%) were identified as suspicious and removed from the dataset. Reported proportions of survey fraud in online surveys have ranged from 100%¹² to 11%.⁶ Pozzar and colleagues¹² recruited survey respondents to an ovarian cancer communication study through social media (Facebook and Twitter) and labelled 94.5% of the responses as fraudulent and 5.5% of responses as suspicious. Quach and colleagues¹³ recruited parents for focus groups about adding the influenza vaccine to school-based immunisation programmes using social media, deal forum websites, online classified ads, mass media and email lists, and determined that 36% of their responses were genuine after data cleaning; 43% of the responses were flagged as multiple submissions, and the remainder did not meet their screening criteria. Bowen and colleagues,¹⁴ in their survey of men who had sex with men (MSM), identified that 33% of their sample made multiple submissions.

The methods that we used to identify fraudulent responses in this paper reflected those used by other researchers: screening email addresses,⁶,⁷,¹⁴ screening free text responses,⁷ and assessing time and speed of survey completion.³,⁶,⁷,¹⁴ Goodrich and colleagues (2023) used a high/low priority method of flagging suspicious submissions, similar to the red flag system outlined in this paper.¹⁵

As the level of survey fraud was unexpected, we revisited the reflect step before identifying fraudulent responses. We reviewed our dissemination methods and pinpointed vulnerabilities, noting spikes in suspicious submissions following shares by large organisations on social media. We took a cautious approach, prioritising a smaller, more reliable sample over a larger, riskier one. To recover genuine responses, we contacted excluded participants by email. While we couldn’t guarantee all replies were authentic, responses to verification emails allowed us to rule out alias fraud in those cases.

Reasons for engaging in survey fraud

Suspicious submissions in this survey fell into two categories: unique participant fraud and alias fraud. About 3% of responses were legitimate or accidental duplicates, clearly identified by email address and removed in step 2. These were often submitted months apart, likely by individuals who forgot they had already responded.

Most suspicious entries were alias fraud, likely automated or bot-generated, as they were submitted within seconds of each other and took an unusually long time to complete. Notably, 10% of those flagged did not opt into the prize draw, suggesting the incentive was not always a motivation for fraudulent responses.

Advice for other researchers

Survey fraud is rarely discussed in academic research, but deserves more attention. Online surveys already require cautious interpretation due to limited respondent information, and fraudulent responses can make analysis misleading. Those who conduct online surveys may find our processes helpful when considering interrogation of their own datasets.

1) Recognising the potential for fraud
Fraud should be considered before survey launch, with plans for identifying suspicious responses. Incorporating frameworks like REAL into protocols can help anticipate risks.⁴ Even non-incentivised surveys may be targeted by individuals aiming to disrupt for disruption’s sake, especially when shared publicly on social media. Alternative approaches, such as limiting open access or adding a verification step, may reduce risk.
2) Taking preventative measures
Survey platforms should support tools like CAPTCHA and ballot-box stuffing prevention (e.g., cookies to block repeat entries), though bots may bypass these.¹⁶ Additional techniques include invisible “honeypot” questions, mandatory free-text responses, and logic checks (e.g., comparing age and birthdate). When feasible, personalised single-use survey links and pre-screening can also reduce fraud.
3) Having a fraud analysis plan
Even with safeguards, fraudulent responses may still occur. A pre-specified fraud analysis plan should be in place before data analysis. This may involve obtaining ethical approval for IP access and including fraud checks in protocols from the outset.⁷ It is essential to have a procedure for reviewing suspicious entries and conducting sensitivity analyses to assess data validity.
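Two of the preventative techniques listed above, a hidden “honeypot” question and a logic check comparing age with birthdate, can be sketched as follows. The field names (`honeypot`, `baby_birthdate`, `baby_age_months`) and the one-month tolerance are assumptions made for illustration; a real survey would use its own question identifiers and thresholds.

```python
from datetime import date


def age_in_months(birthdate, on):
    """Whole months elapsed between birthdate and the completion date."""
    months = (on.year - birthdate.year) * 12 + (on.month - birthdate.month)
    if on.day < birthdate.day:
        months -= 1  # the current month is not yet complete
    return months


def passes_checks(r, completed_on, tolerance_months=1):
    """Return False if the honeypot is filled or age and birthdate disagree."""
    if r.get("honeypot"):
        # Hidden field: invisible to people, so only automated entries fill it.
        return False
    derived = age_in_months(r["baby_birthdate"], completed_on)
    return abs(derived - r["baby_age_months"]) <= tolerance_months


completed_on = date(2022, 7, 15)
consistent = {"honeypot": "", "baby_birthdate": date(2022, 3, 10), "baby_age_months": 4}
mismatch = {"honeypot": "", "baby_birthdate": date(2022, 3, 10), "baby_age_months": 9}
bot_like = {"honeypot": "abc", "baby_birthdate": date(2022, 3, 10), "baby_age_months": 4}
for r in (consistent, mismatch, bot_like):
    print(passes_checks(r, completed_on))
```

A small tolerance avoids penalising parents who round their baby's age, at the cost of letting through near-miss entries.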

Conclusion

Following on from our experiences with survey fraud, we have been able to push for further education and clarity for researchers planning to undertake research using online methods. We have been able to communicate the scope of the issue with our institution leading to the development of specific guidance on the prevention, identification and removal of fake responses to survey data and its inclusion in research training. The online survey platform agreed to provide more specific advice about survey distribution and the risk of automated or fraudulent responses in their help pages for users to consider prior to launching surveys.

The reality of online survey research is that survey fraud is inevitable,⁷ therefore it is vitally important that research standards are maintained and that strategies to ensure that research data are genuine are embedded into research protocols. The increasing use of online surveys and the ever-increasing threat of survey fraud has the potential to significantly impact research, resulting in real world consequences. We hope to uphold the integrity of our dataset by being transparent about our process of identifying potentially suspicious submissions and that through our openness we can guide and support other researchers who face the same predicament.

Ethics approval and consent to participate

A favourable ethical opinion was given by the University of Bristol’s Faculty of Health Science Research Ethics Committee (FREC), Ref: 10331 on the 16th May 2022. All participants provided written informed consent prior to taking part in the study. The study was conducted in accordance with the University of Bristol’s Ethics of Research Policy and Procedure.

Availability of data and materials

Data and supplementary study materials are available at the University of Bristol data repository, data.bris, at https://doi.org/10.5523/bris.39jmqqyvnippe1yt2rhoolkdrr. This is a restricted access dataset. Reviewers needing to access the data for peer review should complete the data access request form at https://bit.ly/data-bris-request. While you will need to identify yourself as part of the process, your identity will not be visible to the authors of the work.

Acknowledgements

Thank you to colleagues at Bristol University who provided support and guidance whilst we were navigating how to deal with suspicious submissions. We are grateful to all the caregivers who completed the survey, to Bristol, North Somerset, South Gloucestershire Integrated Care Board, the Lullaby Trust, Bliss, and the Clinical Research Network for supporting us with recruitment. We are also extremely grateful to our Baby Sleep Project Family Advisors who piloted the survey and advised us on wording: https://babysleepresearch.co.uk/family-involvement/

References

1. Van Selm M, Jankowski NW: Conducting online surveys. Qual. Quant. 2006; 40: 435–456.
2. Martinez O, Wu E, Shultz AZ, et al.: Still a hard-to-reach population? Using social media to recruit Latino gay couples for an HIV intervention adaptation study. J. Med. Internet Res. 2014; 16(4): e3311.
3. Kobayashi Y, Boudreault P, Hill K, et al.: Using a social marketing framework to evaluate recruitment of a prospective study of genetic counseling and testing for the deaf community. BMC Med. Res. Methodol. 2013; 13: 1–13.
4. Lawlor J, Thomas C, Guhin AT, et al.: Suspicious and fraudulent online survey participation: Introducing the REAL framework. Methodological Innovations. 2021; 14(3): 20597991211050467.
5. Credé M: Random responding as a threat to the validity of effect size estimates in correlational research. Educ. Psychol. Meas. 2010; 70(4): 596–612.
6. Konstan JA, Simon Rosser B, Ross MW, et al.: The story of subject naught: A cautionary but optimistic tale of Internet survey research. J. Comput.-Mediat. Commun. 2005; 10(2).
7. Storozuk A, Ashley M, Delage V, et al.: Got bots? Practical recommendations to protect online survey data from bot attacks. The Quantitative Methods for Psychology. 2020; 16(5): 472–481.
8. Sharma P, McPhail SM, Kularatna S, et al.: Navigating the challenges of imposter participants in online qualitative research: lessons learned from a paediatric health services study. BMC Health Serv. Res. 2024; 24(1): 724.
9. Lupton D, Pedersen S, Thomas GM: Parenting and digital media: from the early web to contemporary digital society. Sociol. Compass. 2016; 10(8): 730–743.
10. Krumpal I: Determinants of social desirability bias in sensitive surveys: a literature review. Qual. Quant. 2013; 47(4): 2025–2047.
11. Pease A: The Baby Sleep Project Website 2024.
12. Pozzar R, Hammer MJ, Underhill-Blazey M, et al.: Threats of bots and other bad actors to data quality following research participant recruitment through social media: Cross-sectional questionnaire. J. Med. Internet Res. 2020; 22(10): e23021.
13. Quach S, Pereira JA, Russell ML, et al.: The good, bad, and ugly of online recruitment of parents for health-related focus groups: lessons learned. J. Med. Internet Res. 2013; 15(11): e250.
14. Bowen AM, Daniel CM, Williams ML, et al.: Identifying multiple submissions in Internet research: preserving data integrity. AIDS Behav. 2008; 12: 964–973.
15. Goodrich B, Fenton M, Penn J, et al.: Battling bots: Experiences and strategies to mitigate fraudulent responses in online surveys. Appl. Econ. Perspect. Policy. 2023; 45(2): 762–784.
16. Griffin M, Martino RJ, LoSchiavo C, et al.: Ensuring survey research data integrity in the era of internet bots. Qual. Quant. 2021; 1–12.
17. Teitcher JE, Bockting WO, Bauermeister JA, et al.: Detecting, preventing, and responding to “fraudsters” in internet research: ethics and tradeoffs. J. Law Med. Ethics. 2015; 43(1): 116–133.
18. Gorman JR, Roberts SC, Dominick SA, et al.: A diversified recruitment approach incorporating social media leads to research participation among young adult-aged female cancer survivors. J. Adolesc. Young Adult Oncol. 2014; 3(2): 59–65.

Competing interests

No competing interests were disclosed.

Grant information

Dr Anna Pease, Research Fellow, NIHR300820, is funded by the NIHR for this research project. The views expressed in this publication are those of the author(s) and not necessarily those of the NIHR, NHS or the UK Department of Health and Social Care.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

How to cite this article

Keegan AA, Lambert B, Ingram J et al. Detecting and responding to fake responses to an online infant care survey in the UK [version 1; peer review: awaiting peer review]. NIHR Open Res 2026, 6:52 (https://doi.org/10.3310/nihropenres.14281.1)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
