Abstract
Objectives: To examine the accuracy and impact of artificial intelligence (AI) software assistance in lung cancer screening using computed tomography (CT).
Methods: A systematic review of CE-marked, AI-based software for automated detection and analysis of nodules in CT lung cancer screening was conducted. Multiple databases including Medline, Embase and Cochrane CENTRAL were searched from 2012 to March 2023. Primary research reporting test accuracy or impact on reading time or clinical management was included. QUADAS-2 and QUADAS-C were used to assess risk of bias. We undertook narrative synthesis.
Results: Eleven studies evaluating six different AI-based programs and reporting on 19,770 patients were eligible. All were at high risk of bias with multiple applicability concerns. Compared with unaided reading, AI-assisted reading was faster and generally improved sensitivity (+5% to +20% for detecting/categorising actionable nodules; +3% to +15% for detecting/categorising malignant nodules), with lower specificity (-7% to -3% for detecting/categorising actionable nodules; -8% to -2% for detecting/categorising malignant nodules). AI assistance tended to increase the proportion of nodules allocated to higher risk categories. Assuming 0.5% cancer prevalence, these results would translate into an additional 150 to 750 cancers detected per million participants but lead to an additional 59,700 to 79,600 participants without cancer receiving unnecessary CT surveillance.
Conclusions: AI assistance in lung cancer screening may improve sensitivity but increases the number of false positive results and unnecessary surveillance. Future research needs to increase the specificity of AI-assisted reading and minimise risk of bias and applicability concerns through improved study design.
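The per-million figures in the Results can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes the sensitivity gains are absolute percentage points applied to the 5,000 cancers expected per million participants at 0.5% prevalence. The 6 and 8 percentage point specificity losses are back-calculated from the stated totals of 59,700 and 79,600 and are not given in the abstract itself; the function names are hypothetical.

```python
# Illustrative arithmetic only; not the authors' analysis code.
PARTICIPANTS = 1_000_000
PREVALENCE = 0.005  # 0.5% cancer prevalence, as assumed in the abstract


def extra_cancers_detected(sensitivity_gain: float) -> int:
    """Additional cancers found when sensitivity rises by an absolute amount."""
    cancers = PARTICIPANTS * PREVALENCE  # 5,000 participants with cancer
    return round(cancers * sensitivity_gain)


def extra_false_positives(specificity_loss: float) -> int:
    """Additional cancer-free participants flagged when specificity falls."""
    non_cancers = PARTICIPANTS * (1 - PREVALENCE)  # 995,000 without cancer
    return round(non_cancers * specificity_loss)


# Sensitivity gains of +3% to +15% for malignant nodules (from the abstract)
low_cancers = extra_cancers_detected(0.03)
high_cancers = extra_cancers_detected(0.15)

# ASSUMPTION: 6 and 8 percentage point losses, inferred from the stated totals
low_fp = extra_false_positives(0.06)
high_fp = extra_false_positives(0.08)

print(low_cancers, high_cancers, low_fp, high_fp)
```

Under these assumptions, each additional cancer detected comes with roughly 100 to 400 additional participants without cancer sent for surveillance, depending on which ends of the two ranges are paired.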
| Original language | English |
|---|---|
| Pages (from-to) | 1040-1049 |
| Number of pages | 10 |
| Journal | Thorax |
| Volume | 79 |
| Issue number | 11 |
| Publication status | Published - 25 Sept 2024 |
Bibliographical note
This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made.
Funding
This review was funded by the UK National Institute for Health and Care Research (NIHR) Evidence Synthesis Programme (NIHR135325). The funder had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication. DT and AG are partly funded by the NIHR West Midlands Applied Research Collaboration. STP is funded by the NIHR through a research professorship (NIHR302434). AG is supported by a NIHR Fellowship (NIHR300060).
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 3 Good Health and Well-being
Fingerprint
Dive into the research topics of 'Software using artificial intelligence for nodule and cancer detection in CT lung cancer screening: systematic review of test accuracy studies'. Together they form a unique fingerprint.