A newly developed machine learning-based model is able to accurately predict the diagnoses of myeloid malignancies, according to research published in Blood Advances.
“The differential diagnosis of myeloid malignancies is challenging and subject to interobserver variability,” the authors wrote in their report.
Machine-learning algorithms learn from relationships, patterns, and trends in data, and modern methods allow researchers to extract the most relevant features, which can then be used to evaluate a given model’s prediction performance when applied in clinical settings.
For the study, the researchers used clinical and next-generation sequencing (NGS) data to develop a machine-learning model for the diagnosis of myelodysplastic syndrome (MDS) and other myeloid malignancies, independent of bone marrow biopsy data.
Study patients were required to have had peripheral complete blood count and differential, bone marrow examination, and NGS data at the time of diagnosis. Diagnoses were made by teams experienced in the diagnosis of the disorders and confirmed with histopathologic examination of bone marrow specimens. NGS analyses included targeted sequencing of 38 genes demonstrated to be clinically relevant to MDS and other myeloid malignancies. Model performance was evaluated with area under the receiver operating characteristic curve (AUROC).
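To make the AUROC metric concrete, here is a minimal sketch in plain Python. It computes AUROC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as half. The function name `auroc` and the toy labels and scores are hypothetical illustrations and are not drawn from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a positive case, 0 for a negative case.
    scores: model output, where higher means "more likely positive".
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three positive and three negative cases.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
print(round(toy_auroc, 3))
```

An AUROC of 1.0 means every positive case outranks every negative case, while 0.5 corresponds to random ranking. In the toy data, 8 of the 9 positive-negative pairs are ranked correctly.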
Data were collected from a total of 2697 patients treated for MDS and other myeloid malignancies between 2004 and 2018 at 3 clinics (United States, n=652; Germany, n=1509; and Italy, n=538). The training, test, and validation cohorts were composed of patient data from the US, Italy, and Germany, respectively.
Patient diagnoses were 60.4% MDS, 14.8% chronic myelomonocytic leukemia, 5.3% idiopathic cytopenia of undetermined significance, 3.4% clonal cytopenia of undetermined significance, 4.8% MDS/myeloproliferative neoplasm, 1.5% polycythemia vera, 1.9% essential thrombocythemia, and 3.5% primary myelofibrosis. The most commonly mutated genes in MDS were SF3B1 (26.5%), TET2 (25.3%), and ASXL1 (19.3%).
The investigators characterized numerous associations between NGS findings and clinically important phenotypes. For example, in MDS, ASXL1 mutations were associated with mutations in SRSF2, CBL, and STAG2, which are also involved in epigenetic modification, and SF3B1 mutations were associated with normal karyotype and bone marrow blasts <10%.
The final model included 15 genomic/clinical variables and was able to accurately diagnose MDS and other myeloid malignancies, with an AUROC of 0.95 (95% CI, 0.93-0.97) for the test/training cohort and 0.93 (95% CI, 0.92-0.94) for the validation cohort. The factors used by the model were similar to those used by clinicians, with the number of mutations, percentage of blasts in peripheral blood, absolute monocyte count, JAK2 status, and hemoglobin levels as the top 5.
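An AUROC reported with a 95% CI can be read as a point estimate plus a range reflecting sampling uncertainty. The article does not state how the study's intervals were computed. The sketch below uses a percentile bootstrap, which is one common approach and is an assumption here rather than the authors' method. The simulated labels and scores, and the names `auroc` and `bootstrap_ci`, are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUROC (assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_resamples:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined when a resample has only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(round((alpha / 2) * n_resamples))
    hi_idx = int(round((1 - alpha / 2) * n_resamples)) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical simulated data: positives tend to score higher than negatives.
data_rng = random.Random(42)
labels = [1] * 40 + [0] * 40
scores = [data_rng.gauss(1.5, 1.0) for _ in range(40)] + [
    data_rng.gauss(0.0, 1.0) for _ in range(40)
]

point = auroc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUROC {point:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

Larger cohorts generally yield narrower intervals, which is consistent with the tight intervals reported for cohorts of several hundred to more than a thousand patients.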
Limitations of the study included an inability to capture rarer mutations and the fact that the model reflects only the diagnostic standards and disease subtype definitions in use at the time.
“In summary, we describe [a machine learning]-based approach to the diagnosis of myeloid malignancies absent the data typically obtained from a bone marrow biopsy,” concluded the authors. “Our model’s findings and predictions are consistent with known hallmarks of these diseases and demonstrate the potential utility of [machine learning]-based approaches as an additional tool in the upfront evaluation of these diseases.”
Disclosure: One study author declared affiliations with biotech, pharmaceutical, and/or device companies. Please see the original reference for a full list of authors’ disclosures.
Radakovich N, Meggendorfer M, Malcovati L, et al. A geno-clinical decision model for the diagnosis of myelodysplastic syndromes. Blood Adv. 2021;5(21):4361-4369. doi:10.1182/bloodadvances.2021004755