Building the Model.

Title	Building the Model.
Publication Type	Journal Article
Year of Publication	2023
Authors	Yang HS, Rhoads DD, Sepulveda J, Zang C, Chadburn A, Wang F
Journal	Arch Pathol Lab Med
Volume	147
Issue	7
Pagination	826-836
Date Published	2023 Jul 01
ISSN	1543-2165
Keywords	Computer Simulation, Humans, Machine Learning
Abstract	CONTEXT: Machine learning (ML) allows for the analysis of massive quantities of high-dimensional clinical laboratory data, thereby revealing complex patterns and trends. Thus, ML can potentially improve the efficiency of clinical data interpretation and the practice of laboratory medicine. However, the risks of generating biased or unrepresentative models, which can lead to misleading clinical conclusions or overestimation of the model performance, should be recognized. OBJECTIVES: To discuss the major components for creating ML models, including data collection, data preprocessing, model development, and model evaluation. We also highlight many of the challenges and pitfalls in developing ML models, which could result in misleading clinical impressions or inaccurate model performance, and provide suggestions and guidance on how to circumvent these challenges. DATA SOURCES: The references for this review were identified through searches of the PubMed database, US Food and Drug Administration white papers and guidelines, conference abstracts, and online preprints. CONCLUSIONS: With the growing interest in developing and implementing ML models in clinical practice, laboratorians and clinicians need to be educated in order to collect sufficiently large and high-quality data, properly report the data set characteristics, and combine data from multiple institutions with proper normalization. They will also need to assess the reasons for missing values, determine the inclusion or exclusion of outliers, and evaluate the completeness of a data set. In addition, they require the necessary knowledge to select a suitable ML model for a specific clinical question and accurately evaluate the performance of the ML model, based on objective criteria. Domain-specific knowledge is critical in the entire workflow of developing ML models.
DOI	10.5858/arpa.2021-0635-RA
Alternate Journal	Arch Pathol Lab Med
PubMed ID	36223208
PubMed Central ID	PMC10344421
Grant List	R01 MH124740 / MH / NIMH NIH HHS / United States; RF1 AG072449 / AG / NIA NIH HHS / United States

Related Faculty:

He Sarina Yang, M.D., Ph.D.; Amy Chadburn, M.D.
