TY - JOUR
T1 - A Comprehensive Natural Language Processing Pipeline for the Chronic Lupus Disease
AU - Lilli, Livia
AU - Bosello, Silvia Laura
AU - Antenucci, Laura
AU - Patarnello, Stefano
AU - Ortolan, Augusta
AU - Lenkowicz, Jacopo
AU - Gorini, Marco
AU - Castellino, Gabriella
AU - Cesario, Alfredo
AU - D'Agostino, Maria Antonietta
AU - Masciocchi, Carlotta
PY - 2024
Y1 - 2024
N2 - Electronic Health Records (EHRs) contain a wealth of unstructured patient data, making it challenging for physicians to make informed decisions. In this paper, we introduce a Natural Language Processing (NLP) approach for extracting therapies, diagnoses, and symptoms from ambulatory EHRs of patients with chronic lupus disease. We present a comprehensive pipeline in which a rule-based system is combined with text segmentation, transformer-based topic analysis, and a clinical ontology in order to enhance text preprocessing and automate rule identification. Our approach is applied to a sub-cohort of 56 patients, with a total of 750 EHRs written in Italian, achieving an accuracy above 97% and an F-score above 90% in all three extracted domains. This work has the potential to be integrated with EHR systems to automate information extraction, minimizing human intervention and providing personalized digital solutions in the chronic lupus disease domain.
KW - Artificial Intelligence (AI)
KW - Electronic Health Record (EHR)
KW - Information Extraction (IE)
KW - Natural Language Processing (NLP)
KW - Systemic Lupus Erythematosus (SLE)
UR - https://publicatt.unicatt.it/handle/10807/298479
U2 - 10.3233/shti240559
DO - 10.3233/shti240559
M3 - Article
SN - 0926-9630
VL - 316
SP - 909
EP - 913
JO - Studies in Health Technology and Informatics
JF - Studies in Health Technology and Informatics
IS - aug
ER -