Construction of a medical corpus based on information extraction results


Title: Construction of a medical corpus based on information extraction results
Authors: Marciniak, M.
Mykowiecka, A.
Links: https://bibliotekanauki.pl/articles/206379.pdf
Publication date: 2011
Publisher: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects: corpus
semantic annotation
clinical data
information extraction
Source: Control and Cybernetics; 2011, 40, 2; 337-360
ISSN 0324-8569
Language: English
Rights: All rights reserved. User freedom is limited to the statutory scope of permitted use
Content provider: Biblioteka Nauki
Type: Article

The paper presents a method for the automatic construction of a semantically annotated corpus using the results of a rule-based information extraction (IE) application. The corpus is built by running existing programs for text tokenization and morphological analysis and combining their output with domain-related correction rules. We reuse the specialized IE system to obtain a corpus annotated on the semantic level. The texts included in the corpus are Polish free-text clinical data. We present the documents (discharge records of diabetic patients), the structure of the corpus annotation, and the methods used to obtain the annotations. Initial evaluations, based on manual verification of a selected subset of the data, are also presented. The corpus, once manually corrected, is intended for developing supervised machine learning models for IE applications.
