- Title:
- Construction of a medical corpus based on information extraction results
- Authors:
- Marciniak, M.; Mykowiecka, A.
- Links:
- https://bibliotekanauki.pl/articles/206379.pdf
- Publication date:
- 2011
- Publisher:
- Polska Akademia Nauk. Instytut Badań Systemowych PAN
- Topics:
- corpus; semantic annotation; clinical data; information extraction
- Description:
- The paper presents a method for the automatic construction of a semantically annotated corpus using the results of a rule-based information extraction (IE) application. The corpus is built by running existing programs for text tokenization and morphological analysis and combining their results with domain-related correction rules. We reuse the specialized IE system to obtain a corpus annotated on the semantic level. The texts included in the corpus are Polish free-text clinical data. We present the documents (diabetic patients' discharge records), the structure of the corpus annotation, and the methods for obtaining the annotations. Initial evaluations based on manual verification of a selected data subset are also presented. The corpus, once manually corrected, is designed to be used for developing supervised machine learning models for IE applications.
- Source:
- Control and Cybernetics; 2011, 40, 2; 337-360; ISSN 0324-8569
- Appears in:
- Control and Cybernetics
- Content provider:
- Biblioteka Nauki