This article addresses the integration and disambiguation of data in the maritime domain. It describes a system that collects data about several kinds of maritime entities (vessels, vessel types, ports, companies, etc.) from different internet sources, merges them, and loads them into a single database. This process is not trivial, and several challenges must be addressed for it to succeed. First, different sources may refer to the same entity in different ways, for example by using different text strings. Second, some references may be ambiguous, i.e. a single reference may point to more than one entity. To enable efficient analysis of data from different sources, such ambiguities must be resolved automatically in a preprocessing step, before the data are uploaded to the database and used in further computations. The aim of the disambiguation process is to assign an artificial, unique identifier to each entity and then, where possible, to attach that identifier automatically to every data item related to the entity. The article discusses the methods developed for resolving such ambiguities and presents their evaluation.
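For illustration only, the identifier-assignment step described above might be sketched as follows. This is a minimal sketch and not the system's actual method: the `EntityRegistry` class, the `normalize` function, and the rule of exact matching after normalization are all assumptions introduced here, and the vessel names are made-up examples.

```python
import itertools
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


class EntityRegistry:
    """Hypothetical registry that maps text references to artificial entity identifiers."""

    def __init__(self):
        self._ids = itertools.count(1)    # source of artificial, unique identifiers
        self._by_key = defaultdict(set)   # normalized name -> set of entity identifiers

    def register(self, *names: str) -> int:
        """Create a new entity known under the given names and return its identifier."""
        entity_id = next(self._ids)
        for name in names:
            self._by_key[normalize(name)].add(entity_id)
        return entity_id

    def resolve(self, reference: str):
        """Return the identifier if the reference is unambiguous, otherwise None."""
        candidates = self._by_key.get(normalize(reference), set())
        if len(candidates) == 1:
            return next(iter(candidates))
        return None  # unknown or ambiguous: leave the data item unassigned


registry = EntityRegistry()
baltic = registry.register("Baltic Star", "BALTIC-STAR")
first_rose = registry.register("Sea Rose")
second_rose = registry.register("Sea Rose", "Sea Rose II")

print(registry.resolve("baltic  star") == baltic)   # same entity, different spelling
print(registry.resolve("SEA ROSE"))                 # shared by two entities, so unresolved
```

In this sketch a reference shared by two entities is deliberately left unresolved rather than guessed, which mirrors the "if possible" qualification in the text. A real system would presumably use further attributes from the sources to break such ties.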