Subject: documents similarity - OPAC catalog of collections

Title:: Finding similar documents in web search results
Identyfikowanie dokumentów podobnych w wynikach wyszukiwania w sieci WWW
Authors:: Kużelewska, U.
Links:: https://bibliotekanauki.pl/articles/341131.pdf
Publication date:: 2012
Publisher:: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects:: web search results clustering
documents similarity
snippets clustering
Description:: Searching the Web is a challenging task. Zamir and Etzioni describe the Internet as an “unorganized, unstructured and decentralized place”. Although powerful search engines are available, using them becomes harder over time, because the number of indexed web pages exceeds 1 trillion [20] and is still growing. Most search engines return a list of documents from their databases, sorted by relevance to the search query. This approach is not ideal, because the returned list is very long and may contain documents unrelated to the query. To make searching more efficient, one can identify groups of similar documents in the result list. Traditional clustering algorithms are one tool for doing so. The article presents the clustering of Web search results taken directly from a search engine, as well as of document sets built from the results of several different queries. The documents were grouped using the EM and XMeans methods.
Source:: Zeszyty Naukowe Politechniki Białostockiej. Informatyka; 2012, 9; 61-76
1644-0331
Appears in:: Zeszyty Naukowe Politechniki Białostockiej. Informatyka
Content provider:: Biblioteka Nauki

Article


Title:: Statystyczne właściwości tekstów prawnych i ich wykorzystanie w systemach wyszukiwania informacji prawnej [Statistical properties of legal texts and their use in legal information retrieval systems]
Authors:: Petzel, Jacek
Links:: https://bibliotekanauki.pl/articles/1632423.pdf
Publication date:: 2021-02-17
Publisher:: Uniwersytet Warszawski. Wydawnictwa Uniwersytetu Warszawskiego
Subjects:: legal informatics
statistical properties of legal texts
measures of correlation
similarity of documents in legal databases
ERASMUS program
Description:: The article is devoted to the use of statistical properties of legal texts in searching for legal information. It presents methods that enlarge the set of retrieved documents by including those semantically close to the ones initially found. The ability to search for such collections of documents better satisfies the needs of system users. The article presents the theoretical foundations of enlargement operations based on specific procedures performed in the so-called semantic space of terms and semantic space of documents. It deals in particular with methods that determine the set of similar documents from the statistical properties of documents. The research carried out under the ERASMUS program by K. van Noortwijk and R. V. de Mulder is presented in detail. The article critically analyzes the measures used in that research and the reasons why the proposed methods did not lead to fully satisfactory results.
Source:: Studia Iuridica; 2020, 83; 187-197
0137-4346
Appears in:: Studia Iuridica
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "documents similarity" by criterion: Subject
