Subject: frequent sequences - OPAC collection catalog

Title:: A search of significant phrases for building topic models in text documents
Authors:: Ożdżyński, P.
Zakrzewska, D.
Links:: https://bibliotekanauki.pl/articles/94925.pdf
Publication date:: 2016
Publisher:: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Wydawnictwo Szkoły Głównej Gospodarstwa Wiejskiego w Warszawie
Subjects:: topic model
frequent sequences
LDA
Description:: The huge number of documents in digitized libraries requires efficient methods for exploring the information they contain. "Topic modeling" is considered one of the most effective among them. In contrast to commonly used approaches based on occurrences of single words, the paper considers building topic models based on phrases. We propose a methodology that creates a set of significant word sequences and thus limits the search area to phrases which contain them. The methodology is evaluated in experiments performed on real text datasets. The results are compared with those obtained using the LDA algorithm.
Source:: Information Systems in Management; 2016, 5, 2; 205-214
2084-5537
2544-1728
Appears in:: Information Systems in Management
Content provider:: Biblioteka Nauki

Article

Title:: Using frequent pattern mining algorithms in text analysis
Authors:: Ożdżyński, P.
Zakrzewska, D.
Links:: https://bibliotekanauki.pl/articles/95011.pdf
Publication date:: 2017
Publisher:: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Wydawnictwo Szkoły Głównej Gospodarstwa Wiejskiego w Warszawie
Subjects:: GSP
SuffixArray
PrefixSpan
N-Gram
frequent sequences
Description:: In text mining, the effectiveness of methods depends on document representations. Representations based on frequent word sequences are used in tasks such as categorization, clustering and topic modelling. The paper presents a comparison of different algorithms for finding frequent word sequences. The techniques considered include GSP and PrefixSpan, which were designed for market basket analysis, as well as a method based on a suffix array. The investigated techniques are compared with a new approach to searching for maximum frequent word sequences in document sets. The performance of the algorithms is examined in terms of execution times on the considered test collections.
Source:: Information Systems in Management; 2017, 6, 3; 213-222
2084-5537
2544-1728
Appears in:: Information Systems in Management
Content provider:: Biblioteka Nauki

Article
