- Tytuł:
- Compact and hash based variants of the suffix array
- Autorzy:
-
Grabowski, S.
Raniszewski, M. - Powiązania:
- https://bibliotekanauki.pl/articles/202163.pdf
- Data publikacji:
- 2017
- Wydawca:
- Polska Akademia Nauk. Czytelnia Czasopism PAN
- Tematy:
-
string matching
full-text indexing
suffix array
compact indexes
hashing
mieszanie
przeszukiwanie pełnotekstowe
algorytm przyrostowy
indeksy - Opis:
- Full-text indexing aims at building a data structure over a given text capable of efficiently finding arbitrary text patterns, and possibly requiring little space. We propose two suffix array inspired full-text indexes. One, called SA-hash, augments the suffix array with a hash table to speed up pattern searches due to significantly narrowed search interval before the binary search phase. The other, called FBCSA, is a compact data structure, similar to Mäkinen’s compact suffix array (MakCSA), but working on fixed size blocks. Experiments on the widely used Pizza & Chili datasets show that SA-hash is about 2–3 times faster in pattern searches (counts) than the standard suffix array, for the price of requiring 0.2n–1.1n bytes of extra space, where n is the text length. FBCSA, in one of the presented variants, reduces the suffix array size by a factor of about 1.5–2, while it gets close in search times, winning in speed with its competitors known from the literature, MakCSA and LCSA.
- Źródło:
-
Bulletin of the Polish Academy of Sciences. Technical Sciences; 2017, 65, 4; 407-418
0239-7528 - Pojawia się w:
- Bulletin of the Polish Academy of Sciences. Technical Sciences
- Dostawca treści:
- Biblioteka Nauki