Search results for the phrase "n-grams" by criterion: Subject


Showing 1-3 of 3
Title:
Detecting approximately duplicate bibliographic records with text algorithms: experience of creating a union catalogue of libraries at the Warsaw University of Technology
Authors:
Płoszajski, G.
Links:
https://bibliotekanauki.pl/articles/1954635.pdf
Publication date:
2003
Publisher:
Politechnika Gdańska
Subjects:
duplicate record resolution
n-grams
text algorithms
Description:
The paper describes a fault-tolerant method for identifying duplicate bibliographic records in catalogues. The method is based on text algorithms; it suggests candidate duplicates to librarians, who make the final decision. The method was applied to four library catalogues at the Warsaw University of Technology, which were compared with the catalogue of the main library. The process of merging catalogues was conducted differently for non-duplicate and duplicate records. Thanks to this method, a significant portion of the records in the catalogues of the joining libraries were found to be duplicates before the catalogues were added. The algorithms proved helpful in ensuring high-quality information.
Source:
TASK Quarterly. Scientific Bulletin of Academic Computer Centre in Gdansk; 2003, 7, 2; 294-297
1428-6394
Appears in:
TASK Quarterly. Scientific Bulletin of Academic Computer Centre in Gdansk
Content provider:
Biblioteka Nauki
Article
Title:
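The description above mentions text algorithms that suggest likely duplicate records for a librarian to confirm. The abstract does not say which algorithm the paper uses, so the following is only an illustrative sketch of one common approach: comparing titles by the Jaccard similarity of their character trigrams. The function names and the `threshold` value of 0.7 are assumptions made for this example and do not come from the paper.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a normalised string."""
    # Lowercase and collapse whitespace so trivial formatting differences are ignored.
    normalised = " ".join(text.lower().split())
    if len(normalised) < n:
        # Strings shorter than n yield themselves as a single gram (or nothing if empty).
        return {normalised} if normalised else set()
    return {normalised[i:i + n] for i in range(len(normalised) - n + 1)}


def jaccard_similarity(a, b, n=3):
    """Jaccard similarity of the character n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty strings are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def suggest_duplicates(incoming, catalogue, threshold=0.7):
    """List (incoming, existing, score) pairs at or above a hypothetical threshold.

    The pairs are suggestions only; a person is expected to review them.
    """
    suggestions = []
    for new_title in incoming:
        for existing in catalogue:
            score = jaccard_similarity(new_title, existing)
            if score >= threshold:
                suggestions.append((new_title, existing, score))
    return suggestions
```

Because the comparison works on overlapping character sequences rather than whole words, a title with a small typing error still shares most of its trigrams with the correct title, which is what makes this kind of matching tolerant of faults in the records.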
Refining the Methodology for Investigating the Relationship Between Fluency and the Use of Formulaic Language in Learner Speech
Authors:
Guz, Ewa
Links:
https://bibliotekanauki.pl/articles/620701.pdf
Publication date:
2016-06-01
Publisher:
Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Subjects:
learner speech
formulaic sequences
phrasemes
n-grams
temporal fluency
speed fluency
breakdown fluency
Description:
This study is a cross-sectional analysis of the relationship between productive fluency and the use of formulaic sequences in the speech of highly proficient L2 learners. Two samples of learner speech were randomly drawn and analysed. Formulaic sequences were identified on the basis of two distinct procedures: a frequency-based, distributional approach, which returned a set of recurrent sequences (n-grams), and an intuition- and criterion-based linguistic procedure, which returned a set of phrasemes. Formulaic material was then removed from the data. Breakdown and speed fluency measures were obtained for the following types of speech: baseline (pre-removal), formulaic, and non-formulaic (post-removal). The results show significant differences between baseline and post-removal fluency scores for both learners. Also, formulaic speech is produced more fluently than non-formulaic speech. However, the comparison of the fluency scores of n-grams and phrasemes returned inconsistent results, with significant differences reported for only one of the samples.
Source:
Research in Language; 2016, 14, 2; 95-122
1731-7533
Appears in:
Research in Language
Content provider:
Biblioteka Nauki
Article
Title:
The role of word and n-gram frequency analysis in inference of the content of scientific publication
Authors:
Zdonek, Iwona
Links:
https://bibliotekanauki.pl/articles/1931609.pdf
Publication date:
2020
Publisher:
Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects:
text mining
R
n-grams
scientific publication analysis
eksploracja tekstu
n-gram
analiza publikacji naukowych
Description:
Purpose: The paper presents an analysis of a scientific publication with regard to the frequency of words and n-grams. The research problem addressed was the extent to which text mining analysis of a scientific publication allows its content to be inferred. Design/methodology/approach: The main research method is the analysis of tokenized text using word count functions, bigrams, and trigrams in selected sections of a scientific publication. The results of the text mining analysis were compared with a classic, non-automated text analysis of the publication. The presented study is a pilot project in the form of a case study. Findings: The proposed method of analyzing a scientific text by the frequency of words and n-grams enables inference of the content of the paper with regard to the names of the variables involved in the study, the statistical apparatus used, and the key literature cited. It should be observed, however, that the discussed method does not make it possible to establish which variables are moderators and which are mediators. Originality/value: In this study, the text mining technique was used differently than in previous works. The publication was not examined in its entirety, as previous researchers did, but text mining analysis was applied to individual parts of the paper, i.e. the part discussing the theoretical foundations of the research and the part presenting the research method, research results, and their discussion. This allowed for obtaining more precise results regarding the content of the publication.
Source:
Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2020, 142; 21-31
1641-3466
Appears in:
Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider:
Biblioteka Nauki
Article
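The description above refers to counting words, bigrams, and trigrams in tokenized text. The record lists R among its subjects, and the paper's actual code is not shown here, so the following Python sketch is only an assumed illustration of the general technique. The tokenizer, its ASCII-only pattern, and the function names are choices made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, keeping hyphenated words together.

    Assumption: only ASCII letters and digits count as word characters,
    so accented letters would be treated as separators.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (n=1 words, n=2 bigrams, n=3 trigrams)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    sample = "Text mining of a publication: text mining results and n-grams."
    tokens = tokenize(sample)
    for n in (1, 2, 3):
        # Show the three most frequent n-grams of each size.
        print(n, ngram_counts(tokens, n).most_common(3))
```

Running the same counts separately on each section of a paper, rather than on the whole text, would mirror the section-by-section analysis that the description says distinguishes this study.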