Subject: text corpus - OPAC collections catalog

Item 1.

Title:: O projekcie Korpusu Polszczyzny do 1500 roku
Authors:: Deptuchowa, Ewa
Jasińska, Katarzyna
Klapper, Magdalena
Kołodziej, Dorota
Links:: https://bibliotekanauki.pl/articles/1630446.pdf
Publication date:: 2020-10-30
Publisher:: Towarzystwo Kultury Języka
Subjects:: electronic text corpus
Old Polish
Mediaeval Polish
inflectional description
Description:: This paper presents the assumptions of the Corpus of Polish until 1500, which is being developed as part of the project titled Baza leksykalna średniowiecznej polszczyzny (do 1500 roku). Fleksja (Lexical Database of Medieval Polish (until 1500). Inflection). It introduces the fundamental objectives of the project, namely preparing an inflectional description of all (inflected) words from the time until 1500 and building a morphosyntactically annotated collection of texts from the same period. Afterwards, the authors discuss the existing digital collections of Old Polish texts. In the main part of the paper, they present the criteria for selecting sources for the Corpus under creation and the methods of processing them, which refer to the solutions developed in the Electronic Corpus of Polish Texts from the 17th and 18th centuries (until 1772). The major modifications, aimed at adapting the structural and morphosyntactic annotations to the description of Mediaeval Polish, are discussed using selected examples.
Source:: Poradnik Językowy; 2020, 777, 8; 7-16
0551-5343
Appears in:: Poradnik Językowy
Content provider:: Biblioteka Nauki

Article

Item 2.

Title:: Elektroniczny Korpus Tekstów Polskich z XVII i XVIII w. – problemy teoretyczne i warsztatowe
Authors:: Gruszczyński, Włodzimierz
Adamiec, Dorota
Bronikowska, Renata
Wieczorek, Aleksandra
Links:: https://bibliotekanauki.pl/articles/1630441.pdf
Publication date:: 2020
Publisher:: Towarzystwo Kultury Języka
Subjects:: electronic text corpus
historical corpus
17th-18th-century Polish
natural language processing
Description:: This paper presents the Electronic Corpus of 17th- and 18th-century Polish Texts (KorBa), a large (13.5-million), annotated historical corpus available online. Its creation was modelled on the assumptions of the National Corpus of Polish (NKJP), yet the specific nature of the historical material required certain modifications of the solutions applied in NKJP, e.g. two forms of text representation (transliteration and transcription) were introduced, the principle of designating foreign-language fragments was adopted, and the tagset was adapted to the description of the grammatical structure of the Middle Polish language. The texts collected in KorBa are diversified in chronological, geographical, stylistic, and thematic terms although, due to e.g. limited access to the material, the postulate of representativeness and sustainability of the corpus was not fully implemented. The work on the corpus was to a large extent automated as a result of using natural language processing tools.
Source:: Poradnik Językowy; 2020, 777, 8; 32-51
0551-5343
Appears in:: Poradnik Językowy
Content provider:: Biblioteka Nauki

Article

Item 3.

Title:: Analiza fleksyjna tekstów historycznych i zmienność fleksji polskiej z perspektywy danych korpusowych
Authors:: Woliński, Marcin
Kieraś, Witold
Links:: https://bibliotekanauki.pl/articles/1630443.pdf
Publication date:: 2020-10-30
Publisher:: Towarzystwo Kultury Języka
Subjects:: electronic text corpus
natural language processing
inflection of Polish
history of language
Description:: The subject matter of this paper is Chronofleks, a computer system (http://chronofleks.nlp.ipipan.waw.pl/) modelling Polish inflection based on corpus material. The system visualises changes of inflectional paradigms of individual lexemes over time and enables examination of the variability of the frequency of inflected form groups distinguished based on various criteria. Feeding Chronofleks with corpus data required the development of IT tools to ensure an inflectional processing sequence of texts analogous to the ones used for modern language; they comprise a transcriber, a morphological analyser, and a tagger. The work was performed on data from three historical periods (1601–1772, 1830–1918, and modern ones) elaborated in independent projects. Therefore, finding a common manner of describing data from the individual periods was a significant element of the work.
Source:: Poradnik Językowy; 2020, 777, 8; 66-80
0551-5343
Appears in:: Poradnik Językowy
Content provider:: Biblioteka Nauki

Article

Item 4.

Title:: Номинации социологических страт: коннотации и оценка
Socjologiczny aspekt nominacji — konotacje i wartościowanie.
Authors:: Фролова, Ольга
Links:: https://bibliotekanauki.pl/articles/1023752.pdf
Publication date:: 2018-11-26
Publisher:: Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:: lexical qualifiers
text corpus
verbal-nominative units
regular expressions
phrasemes
lexicon of soccer
Description:: The article examines the semantics of social status nominations in the Soviet and post-Soviet periods. The values of nouns of the nobility, peasants, serfs, merchants, courtiers, and intellectuals are analyzed. Using material from the National Russian Corpus, word combinations with these adjectives are considered, which allows us to reveal the connotations of social status. The adjective 'courtier' has formed the meaning 'close to the power and serving its interests', because it is freely combined with the names of professions whose representatives are deprived of an independent position: journalist, sociologist, director.
Source:: Studia Rossica Posnaniensia; 2018, 43; 63-76
0081-6884
Appears in:: Studia Rossica Posnaniensia
Content provider:: Biblioteka Nauki

Article

Item 5.

Title:: Zwroty czasownikowo-rzeczownikowe o tematyce futbolowej w słowniku rosyjsko-polskim/polsko-rosyjskim
Вербономинальные выражения на футбольную тематику в русско-польском/польско-русском словаре
Authors:: Fedorushkov, Yury
Links:: https://bibliotekanauki.pl/articles/1023756.pdf
Publication date:: 2018-11-26
Publisher:: Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:: text corpus
verbal-nominative units
regular expressions
phrasemes
lexicon of soccer
lexical qualifiers
Description:: A football lexicon contains a lot of unregistered phrasemes. The article is devoted to a method of entering verbal-nominative phrases from text corpora into a bilingual (Russian-Polish / Polish-Russian) pocket dictionary of soccer.
Source:: Studia Rossica Posnaniensia; 2018, 43; 45-62
0081-6884
Appears in:: Studia Rossica Posnaniensia
Content provider:: Biblioteka Nauki

Article

Item 6.

Title:: Tekstowy korpus a dalše informaciske srědki wo hornjoserbskej rěči w interneće
The Upper Sorbian text corpus and further sources of information with regard to Upper Sorbian in the Internet
Authors:: Wölkowa, Sonja
Links:: https://bibliotekanauki.pl/articles/678589.pdf
Publication date:: 2014-12-31
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: computational lexicography
corpus linguistics
digitalization
text corpus
Upper Sorbian language
digitalizacja
język górnołużycki
korpus językowy
leksykografia komputerowa
lingwistyka korpusowa
Description:: In the present era of globalisation and the omnipresence of the Internet, Sorbian linguistics faces new challenges along the lines of “What is not on the Internet does not exist”. The demand for digital sources of information on Upper and Lower Sorbian that are accessible online, as working tools and reference points for language practice and as a source for academic research, is increasing. As a result of this ongoing development, the Foundation for the Sorbian People established a workgroup called “Sorbian in the new media” at the end of 2012, which has identified the creation of an online German-Upper Sorbian dictionary as the major task in this field of activities. The focus of this article, however, is the Upper Sorbian text corpus HoTKo, which has been created by the Sorbian Institute and which has been made available in co-operation with the Institute of the Czech National Corpus at the Charles University in Prague. The article presents the history and development of the corpus, its extent and shape as well as its link to or incorporation into further planned digital projects of the Sorbian Institute with regard to the Upper Sorbian language.
Source:: Studia z Filologii Polskiej i Słowiańskiej; 2014, 49; 59-71
2392-2435
0081-7090
Appears in:: Studia z Filologii Polskiej i Słowiańskiej
Content provider:: Biblioteka Nauki

Article

Item 7.

Title:: La féminisation des noms de métiers – prétexte pour une réflexion sociale et culturelle alimentée par des données de corpus dans le cadre des études de philologie
The Feminisation of Job Titles: A Pretext for Social and Cultural Reflection Based on Corpus Data
Authors:: Dryjańska, Agnieszka
Links:: https://bibliotekanauki.pl/articles/27827525.pdf
Publication date:: 2023-12-16
Publisher:: Komisja Nauk Filologicznych Oddziału Polskiej Akademii Nauk we Wrocławiu
Subjects:: feminization
job titles
text corpus
teaching French as a foreign language
Féminisation des noms de métiers
corpus de textes
enseignement du français
FLE
Description:: In the theoretical part of this paper, we propose to draw up a double panorama clarifying, on the one hand, the main linguistic and social issues connected with the feminization of job titles in the French-Polish contrastive perspective, and on the other hand, the data-driven didactic approach favouring the development of lexical and general skills. Secondly, we examine the results of a project adopting this methodology in which the formation of the feminine forms of job titles was analysed by students through lexicographic sources and a text corpus. The results show that lexicographers are not always unanimous about feminine lexemes, and that the language usage reflected in the corpus of texts using contemporary language does not confirm the use of all forms proposed by prestigious institutions and dictionaries. Another advantage of this approach lies in the development of students' autonomy, their capacity for critical analysis and their heuristic skills, which are very important extralinguistic goals in university education.
Source:: Academic Journal of Modern Philology; 2023, 20; 37-50
2299-7164
2353-3218
Appears in:: Academic Journal of Modern Philology
Content provider:: Biblioteka Nauki

Article

Item 8.

Title:: Problemy i korzyści wynikające z automatycznego przetwarzania korpusów - na przykładzie badań z zakresu predykacji rzeczownikowej w języku polskim
Drawbacks and Advantages of the Computer Corpora Processing. Case Study of Nominal Predication in Polish
Désavantages et profits du traitement automatique des corpus à l’exemple des recherches sur la prédication nominale en polonais
Authors:: Vetulani, Grażyna
Links:: https://bibliotekanauki.pl/articles/1892144.pdf
Publication date:: 2013
Publisher:: Katolicki Uniwersytet Lubelski Jana Pawła II. Towarzystwo Naukowe KUL
Subjects:: corpus linguistics
text processing
nominal predication
Description:: This paper reports on our work on nominal predication in Polish, in which electronic corpora are explored with the help of text processing tools. Various aspects of and challenges related to the applied methodology are presented. Despite the problems encountered, it is nowadays practically impossible to imagine solutions that ignore the advantages of corpus linguistics. In fact, this methodology proved very efficient. In a relatively short time we developed an application-oriented dictionary of Polish predicative nouns, and we now continue to extend it within the same paradigm.
Source:: Roczniki Humanistyczne; 2013, 61, 8; 13-24
0035-7707
Appears in:: Roczniki Humanistyczne
Content provider:: Biblioteka Nauki

Article

Item 9.

Title:: USING CORPORA TO AID QUALITATIVE TEXT ANALYSIS
Authors:: Olejniczak, Jędrzej
Links:: https://bibliotekanauki.pl/articles/628684.pdf
Publication date:: 2018
Publisher:: Fundacja Pro Scientia Publica
Subjects:: Corpora
text analysis
wordlist
keyness
dispersion plot
corpus building
Description:: Aim. The aim of this paper is to present and exemplify a number of basic uses of corpus-based text analysis tools that can supplement and provide additional insight for an otherwise qualitative analysis of a text. I attempt to show that nowadays certain corpus tools are easily accessible to any researcher and can be used to enrich the results of studies concerned with texts. Methods. This paper comprises the basics of corpus building, the main types of data that can be drawn from a simple corpus and a detailed description of four methods that can aid text analysis: wordlists, concordances, dispersion plots and keywords. Each of those four methods is thoroughly described, with a number of examples of its applications and an indication of its possible limitations. Results. The examples provided suggest that even performing a very simple corpus analysis of a text might unveil certain trends and phenomena not noticeable through the classic qualitative text analysis methods (e.g. close reading). The paper argues that corpus research can hence work as an extension of a quantitative analysis (or be its starting point) by examining themes and keywords present in a given text and enrich the results of a qualitative study with a fresh perspective. Finally, the paper claims that basic corpus analysis can, in fact, be successfully employed by researchers who do not have any prior experience with statistics or corpora.
Source:: Journal of Education Culture and Society; 2018, 9, 2; 154-164
2081-1640
Appears in:: Journal of Education Culture and Society
Content provider:: Biblioteka Nauki

Article

Item 10.

Title:: Projektowanie metadanych w korpusie tekstów polskich do 1500 roku – wielopoziomowa struktura informacji
Authors:: Leńczuk, Mariusz
Links:: https://bibliotekanauki.pl/articles/1036244.pdf
Publication date:: 2020-11-20
Publisher:: Wydawnictwo Uniwersytetu Śląskiego
Subjects:: language corpus
metadata
text
glosses
13th–15th century
Description:: The subject of the research is a selection of metadata that should characterize the texts collected in the corpus of the oldest attestations of the Polish language. The author of the article compares and analyses the factors affecting the development of the basic data structure used in synchronic and diachronic corpora (author, title, date of the text, text channel, text classification, source of citation). Unless those factors are taken into account, the disambiguation of an object in the database becomes impossible, and the use of grammatical information is unreliable and impractical. The result of the presented analysis is a proposal to extend the level of description for individual markers.
Source:: Forum Lingwistyczne; 2020, 7; 59-69
2449-9587
2450-2758
Appears in:: Forum Lingwistyczne
Content provider:: Biblioteka Nauki

Article

Item 11.

Title:: Biblijny komponent tekstu cerkiewnosłowiańskiego (przykład staroserbskich zapisów)
Authors:: Lis-Wielgosz, Izabela
Links:: https://bibliotekanauki.pl/articles/677902.pdf
Publication date:: 2016
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: canonical corpus
biblical component
Old Church Slavonic text
Old Serbian inscriptions
Description:: Biblical component of the Old Church Slavonic text: the example of the Old Serbian inscriptions. The paper deals with the problem of the presence of biblical material in Old Church Slavonic texts. While this question has indeed already been broadly and repeatedly discussed by palaeo-Slavists, the developments in medieval studies and the achievements in textological and editorial fields have once again made it an open issue still demanding further elaboration and clarification. In the discourse of the discipline, the topical question concerns not so much frequency or participation as the modes of usage of the canonical corpus, that is its concrete and conscious application, whereby it becomes an intrinsic and recognizable component of the literary work. My deliberations are intentionally limited to the example of brief literary forms, namely the Old Serbian inscriptions, which have so far gone unnoticed or have even been marginalized, although they constitute highly artistic and autonomous forms of expression, mirroring the whole repertoire of ideas, motifs and constructions, as well as the stock of procedures and operational strategies characteristic of Old Church Slavonic literature.
Source:: Slavia Meridionalis; 2016, 16
1233-6173
2392-2400
Appears in:: Slavia Meridionalis
Content provider:: Biblioteka Nauki

Article

Item 12.

Title:: Expressional accentuations in quotes as applied in Czech scientific and theoretical texts
Authors:: Schacherl, Martin
Links:: https://bibliotekanauki.pl/articles/908846.pdf
Publication date:: 2018-09-12
Publisher:: Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:: style
theoretical and scientific text
the corpus of Czech written scientific monological communication
activation
differentiation
Description:: The paper explores the style of a selection of contemporary Czech scientific texts representing contemporary Czech scientific discourse, which comprises professional monological monographs written in Czech in the last decade. Through exploring a representative selection of texts, our research endeavoured to verify the occurrence or non-occurrence of linguistic concern which is deliberately used in contemporary Czech scientific discourse to interrupt (1) the emotional and stylistic neutrality and (2) the relative comprehensiveness, lucidity and clarity of the professional expression. Linguistic concern and stylistic activation in contemporary professional texts are in most cases signalled graphically, by quotation marks. The authors thus indicate to the recipient the occurrence of a stylistic marker, or more generally a departure from stylistic norms. Formal separation of the device from the text is an important dialogical means connecting the author and the recipient of scientific discourse. Most often it features simplifying terminology; (non)exact expression; complementarity; the vacillating or transient terminology in the field. The authors use quotation marks to signal graphically not only means that are contradictory to scientific style, such as professionalisms; expressions meaning ‘so to speak’; imprecise, ambiguous and vague expressions (which in humanities and social sciences occur in the foundation text); but also means of expression conveying subjectivity and expressivity; sporadically metaphors. Humanities and social sciences even feature graphically signalled authorial detachment from the content; hyperbole; or possibly irony. The stylistic activity of the quoted means of expression consists in their manifest contextual stylistic value. The manifest activity of diverse linguistic means separated by quotation marks represents the most forceful way to activate the neutral presentation of present-day Czech professional texts. Their frequency and stylistic activity in all excerpted monographs in the defined groups of fields confirm the authorial ambition to achieve a more original diction of professional discourse. The graphical distinction from the foundation text proves the continuous respect for the stylistic norms of theoretically professional communication.
Source:: Bohemistyka; 2017, 4; 303-316
1642-9893
Appears in:: Bohemistyka
Content provider:: Biblioteka Nauki

Article

Item 13.

Title:: The Interpretation of Quantitative and Corpus Analysis of Thematic Fields in First Collections of Short Stories by Jan Čep
Authors:: Změlík, Richard
Links:: https://bibliotekanauki.pl/articles/908835.pdf
Publication date:: 2018-09-11
Publisher:: Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:: Jan Čep
thematic concentration of a text
quantitative and corpus linguistics and theory of literature
interpretation of quantitative research in literary studies
Description:: The present paper is a direct continuation of the first part, Potencionality of Quantitative and Corpus Analysis to Literary Studies – Toward Methodology (The Analysis of Themtic Fields), which was published in the third issue of this journal in 2016. In this study we explore in detail the possibilities of interpretation using quantitative and corpus methods and illustrate how specifically the results of this method can be used in literary interpretation.
Source:: Bohemistyka; 2017, 3; 221-240
1642-9893
Appears in:: Bohemistyka
Content provider:: Biblioteka Nauki

Article

Item 14.

Title:: Polifunkcyjność korpusu językowego (na przykładzie platformy Sketch Engine i Narodowego Korpusu Języka Rosyjskiego)
POLYFUNCTIONALITY OF LANGUAGE CORPUS (ON THE MATERIAL OF THE SKETCH ENGINE AND THE RUSSIAN NATIONAL CORPUS)
ПОЛИФУНКЦИОНАЛЬНОСТЬ ЯЗЫКОВОГО КОРПУСА (НА ПРИМЕРЕ ПРОГРАММЫ SKETCH ENGINE И НАЦИОНАЛЬНОГО КОРПУСА РУССКОГО ЯЗЫКА)
Authors:: Białek, Ewa Jadwiga
Links:: https://bibliotekanauki.pl/articles/2085222.pdf
Publication date:: 2022-03-14
Publisher:: Polskie Towarzystwo Rusycystyczne
Subjects:: языковой корпус
Sketch Engine
текст
коллокация
языковая картина мира
korpus językowy
tekst
kolokacja
językowy obraz świata
corpus
text
collocation
linguistic worldview
Description:: The aim of the article is to show the possibilities of using a language corpus in scientific research and foreign language teaching. The paper discusses the advantages of the Russian National Corpus and the corpus ruTenTen11 (Sketch Engine). The material presented in the corpora, which are a modern tool for language analysis, can be used to investigate language changes as well as the linguistic worldview. The corpora record the history of words and document the changes in their meanings. The author of the paper analyzes the lexeme and concept Демократия in Russian, the meanings of the new borrowing контент and the verb озвучить. In addition, the use of language corpora in lexicography is analyzed, and the author also compares the knowledge about the language and the culture that is contained in the corpora and in the associative dictionary. The corpora provide useful information about semantic features of words and their collocations. By studying the patterns of collocability it is possible to reconstruct the linguistic worldview. One of the conclusions is that linguistic data extracted from the corpora can supplement traditional dictionaries.
Source:: Przegląd Rusycystyczny; 2022, 1(177); 14-31
0137-298X
Appears in:: Przegląd Rusycystyczny
Content provider:: Biblioteka Nauki

Article

Item 15.

Title:: Machine translation with Javanese speech levels’ classification
Tłumaczenie maszynowe z klasyfikacją poziomów języka jawajskiego
Authors:: Nafalski, A.
Wibawa, A.P.
Links:: https://bibliotekanauki.pl/articles/408899.pdf
Publication date:: 2016
Publisher:: Politechnika Lubelska. Wydawnictwo Politechniki Lubelskiej
Subjects:: expert system
hybrid corpus-based machine translation
Javanese speech levels
text classifier
systemy ekspertowe
hybrydowe tłumaczenie maszynowe
korpus języka
poziomy języka jawajskiego
klasyfikator tekstu
Description:: A hybrid corpus-based machine translation system has been developed to produce translations at the proper Javanese speech level. The developed statistical memory-based machine translation shows significantly accurate results. Integration of an automatic text classifier and an expert system is proposed to help Javanese speakers classify the speech levels used for a specific interlocutor. A Javanese rule-based expert system is designed, while a naive Bayes classifier is selected after outperforming a simple logic probability approach. As a result, the average translation accuracy (72.3%) indicates that the integrated intelligent interfaces could effectively solve the pragmatic translation problems of the Javanese language.
Source:: Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska; 2016, 1; 21-24
2083-0157
2391-6761
Appears in:: Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska
Content provider:: Biblioteka Nauki

Article

Information

Searching for the phrase "text corpus" by criterion: Subject
