Subject: Polish language processing - OPAC Collections Catalog


Title:: Enhancing regular expressions for Polish text processing
Mechanizm rozszerzonych wyrażeń regularnych do przetwarzania tekstów języka polskiego
Authors:: Dorosz, K.
Szczerbińska, A.
Links:: https://bibliotekanauki.pl/articles/305579.pdf
Publication date:: 2009
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:: regular expressions
regex
natural language
Polish language
Polish language processing
CLP library
Description:: The paper proposes a regular expression engine, based on a modified Thompson’s algorithm, dedicated to Polish language processing. A Polish inflectional dictionary is used to enhance both the engine and the regex syntax. Instead of using characters as the basic elements of patterns (as in the BRE and ERE standards), the presented tool allows words of a natural language, or labels describing the grammatical properties of words, to be used in regex syntax.
Polish abstract (translated): The article presents a proposal for a regular expression mechanism based on a modified Thompson’s algorithm adapted to processing Polish texts. The presented regular expressions use an inflectional dictionary of Polish and allow patterns to be built whose basic elements are Polish words or grammatical labels rather than characters (as in classic regular expressions of the BRE or ERE standard).
Source:: Computer Science; 2009, 10; 19-35
1508-2806
2300-7036
Appears in:: Computer Science
Content provider:: Biblioteka Nauki
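
The idea summarized in the description above, a Thompson-style engine whose pattern elements are words or grammatical labels rather than characters, can be illustrated with a minimal sketch. This is not the authors' engine or the CLP library API: the toy `LEXICON`, the helper tests `lemma` and `label`, and the tuple-based pattern syntax are all assumptions made for illustration only.

```python
# Toy lexicon (hypothetical data): surface form -> (lemma, part-of-speech label).
# It stands in for a real inflectional dictionary.
LEXICON = {
    "dom": ("dom", "noun"), "domu": ("dom", "noun"),
    "stary": ("stary", "adj"), "starego": ("stary", "adj"),
    "stoi": ("stać", "verb"),
}

def lemma(base):
    """Token test: the word is any inflected form of `base`."""
    return lambda tok: LEXICON.get(tok, (None, None))[0] == base

def label(pos):
    """Token test: the word carries the grammatical label `pos`."""
    return lambda tok: LEXICON.get(tok, (None, None))[1] == pos

class State:
    def __init__(self, test=None):
        self.test = test   # token predicate, or None for an epsilon state
        self.out = []      # successor states

def compile_nfa(node):
    """Thompson construction over a pattern tree; returns (start, accept)."""
    kind = node[0]
    if kind == "tok":                      # consume one word passing the test
        start, accept = State(node[1]), State()
        start.out.append(accept)
        return start, accept
    if kind == "cat":                      # concatenation
        s1, a1 = compile_nfa(node[1])
        s2, a2 = compile_nfa(node[2])
        a1.out.append(s2)
        return s1, a2
    if kind == "alt":                      # alternation
        s1, a1 = compile_nfa(node[1])
        s2, a2 = compile_nfa(node[2])
        start, accept = State(), State()
        start.out += [s1, s2]
        a1.out.append(accept)
        a2.out.append(accept)
        return start, accept
    if kind == "star":                     # zero or more repetitions
        s1, a1 = compile_nfa(node[1])
        start, accept = State(), State()
        start.out += [s1, accept]
        a1.out += [s1, accept]
        return start, accept
    raise ValueError(f"unknown pattern node: {kind}")

def closure(states):
    """Add every state reachable through epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        if state.test is None:
            for nxt in state.out:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen

def matches(pattern, tokens):
    """Simulate the NFA on a list of words, tracking the set of live states."""
    start, accept = compile_nfa(pattern)
    current = closure({start})
    for tok in tokens:
        step = {nxt for st in current if st.test and st.test(tok) for nxt in st.out}
        current = closure(step)
    return accept in current

# Any number of adjectives, then a form of "dom", then a verb.
PATTERN = ("cat",
           ("cat", ("star", ("tok", label("adj"))), ("tok", lemma("dom"))),
           ("tok", label("verb")))

print(matches(PATTERN, "stary dom stoi".split()))  # True
print(matches(PATTERN, "stoi dom".split()))        # False
```

Because the simulation tracks a set of live states instead of backtracking, the running time grows linearly with the number of words for a fixed pattern, which is the usual reason for choosing Thompson's approach.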

Article


Title:: Dynamic verbs in the Wordnet of Polish
Authors:: Dziob, Agnieszka
Piasecki, Maciej
Links:: https://bibliotekanauki.pl/articles/677246.pdf
Publication date:: 2018
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: plWordNet
Wordnet of Polish
lexico-semantic relations
Polish language
dynamic verbs
verbs in wordnet
natural language processing
Description:: The paper presents patterns of co-occurrence of wordnet relations involving verb lexical units in plWordNet, a large wordnet of Polish. The discovered patterns reveal tendencies of selected synset and lexical relations to form regular circular structures with clear semantic meanings. They involve several types of relations, e.g., presupposition, cause, processuality and antonymy. The patterns are not obligatory (there are exceptions), but they can be used in wordnet diagnostics and in guidelines for wordnet editors. The analysis is illustrated with numerous positive and negative examples, as well as statistics for verb relations in plWordNet 4.0 emo. Some attempts at a more general, linguistic explanation of the observed phenomena are also made. As background, the linguistically motivated plWordNet model is briefly recalled, with special attention given to the verb part. In addition, the description of dynamic verbs by relations and features is discussed in detail, including relation definitions and substitution tests.
Polish abstract (translated), original title "Czasowniki dynamiczne w Słowosieci - wordnecie języka polskiego": The article presents patterns of co-occurrence of lexico-semantic relations involving verb lexical units within Słowosieć (plWordNet), a large relational dictionary of Polish, i.e. the wordnet of Polish. The background for the observations is Słowosieć 4.0 emo, for which the system of verb relations is briefly discussed together with statistics. The authors devote particular attention to dynamic verbs and their typical relations, for which they present substitution tests from the guidelines for the relational description of verbs, defined for linguists editing Słowosieć. The co-occurrence patterns described in the article show the tendency of some relations between synsets (i.e. sets of synonyms) and between lexical units (including presupposition, causation, processuality and antonymy) to form regular structures that specify the meaning of all units and synsets connected by those relations. Co-occurrence of relations according to the patterns is not obligatory, so the article presents both positive and negative examples of units and synsets linked by co-occurring relations, as well as some general remarks pointing to the linguistic character of the observed phenomenon. Besides its cognitive significance, related to the interdependencies within the language system, the description of these regularities also has practical value: it can be used in wordnet diagnostics and in guidelines for linguists.
Source:: Cognitive Studies; 2018, 18
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article


Title:: A Multi-Layer Transcription Model – concept outline
Authors:: Śledziński, Daniel
Links:: https://bibliotekanauki.pl/articles/2183702.pdf
Publication date:: 2022-12-31
Publisher:: Poznańskie Towarzystwo Przyjaciół Nauk
Subjects:: G2P
grapheme-to-phoneme conversion
Polish language
text processing
Description:: This paper discusses the assumptions of a Multi-Layer Transcription Model (hereinafter: MLTM). The solution presented is an advanced grapheme-to-phoneme (G2P) conversion method that can be implemented in technical applications, such as automatic speech recognition and synthesis systems. The features of MLTM also facilitate the application of text-to-transcription conversion in linguistic research. The model is the basis for multi-step processing of the orthographic representation of words, which are transcribed gradually. The consecutive stages of the procedure include, among other things, identification of multi-character phonemes, voicing status change, and consonant cluster simplification. The multi-layer model makes it possible to assign individual phonetic processes (for example, assimilation), as well as other types of transformation, to particular layers. As a result, the set of rules becomes more transparent. Moreover, the rules related to any process can be modified independently of the rules connected with other forms of transformation, provided that the latter have been assigned to a different layer. These properties give solutions based on the model crucial advantages, such as flexibility and transparency. The model makes no assumptions about the number of layers, their functions, or the number of rules defined in each layer. A special mechanism used to implement the MLTM concept enables projection of individual characters onto either a phonemic or a phonetic transcript (obtained after processing in the final layer of the MLTM-based system has been completed). The solution has been implemented for Polish; however, the same model could also be used for other languages.
Source:: Lingua Posnanensis; 2022, 64, 1; 49-71
0079-4740
2083-6090
Appears in:: Lingua Posnanensis
Content provider:: Biblioteka Nauki
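
The layered processing summarized in the description above, where each phonetic process lives in its own layer and the word is transcribed gradually, can be sketched as an ordered list of rewrite layers. This is only an illustration under stated assumptions, not the MLTM implementation: the `LAYERS` table, its two tiny rule sets, the simplified phone symbols and the `transcribe` function are all hypothetical.

```python
import re

# Hypothetical toy layers: each layer is a name plus an ordered list of
# (regex pattern, replacement) rules. Real rule sets would be far larger.
LAYERS = [
    ("grapheme mapping, including digraphs",
     [(r"ch", "x"), (r"sz", "ʃ"), (r"w", "v")]),
    ("word-final devoicing",
     [(r"b$", "p"), (r"d$", "t"), (r"g$", "k"), (r"v$", "f")]),
]

def transcribe(word, layers=LAYERS):
    """Return the form of the word after each layer, starting with the spelling."""
    forms = [word]
    for _name, rules in layers:
        current = forms[-1]
        for pattern, replacement in rules:
            current = re.sub(pattern, replacement, current)
        forms.append(current)
    return forms

print(transcribe("chleb"))  # ['chleb', 'xleb', 'xlep']
print(transcribe("szew"))   # ['szew', 'ʃev', 'ʃef']
```

Keeping the devoicing rules in their own layer means they can be edited without touching the grapheme mapping, which mirrors the independence between layers that the description emphasizes.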

Article


Title:: Language resources for named entity annotation in the National Corpus of Polish
Authors:: Savary, A.
Piskorski, J.
Links:: https://bibliotekanauki.pl/articles/206388.pdf
Publication date:: 2011
Publisher:: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:: natural language processing
proper names
named entities
corpus annotation
Polish National Corpus
SProUT
Description:: We present the named entity annotation subtask of a project aimed at creating the National Corpus of Polish. We summarize the annotation requirements defined for this corpus and discuss how existing lexical resources and grammars for named entity recognition in Polish have been adapted to meet those requirements. We show detailed results of the corpus annotation using the information extraction platform SProUT. We also analyze the errors made by our knowledge-based method and suggest further improvements.
Source:: Control and Cybernetics; 2011, 40, 2; 361-391
0324-8569
Appears in:: Control and Cybernetics
Content provider:: Biblioteka Nauki

Article


Title:: Geolocalization of 19th-century villages and cities mentioned in geographical dictionary of the kingdom of Poland
Authors:: Jaśkiewicz, G.
Links:: https://bibliotekanauki.pl/articles/305699.pdf
Publication date:: 2013
Publisher:: Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:: natural language processing
geolocalization
statistics
information extraction
Geographical Dictionary of Polish Kingdom and Other Slavic Countries
Description:: This article presents a method for roughly estimating the geographical coordinates of villages and cities that are described in the 19th-century geographical encyclopedia entitled “The Geographical Dictionary of the Polish Kingdom and Other Slavic Countries” [18]. It describes the algorithm for estimating location, the tools used to acquire and process the necessary information, and the context of this research.
Source:: Computer Science; 2013, 14 (3); 423-442
1508-2806
2300-7036
Appears in:: Computer Science
Content provider:: Biblioteka Nauki

Article


Information

Searching for the phrase "Polish language processing" by criterion: Subject
