Subject: parallel corpora - OPAC collection catalogue

Title:: Języki słowiańskie i litewski w korpusach równoległych Clarin-PL
Authors:: Koseska-Toszewa, Violetta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/678946.pdf
Publication date:: 2016
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: multilingual parallel corpora
semantic annotation
scope quantification
Description:: Slavic languages and the Lithuanian language in the Clarin-PL parallel corpora. The strategic scientific purpose of Clarin ERIC and Clarin-PL is to support humanities research in a multicultural and multilingual Europe. Polish researchers emphasize building a bridge between the Polish language and Polish language technologies on the one hand and other European languages and their language technologies on the other. So far, the Polish scientific community has mainly focused on Polish-English connections. Clarin-PL has therefore been developing the first and, so far, only multilingual corpora of the Polish language in conjunction with other Slavic languages and the Lithuanian language: the Polish-Bulgarian-Russian Parallel Corpus and the Polish-Lithuanian Parallel Corpus. The parallel corpora created by the ISS PAS Corpus Linguistics and Semantics Team break through the existing “canons” and give scientists access to interlinked multilingual language resources, in the first phase limited to the languages of the three Slavic groups and the Lithuanian language. In the article, the authors present very detailed information on their original system of semantic annotation of scope quantification in multilingual parallel corpora, hitherto unused in the subject literature. Because of the system's broad scope and originality, the semantic annotation is carried out manually. The identification of particular values of scope quantification in a sentence and the attempts to record them presented here are supported by long-term research conducted by an international team of linguists and mathematicians/computer scientists working on the quantification of names, time and aspect in natural languages.
Source:: Studia z Filologii Polskiej i Słowiańskiej; 2016, 51
2392-2435
0081-7090
Appears in:: Studia z Filologii Polskiej i Słowiańskiej
Content provider:: Biblioteka Nauki

Article

Title:: From the Problems of Dictionaries and Multi-lingual Corpora
Authors:: Koseska-Toszewa, Violetta
Satoła-Staśkowiak, Joanna
Sosnowski, Wojciech
Links:: https://bibliotekanauki.pl/articles/677234.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: bilingual dictionary
trilingual dictionary
online dictionary
traditional classifier
syntactic classifier
semantic classifier
bilingual parallel corpora
trilingual parallel corpora
linguistic quantification
“incomplete quantification”
Description:: The article describes the work on a number of dictionaries being developed by the Corpus Linguistics and Semantics Group of the Institute of Slavic Studies PAS. They include the “Contemporary Bulgarian-Polish Dictionary”, the “Bulgarian-Polish Online Dictionary” and the “Russian-Bulgarian-Polish Dictionary”. The dictionaries differ in the number of entries, as well as in the degree of their connection with the parallel corpora being developed under the “Clarin” project. All the discussed dictionaries are alike in combining traditional, syntactic classifiers with semantic classifiers, the latter introduced here for the first time in lexicographical practice. Thanks to the “Polish-Bulgarian-Russian Corpus”, the Group has been able to verify the results of contrasting Polish and Bulgarian in the light of scope-based logical quantification. Thanks to the Russian material added to the trilingual corpus, the researchers have been able to confirm that, from the viewpoint of “incomplete quantification”, Russian and Polish (synthetic languages) behave similarly and stand in contrast to the analytic Bulgarian.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: About Certain Semantic Annotation in Parallel Corpora
Authors:: Koseska-Toszewa, Violetta
Links:: https://bibliotekanauki.pl/articles/677255.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: direct approach to semantics
semantic annotation
perfective aspect
imperfective aspect
event
state
Petri nets
parallel corpora
contrastive linguistics
Description:: The semantic notation analyzed in this work belongs to the second stream of semantic theories presented here: the direct approach to semantics. We used this stream in our work on the Bulgarian-Polish Contrastive Grammar. Our semantic notation distinguishes quantificational meanings of names and predicates, and indicates aspectual and temporal meanings of verbs. It relies on logical scope-based quantification and on the contemporary theory of processes known as “Petri nets”. Thanks to it, we can distinguish precisely between a language form and its content. For example, a perfective verb form has two meanings: an event, or a sequence of events and states that ends with an event. An imperfective verb form also has two meanings: a state, or a sequence of states and events that ends with a state. In turn, names are quantified universally or existentially when they are “undefined”, and uniquely (using the iota operator) when they are “defined”. It is worth emphasizing that not only names but also the predicate can be quantified, in which case quantification concerns time and aspect. This is a novelty in elaborating sentence-level semantics in parallel corpora. For this reason, our semantic notation is applied manually. We hope that it will attract the interest of computer scientists working on automatic methods for processing the given natural languages. Semantic annotation as defined in this work will facilitate contrastive studies of natural languages, which in turn will verify the results of those studies and will facilitate human and machine translation.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Dialog between a Lexicographer and a Translator
Authors:: Kit, Mark
Koseska-Toszewa, Violetta
Links:: https://bibliotekanauki.pl/articles/677253.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: reversibility of dictionaries
lexical structures
multilingual dictionaries
perfective aspect
semantic tags
parallel corpora
corpus linguistics
computational linguistics
contrastive linguistics
Description:: The discussion between the authors of the paper concerns the most pressing issues encountered in natural language semantics, as well as in corpus linguistics and computational linguistics. These areas require a broad range of knowledge that allows linguists and information scientists to work together. The paper describes some primary problems of human and machine translation caused by gaps between different fields of knowledge. The authors suggest that an interdisciplinary approach is required for contrastive studies in linguistics.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Application of multilingual corpus in contrastive studies (on the example of the Bulgarian-Polish-Lithuanian parallel corpus)
Authors:: Dimitrova, Ludmila
Koseska-Toszewa, Violetta
Roszko, Danuta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/677184.pdf
Publication date:: 2010
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: multilingual electronic corpora
parallel and comparable corpora
corpus annotation
lexical databases
multilingual electronic dictionaries
Description:: In this paper we present applications of a trilingual corpus in language research. Comparative and contrastive studies of Polish and Bulgarian, as well as of Polish and Lithuanian, have already been conducted, but to the best of our knowledge no such studies exist for Bulgarian and Lithuanian. On the one hand, it is interesting to note that two Slavic languages are compared with a Baltic language (Lithuanian). On the other hand, the three languages are marginally present in the EU because of the later accession of the three countries to the EU. The paper briefly describes the first electronic Bulgarian–Polish–Lithuanian experimental corpus, currently under development for research purposes only. We also focus on the morphosyntactic annotation of the parallel trilingual corpus according to the Corpus Encoding Standard: we review the Part-of-Speech (POS) classification of the participle in the three languages (Bulgarian, Polish, and Lithuanian) in comparison with another POS, the adjective. With some examples, we briefly discuss tagsets for corpus annotation from the point of view of their possible future unification.
Source:: Cognitive Studies; 2010, 10
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Web presentation of bilingual corpora (Slovak-Bulgarian and Bulgarian-Polish)
Authors:: Garabík, Radovan
Dimitrova, Ludmila
Koseska-Toszewa, Violetta
Links:: https://bibliotekanauki.pl/articles/677063.pdf
Publication date:: 2011
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: Bulgarian
Polish
Slovak
digital language resources
parallel and aligned corpora
web presentation
Description:: In this paper we focus on the web presentation of bilingual corpora in three Slavic languages and on their possible applications. The Slovak-Bulgarian and Bulgarian-Polish corpora have been collected and developed as a result of collaboration within two joint research projects, coordinated by the authors of this paper, between the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, on the one hand, and the Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, and the Institute of Slavic Studies, Polish Academy of Sciences, on the other.
Source:: Cognitive Studies; 2011, 11
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Bulgarian-Polish parallel digital corpus and quantification of time
Authors:: Dimitrova, Ludmila
Koseska-Toszewa, Violetta
Links:: https://bibliotekanauki.pl/articles/677292.pdf
Publication date:: 2012
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: Bulgarian
Polish
digital language resources
parallel and aligned corpora
quantification of time
Description:: The paper presents the current state of the first Bulgarian-Polish parallel and aligned corpus, prepared within the framework of the joint research project “Semantics and Contrastive linguistics with a focus on a bilingual electronic dictionary” between the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, and the Institute of Slavic Studies, Polish Academy of Sciences, coordinated by L. Dimitrova and V. Koseska-Toszewa. Problems related to tense quantification are also discussed.
Source:: Cognitive Studies; 2012, 12
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "parallel corpora" by criterion: Subject
