Subject: annotation - OPAC collections catalog

Title:: Bulgarian sense-annotated corpus – between the tradition and novelty
Authors:: Koeva, Svetla
Links:: https://bibliotekanauki.pl/articles/677294.pdf
Publication date:: 2012
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: corpus studies
corpus annotation
annotation principles
Description:: The Bulgarian Sense-annotated Corpus (BulSemCor) is compiled according to the general methodology established by the SemCor project. It is a subset of the Brown Corpus of Bulgarian, semantically annotated with the corresponding synonym sets (synsets) in the Bulgarian wordnet. Unlike the bulk of sense-annotated corpora, where only (sets of) content words are annotated, in BulSemCor each lexical unit has been assigned a sense. The main contributions of the work on BulSemCor are briefly described in the paper: definition of an annotation schema, compilation of an input corpus, development of a sense-annotated corpus, and enlargement of the Bulgarian wordnet.
Source:: Cognitive Studies; 2012, 12
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Multi-level annotation of the specialized Corpus of Dialogs of Disabled Polish Speakers
Authors:: Trzebińska, Joanna
Bartoszewicz, Jakub
Links:: https://bibliotekanauki.pl/articles/677159.pdf
Publication date:: 2014
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: speech corpus
pragmatic annotation
semantic annotation
disability
Description:: While the Polish language is relatively well represented in general-purpose corpora such as the National Polish Language Corpus, some groups of speakers remain underrepresented in reference corpora. One such group is the community of disabled people. At the same time, there is a growing need to understand how disability influences social and cognitive abilities, language in particular. In this paper, we present a specialized Corpus of Dialogs of Disabled Speakers. We describe the compilation and transcription of the corpus and the annotation of pragmatic, semantic and morphosyntactic features, and we discuss applications of the corpus.
Source:: Cognitive Studies; 2014, 14
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Towards an event annotated corpus of Polish
Authors:: Marcińczuk, Michał
Oleksy, Marcin
Bernaś, Tomasz
Kocoń, Jan
Wolski, Michał
Links:: https://bibliotekanauki.pl/articles/677125.pdf
Publication date:: 2015
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: information extraction
event recognition
corpus annotation
Description:: The paper presents a typology of events built on the TimeML specification and adapted to the Polish language. Some changes were introduced to the definitions of the event categories, and a motivation for the event categorization was formulated. The event annotation task is presented on two levels: the ontology level (language-independent) and the level of text mentions (language-dependent). The various types of event mentions in Polish text are discussed. A procedure for annotating event mentions in Polish texts is presented and evaluated. In the evaluation, a randomly selected set of documents from the Corpus of Wrocław University of Technology (KPWr) was annotated by two linguists and the agreement between the annotators was calculated. The evaluation was carried out in two iterations. After the first, the annotation procedure was revised and improved. The second evaluation showed a significant improvement in the agreement between annotators. The current work focused on the annotation and categorisation of event mentions in text. Future work will focus on describing events with a set of attributes, arguments and relations.
Source:: Cognitive Studies; 2015, 15
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Języki słowiańskie i litewski w korpusach równoległych Clarin-PL
Slavic languages and the Lithuanian language in the Clarin-PL parallel corpora
Authors:: Koseska-Toszewa, Violetta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/678946.pdf
Publication date:: 2016
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: multilingual parallel corpora
semantic annotation
scope quantification
Description:: The strategic scientific purpose of Clarin ERIC and Clarin-PL is to support humanities research in a multicultural and multilingual Europe. Polish researchers emphasize building a bridge between the Polish language and Polish language technologies on the one hand and other European languages and their language technologies on the other. So far, the Polish scientific community has mainly focused on Polish-English connections. Clarin-PL has been developing the first and only multilingual corpora of the Polish language in conjunction with other Slavic languages and the Lithuanian language: the Polish-Bulgarian-Russian Parallel Corpus and the Polish-Lithuanian Parallel Corpus. The parallel corpora created by the ISS PAS Corpus Linguistics and Semantics Team break with the existing “canons” and give scientists access to interlinked multilingual language resources, in the first phase limited to the languages of the three Slavic groups and the Lithuanian language. In the article, the authors present very detailed information on their original system of semantic annotation of scope quantification in multilingual parallel corpora, hitherto unused in the subject literature. Because of the system's wide scope and originality, the semantic annotation is carried out manually. The identification of particular values of scope quantification in a sentence, and the attempts to record them presented here, are supported by long-term research conducted by an international team of linguists, computer scientists and mathematicians working on the quantification of names, time and aspect in natural languages.
Source:: Studia z Filologii Polskiej i Słowiańskiej; 2016, 51
2392-2435
0081-7090
Appears in:: Studia z Filologii Polskiej i Słowiańskiej
Content provider:: Biblioteka Nauki

Article

Title:: Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements
Authors:: Roszko, Danuta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/677259.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: corpora
parallel and comparable corpora
annotation
Polish
Lithuanian
Description:: In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT), created to support Polish-Lithuanian theoretical contrastive studies and a Polish-Lithuanian electronic dictionary, and to serve as an aid for sworn translators. The semantic annotation being introduced into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies and also proves helpful in translation work.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Abstrakt i adnotacja jako element opisu dokumentu w bazie iSybislaw
Abstract and annotation as an element of bibliographic description in the iSybislaw database
Authors:: Kowalski, Paweł
Links:: https://bibliotekanauki.pl/articles/965802.pdf
Publication date:: 2014-12-31
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: abstract
annotation
bibliographic description
database
information retrieval system
iSybislaw
summary
Description:: The abstract (sometimes called a summary) of a scientific publication is a brief text that contains keywords. It is one of the elements of a bibliographic description in the bibliographic database iSybislaw, a modern information retrieval system. The paper presents definitions of terms such as abstract, annotation and summary, along with their constitutive elements. The characteristics of such short texts entered in the iSybislaw database in the fields Abstract and Abstract 2 are also given. Based on examples excerpted from the iSybislaw system, a typology of short texts that are elements of the database's bibliographic description is proposed. The material makes it possible to distinguish three kinds of texts used in the iSybislaw database: annotations, abstracts and biographic annotations.
The article analyses short texts that form one of the elements of the bibliographic description in the iSybislaw retrieval system. Various terms are used in scholarly practice to refer to such texts (abstract, annotation, summary). The author gives their definitions and identifies their constitutive elements. Based on examples excerpted from the iSybislaw system, he presents their typology and discusses their place and functions in the bibliographic description.
Source:: Studia z Filologii Polskiej i Słowiańskiej; 2014, 49; 88-98
2392-2435
0081-7090
Appears in:: Studia z Filologii Polskiej i Słowiańskiej
Content provider:: Biblioteka Nauki

Article

Title:: Experimental Corpus of the Lithuanian Local Dialect of Punsk in Poland. Examples of the Lexical and Semantic Annotation
Authors:: Roszko, Danuta
Links:: https://bibliotekanauki.pl/articles/677261.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: corpora
annotation
Lithuanian local dialect of Punsk in Poland
experimental dialectal corpus
Description:: In the article the author describes the experimental corpus of the Lithuanian local dialect of Puńsk in Poland (ECorp-of-Punsk). It is the first corpus of this type for the Lithuanian local dialect. The corpus consists of three subcorpora. The first (referred to as fundamental) contains utterances given by Lithuanians in the local dialect, the second contains utterances given by Lithuanians in Polish, and the third contains aligned Polish-dialectal texts. Texts recorded between 1986 and 2012 have been included in the ECorp-of-Punsk resources.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Trilingual aligned corpus – current state and new applications
Authors:: Dimitrova, Ludmila
Koseska, Violetta
Roszko, Danuta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/967220.pdf
Publication date:: 2014
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: aligned trilingual corpus
digital resources
event
Petri net theory
semantic annotation
state
Description:: This article describes the current state of a trilingual parallel corpus consisting of texts in two Slavic languages (Bulgarian and Polish) and one Baltic language (Lithuanian). The corpus contains original literary texts (fiction, novels, and short stories) in one of the three languages with translations into the other two, as well as texts in other languages translated into Bulgarian, Polish, and Lithuanian. Some of the texts are aligned at the sentence level. The authors propose a semantic annotation of verbs appearing in these aligned texts that will facilitate contrastive studies of natural languages. The theoretical background of the proposed semantic annotation is also briefly discussed.
Source:: Cognitive Studies; 2014, 14
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Multilingual digital resources with Bulgarian language
Authors:: Dimitrova, Ludmila
Links:: https://bibliotekanauki.pl/articles/677179.pdf
Publication date:: 2010
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: corpora (parallel, comparable, aligned)
corpus annotation
digital dictionaries
lexical databases
morpho-syntactic specifications
Description:: The paper briefly presents Bulgarian language resources as part of the multilingual digital resources developed within several international projects, among them parallel annotated and aligned corpora, comparable corpora, morpho-syntactic specifications for corpus annotation and dictionary encoding, lexicons, lexical databases, and electronic dictionaries.
Source:: Cognitive Studies; 2010, 10
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Application of multilingual corpus in contrastive studies (on the example of the Bulgarian-Polish-Lithuanian parallel corpus)
Authors:: Dimitrova, Ludmila
Koseska-Toszewa, Violetta
Roszko, Danuta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/677184.pdf
Publication date:: 2010
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: multilingual electronic corpora
parallel and comparable corpora
corpus annotation
lexical databases
multilingual electronic dictionaries
Description:: In this paper we present applications of a trilingual corpus in language research. Comparative and contrastive studies of Polish and Bulgarian, as well as of Polish and Lithuanian, have already been conducted, but to the best of our knowledge no such studies exist for Bulgarian and Lithuanian. On the one hand, it is interesting to note that two Slavic languages are compared with a Baltic language (Lithuanian). On the other hand, the three languages are marginally present in the EU because of the later accession of the three countries to the EU. The paper briefly describes the first electronic Bulgarian–Polish–Lithuanian experimental corpus, currently under development for research purposes only. We also focus on the morphosyntactic annotation of the parallel trilingual corpus according to the Corpus Encoding Standard: we review the part-of-speech (POS) classification of the participle in the three languages (Bulgarian, Polish, and Lithuanian) in comparison with another POS, the adjective. We briefly discuss, with some examples, tagsets for corpus annotation from the point of view of possible future unification.
Source:: Cognitive Studies; 2010, 10
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Web-Application for the Presentation of Bilingual Corpora (Focusing on Bulgarian as One of the Two Paired Languages)
Authors:: Dimitrova, Ludmila
Dutsova, Ralitsa
Links:: https://bibliotekanauki.pl/articles/677223.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: parallel corpus
aligned corpus
concordance
linguistic annotation
lemmatization
POS-tagging
web-interface
web-application
Description:: This paper briefly presents a web application for the presentation of bilingual aligned corpora, focusing on Bulgarian as one of the two paired languages. The focus is on the description of the software tools and the user interface. The software is developed at IMI-BAS and will be hosted on a server there. Some examples of using the web application to present a Bulgarian-Polish aligned corpus are included.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Semantics, contrastive linguistics and parallel corpora
Authors:: Koseska, Violetta
Links:: https://bibliotekanauki.pl/articles/967225.pdf
Publication date:: 2014
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: contrastive studies
online dictionary
parallel corpora
direct approach to semantics
semantic interlanguage
Petri nets
semantic annotation
Description:: In view of the ambiguity of the term “semantics”, the author shows the differences between traditional lexical semantics and contemporary semantics in the light of various semantic schools. She examines semantics differently in connection with contrastive studies, where the description must necessarily proceed from the meaning towards the linguistic form, whereas in traditional contrastive studies the description proceeded from the form towards the meaning. This requirement of theoretical contrastive studies necessitates the construction of a semantic interlanguage, rather than merely singling out universal semantic categories expressed by various language means. Such studies can be strongly supported by parallel corpora. However, to make them useful for linguists in manual and computer translation, as well as in the development of dictionaries, including online ones, we need not only formal, often automatic, annotation of texts but also semantic annotation, which unfortunately is manual. In the article we focus on semantic annotation concerning time, aspect and the quantification of names and predicates in the whole semantic structure of the sentence, using the example of the “Polish-Bulgarian-Russian parallel corpus”.
Source:: Cognitive Studies; 2014, 14
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: About Certain Semantic Annotation in Parallel Corpora
Authors:: Koseska-Toszewa, Violetta
Links:: https://bibliotekanauki.pl/articles/677255.pdf
Publication date:: 2013
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: direct approach to semantics
semantic annotation
perfective aspect
imperfective aspect
event
state
Petri nets
parallel corpora
contrastive linguistics
Description:: The semantic notation analyzed in this work belongs to the second stream of semantic theories presented here, the direct approach to semantics. We used this stream in our work on the Bulgarian-Polish Contrastive Grammar. Our semantic notation distinguishes quantificational meanings of names and predicates, and indicates aspectual and temporal meanings of verbs. It relies on logical scope-based quantification and on the contemporary theory of processes known as “Petri nets”. Thanks to it, we can distinguish precisely between a language form and its content. For example, a perfective verb form has two meanings: an event, or a sequence of events and states finally ended with an event. An imperfective verb form also has two meanings: a state, or a sequence of states and events finally ended with a state. In turn, names are quantified universally or existentially when they are “undefined”, and uniquely (using the iota operator) when they are “defined”. It is worth emphasizing that not only names but also the predicate can be quantified, in which case quantification concerns time and aspect. This is a novum in elaborating sentence-level semantics in parallel corpora. For this reason, our semantic notation is manual. We hope that it will attract the interest of computer scientists working on automatic methods for processing the given natural languages. Semantic annotation as defined in this work will facilitate contrastive studies of natural languages, which in turn will verify the results of those studies, and will certainly facilitate human and machine translation.
Source:: Cognitive Studies; 2013, 13
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: On Semantic Annotation in Clarin-PL Parallel Corpora
Authors:: Koseska-Toszewa, Violetta
Roszko, Roman
Links:: https://bibliotekanauki.pl/articles/677121.pdf
Publication date:: 2015
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: manual semantic annotation
semantic definiteness/indefiniteness category
logical quantification
uniqueness
existentiality
universality
elements of the semantic category of time
event
state
sequence of events and states finally ended with an event
Description:: In the article, the authors present a proposal for semantic annotation in the Clarin-PL parallel corpora: the Polish-Bulgarian-Russian and Polish-Lithuanian ones. Semantic annotation of quantification is a novum in developing sentence-level semantics in multilingual parallel corpora. This is why our semantic annotation is manual. The authors hope it will be of interest to IT specialists working on automatic processing of the given natural languages. Semantic annotation as defined here will make contrastive studies of natural languages more efficient, which in turn will help verify the results of those studies, and will certainly improve human and machine translation.
Source:: Cognitive Studies; 2015, 15
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Information

You are searching for the phrase "annotation" by criterion: Subject
