Subject: text mining - OPAC collection catalog


Title: Information management tools for innovation analysts
Narzędzia zarządzania informacją dla analityków innowacji
Authors: Eito-Brun, R.
Links: https://bibliotekanauki.pl/articles/256694.pdf
Publication date: 2014
Publisher: Sieć Badawcza Łukasiewicz - Instytut Technologii Eksploatacji - Państwowy Instytut Badawczy
Subjects: innovation
scientometrics
text mining
opinion mining
text visualization
Description: Innovation management is a knowledge-intensive process that requires dealing with different sources of data to identify relationships between the concepts, techniques, and tools that may lead to innovations. Innovation analysts need to handle huge amounts of unstructured information: ideas gathered from internal staff and external partners, research papers and technical reports, patents and patent applications, etc. All these sources constitute valid inputs to assess the innovativeness of ideas, the feasibility of their implementation, and their potential value in the market. The innovation management discipline has widely used techniques and methods developed in the context of Information Science to support the identification of research trends, assess the outputs of innovation efforts and investments, and monitor the market and competitors' activities. The fruitful relationship between Information Science techniques and innovation management needs to be regularly reviewed as new techniques and tools are designed and made available to the community. In recent years, significant progress has been achieved in areas like scientometrics, text visualization, and opinion mining. This paper provides an overview of these techniques and discusses how they can help professionals involved in innovation programs.
Source: Problemy Eksploatacji; 2014, 4; 73-82
1232-9312
Appears in: Problemy Eksploatacji
Content provider: Biblioteka Nauki

Article


Title: Zastosowanie technik eksploracji tekstu do analizy opinii konsumenckich
Application of text mining techniques for the customer reviews analysis
Authors: Ząbkowski, Tomasz
Links: https://bibliotekanauki.pl/articles/452951.pdf
Publication date: 2014
Publisher: Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Katedra Ekonometrii i Statystyki
Subjects: text mining
association rules
customer reviews
Description: This paper applies association rules, a data mining technique, to detect relationships in customer reviews, using the reviews of one American hotel as an example. The technique was chosen because of the large volume of available review data and because the resulting rules present the relationships found in the data in a very clear and meaningful way. The study uncovered a number of rules that can be a valuable source of information about the quality of services and the perception of the hotel by its guests.
Source: Metody Ilościowe w Badaniach Ekonomicznych; 2014, 15, 4; 101-110
2082-792X
Appears in: Metody Ilościowe w Badaniach Ekonomicznych
Content provider: Biblioteka Nauki

Article
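
The association-rule idea named in this record can be illustrated with a minimal Python sketch. It is not the author's implementation: the toy reviews, the `pair_rules` helper, and both thresholds are hypothetical, and only rules between single words are considered. Support is the share of reviews containing both words; confidence is the number of reviews containing both words divided by the number containing the antecedent.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for word pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions:
        unique = sorted(set(items))          # count each word once per review
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical, already tokenized hotel reviews (illustration only).
reviews = [
    ["clean", "room", "friendly", "staff"],
    ["friendly", "staff", "slow", "checkin"],
    ["clean", "room", "noisy", "street"],
    ["clean", "room", "friendly", "staff"],
]
rules = pair_rules(reviews)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Raising `min_support` trims rare word pairs, while raising `min_confidence` keeps only rules whose antecedent almost always co-occurs with the consequent.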


Title: A case study in text mining of discussion forum posts: Classification with bag of words and global vectors
Authors: Cichosz, P.
Links: https://bibliotekanauki.pl/articles/330299.pdf
Publication date: 2018
Publisher: Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects: text mining
discussion forum
text representation
document classification
word embedding
Description: Despite the rapid growth of other types of social media, Internet discussion forums remain a highly popular communication channel and a useful source of text data for analyzing user interests and sentiments. Being suited to richer, deeper, and longer discussions than microblogging services, they reflect particularly well topics of long-term, persisting involvement and areas of specialized knowledge or experience. Discovering and characterizing such topics and areas by text mining algorithms is therefore an interesting and useful research direction. This work presents a case study in which selected classification algorithms are applied to posts from a Polish discussion forum devoted to psychoactive substances obtained from home-grown plants, such as hashish or marijuana. The utility of two different vector text representations is examined: the simple bag of words representation and the more refined global vectors embedding representation. While the former is found to work well for the multinomial naive Bayes algorithm, the latter turns out to be more useful for other classification algorithms: logistic regression, SVMs, and random forests. The obtained results suggest that post classification can be applied for measuring the publication intensity of particular topics and, in the case of forums related to psychoactive substances, for monitoring the risk of drug-related crime.
Source: International Journal of Applied Mathematics and Computer Science; 2018, 28, 4; 787-801
1641-876X
2083-8492
Appears in: International Journal of Applied Mathematics and Computer Science
Content provider: Biblioteka Nauki

Article
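
As a rough illustration of the bag of words representation combined with multinomial naive Bayes mentioned in this record, the sketch below trains a tiny classifier in plain Python. It is an assumption-laden toy, not the study's pipeline: the `MultinomialNB` class, the two topic labels, and the sample posts are all hypothetical, and add-one (Laplace) smoothing is an assumed choice.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Toy multinomial naive Bayes over bag-of-words token counts."""

    def fit(self, docs, labels, alpha=1.0):
        self.alpha = alpha                        # smoothing constant (assumed)
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokens:
                if w in self.vocab:                # ignore unseen words
                    score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical tokenized forum posts and topic labels (illustration only).
posts = [
    ["soil", "light", "water", "plant"],
    ["seed", "soil", "water"],
    ["court", "fine", "police"],
    ["law", "court", "penalty"],
]
topics = ["growing", "growing", "legal", "legal"]
model = MultinomialNB().fit(posts, topics)
print(model.predict(["water", "soil"]))
print(model.predict(["court", "police"]))
```

Word order is discarded on purpose: the classifier only sees how often each word occurs per class, which is what the bag of words representation provides.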


Title: Responsibilities of project managers. A text mining analysis of job advertisements
Authors: Wyskwarski, Marcin
Links: https://bibliotekanauki.pl/articles/27313470.pdf
Publication date: 2022
Publisher: Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects: text mining
duties and responsibilities
project manager
Description: Purpose: To identify the duties and responsibilities of project managers by analysing the content of online job advertisements. Design/methodology/approach: Job advertisements were automatically downloaded for 63 countries/areas available on Indeed. A text mining analysis of the fragments of the advertisements describing the scope of duties was carried out. The analysis included initial text processing, creating corpora of the documents, creating a document-term matrix, and using classic methods derived from data mining. Findings: The research established the most frequently used words and n-grams in job advertisements, which are presented in figures. The 2-grams are also presented as a network in the form of a directed graph. The LDA algorithm identified abstract topics describing the duties and responsibilities of project managers. The most frequent words, n-grams, and topics identified by the LDA algorithm were used to identify the duties and responsibilities of project managers. Research limitations/implications: Only job advertisements written in English were analysed. The postings were downloaded over only six days. An attempt to automatically identify the responsibilities section did not yield the expected results, so the section was identified manually for randomly selected advertisements, which reduced the number of analysed documents. The content of the job advertisements was not analysed by country/area. Practical implications: The method applied can be used by organisations training future project managers to modify and better adapt curricula to the needs of the labour market. Originality/value: Studies have shown that text mining of job advertisements can help determine the duties and responsibilities of project managers.
Source: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2022, 161; 325-348
1641-3466
Appears in: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider: Biblioteka Nauki

Article
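
The document-term matrix step listed in this record can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `document_term_matrix` helper and the three sample advertisement fragments are hypothetical, and tokenization is reduced to lowercasing and splitting on whitespace, with no stop-word removal or stemming.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts term j in document i."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))      # one column per distinct term
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows


# Hypothetical fragments of job advertisements (illustration only).
ads = [
    "manage project budget",
    "manage project team",
    "report project status",
]
vocab, dtm = document_term_matrix(ads)

# Column sums give the overall frequency of each term in the corpus.
term_totals = dict(zip(vocab, map(sum, zip(*dtm))))
print(vocab)
for row in dtm:
    print(row)
print(term_totals)
```

Column sums of the matrix give the word frequencies that such a study reports, and the same rows can serve as input to a topic model.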


Title: The role of word and n-gram frequency analysis in inference of the content of scientific publication
Authors: Zdonek, Iwona
Links: https://bibliotekanauki.pl/articles/1931609.pdf
Publication date: 2020
Publisher: Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects: text mining
R
n-grams
scientific publication analysis
Description: Purpose: The paper presents an analysis of a scientific publication with regard to the frequency of words and n-grams. The research problem addressed was the extent to which a text mining analysis of a scientific publication allows its content to be inferred. Design/methodology/approach: The main research method is the analysis of tokenized text using word, bigram, and trigram count functions in selected sections of a scientific publication. The results of the text mining analysis were compared with a classic, non-automated text analysis of the publication. The presented study is a pilot project in the form of a case study. Findings: The proposed method of analyzing a scientific text through the frequency of words and n-grams enables inference of the content of the paper with regard to the names of the variables involved in the study, the statistical apparatus used, and the key literature cited. It should be observed, however, that the method does not make it possible to establish which variables are moderators and which are mediators. Originality/value: The text mining technique was used differently in this study than in previous works. The publication was not examined in its entirety, as previous researchers did; instead, text mining analysis was applied to individual parts of the paper, i.e. the part discussing the theoretical foundations of the research and the part presenting the research method, the research results, and their discussion. This allowed for obtaining more precise results regarding the content of the publication.
Source: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2020, 142; 21-31
1641-3466
Appears in: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider: Biblioteka Nauki

Article
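
Counting n-grams separately for each section of a text, as this record describes, might look like the following Python sketch. The record lists R as the tool used; Python is used here only for illustration. The `ngrams` and `top_ngrams` helpers and the two sample section texts are hypothetical, and tokenization is a plain whitespace split.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all consecutive n-token tuples from a token list."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(text, n, k=3):
    """Return the k most frequent n-grams of a text with their counts."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n)).most_common(k)


# Hypothetical section texts of a publication (illustration only).
sections = {
    "theory": "job satisfaction affects job performance and job satisfaction varies",
    "method": "structural equation modeling was used and structural equation modeling fit well",
}

# Analyse each section on its own instead of the whole publication at once.
theory_bigrams = top_ngrams(sections["theory"], 2, k=1)
method_trigrams = top_ngrams(sections["method"], 3, k=1)
print(theory_bigrams)
print(method_trigrams)
```

In this toy input the most frequent bigram of the theory section names a variable, while the most frequent trigram of the method section names a statistical technique, mirroring the kind of inference the record reports.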


Title: Analiza dokonań OPP prezentowanych w ich rocznych obligatoryjnych sprawozdaniach z działalności z wykorzystaniem metody eksploracji tekstu
Applying Text Mining to Analyze the Performance of PBOs on the Basis of Their Obligatory Annual Activity Statements
Authors: Dyczkowski, Tomasz
Links: https://bibliotekanauki.pl/articles/525528.pdf
Publication date: 2016-11-30
Publisher: Uniwersytet Warszawski. Wydawnictwo Naukowe Wydziału Zarządzania
Subjects: performance
text mining
narrative information
public benefit organizations
Description: The paper aims to investigate whether the level of detail and the selection of performance-related information disclosed by public benefit organizations (PBOs) in their obligatory annual activity statements can stimulate individual donations. The research covered a random sample of 177 Polish PBOs and applied text mining methods and a laboratory experiment. The results allowed the identification of nine key groups of topics that PBOs present in the narrative parts of their annual activity statements. They also indicated that the organizations that most strongly stimulate donors to allocate their 1% tax write-off describe their performance in more detail and with somewhat different emphases than other organizations.
Source: Problemy Zarządzania; 2016, 4/2016 (63), t.1; 123-138
1644-9584
Appears in: Problemy Zarządzania
Content provider: Biblioteka Nauki

Article


Title: Identification of desired project manager competence using text mining analysis
Authors: Wyskwarski, Marcin
Links: https://bibliotekanauki.pl/articles/1845057.pdf
Publication date: 2020
Publisher: Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects: text mining
competencies
project manager
word cloud
topic modeling
Description: Purpose: An attempt to identify the competencies of the project manager desired by employers and to determine whether changes have occurred over time. Design/methodology/approach: Job offers were automatically downloaded from a job-offer website. A text mining analysis of the fragments of offers describing competencies was carried out. The analysis included initial text processing, creation of corpora of the analyzed documents, creation of a document-term matrix, a topic modeling algorithm, and the use of classic methods derived from data mining. Findings: The most frequently used words/n-grams and the correlation of selected words/n-grams with other words/n-grams were presented in the form of figures. Based on the frequency of words/n-grams and the correlation values, efforts were made to identify the project manager competencies. The topic modeling algorithm was used to generate topics that can also be used to identify expected project manager competencies. Research limitations/implications: Only offers written in Polish, downloaded from one job-offer website, which had the phrase “kierownik projektu” (“project manager”) in their job title, were analyzed. Data was collected from 09 to 11 April 2018 and from 09 to 11 April 2019. Practical implications: The method applied can be used by organizations preparing people for the profession of project manager to modify and better adapt curricula to the needs of the labor market. Originality/value: Studies have shown that text mining of job offers can, to some extent, help determine the desired project manager competencies.
Source: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2020, 149; 735-749
1641-3466
Appears in: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider: Biblioteka Nauki

Article


Title: Text mining in the identification of duties and responsibilities of the project manager
Authors: Wyskwarski, Marcin
Links: https://bibliotekanauki.pl/articles/1882989.pdf
Publication date: 2020
Publisher: Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects: text mining
duties
responsibilities
project manager
word cloud
topic modeling
Description: Purpose: An attempt to identify the duties and responsibilities of the project manager by analysing job offers from a job website, and to determine whether there were any changes between 2018 and 2019. Design/methodology/approach: Text mining was performed on the fragments of job offers describing the duties and responsibilities. The text mining analysis consisted of initial processing of the text, creation of a corpus of the analysed documents, construction of a word frequency matrix, and the use of classical methods from the data mining area. Findings: The most common words in job offers are presented, as well as their correlation with other words. Using a topic modeling algorithm, hidden topics describing the analysed job offers were generated. These topics can also be used to identify the duties and responsibilities of a project manager. Research limitations/implications: Only the job offers meeting the following conditions were analysed: (1) they concerned the job of "project manager"; (2) the content was in Polish; (3) they were provided by the www.pracuj.pl website; (4) they were collected from 09 to 11 April in 2018 and 2019. Practical implications: This method can be used by organizations training project managers to modify and better adjust the curriculum to the needs of the labour market. Originality/value: Research has shown that text mining can be used to determine the responsibilities of a project manager by analysing job offers.
Source: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2020, 144; 649-659
1641-3466
Appears in: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider: Biblioteka Nauki

Article


Title: Identification of technologies in Industry 4.0 with the use of text mining
Authors: Zdonek, Dariusz
Links: https://bibliotekanauki.pl/articles/1931589.pdf
Publication date: 2020
Publisher: Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects: text mining
Industry 4.0
information and communication technology
scientific paper
Description: Purpose: The objective of this paper is to identify leading technologies in Industry 4.0. Design/methodology/approach: The identification was made by using text mining to explore scientific texts in this field. The assumptions of the author's own iterative method for analyzing scientific texts were proposed, using the R language, tokenization, lemmatization, n-grams, and correspondence analysis. These assumptions were applied to analyze the 40 most frequently cited articles indexed in the Web of Science. Findings: On the basis of the obtained results, four leading technologies were identified: Cloud Computing, Internet of Things, Cyber-physical System, and Big Data. Originality/value: The article proposes an original method of identifying the leading technologies used in Industry 4.0. The proposed method is based on text mining and correspondence analysis.
Source: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska; 2020, 142; 45-57
1641-3466
Appears in: Zeszyty Naukowe. Organizacja i Zarządzanie / Politechnika Śląska
Content provider: Biblioteka Nauki

Article


Title: What experiences do tourists seek in national parks? Analysis of TripAdvisor reviews
Jakich doświadczeń poszukują turyści w parkach narodowych? Analiza opinii w serwisie TripAdvisor
Authors: Nowacki, Marek
Niezgoda, Agnieszka
Links: https://bibliotekanauki.pl/articles/24201139.pdf
Publication date: 2023
Publisher: Fundacja Ekonomistów Środowiska i Zasobów Naturalnych
Subjects: content analysis
coding experiences
text mining
Poland’s national park
sustainable tourism
Description: The article aims to analyse and compare the experiences of tourists visiting three national parks in Poland. The authors focused on the following questions: What are people's experiences when visiting national parks in Poland? Do the natural assets of the national parks affect visitors' unique experiences, or are environmentally valuable areas not crucial for their experiences? The authors used mixed methods: quantitative (text mining, correspondence analysis) and qualitative (content analysis). The data for analysis were opinions written by TripAdvisor users. Reviews on TripAdvisor indicate that the most important experiences for tourists in the national parks studied were Nature appreciation and Physical activity. The other groups of experiences reflected in the reviews were: Aesthetic, Connection, Tension and Excitement. This confirms that nature is the most important feature of national parks for tourists, but it also indicates a trend towards maintaining good health and a desire to regenerate physical strength in areas of natural beauty.
Source: Ekonomia i Środowisko; 2023, 1; 341-359
0867-8898
Appears in: Ekonomia i Środowisko
Content provider: Biblioteka Nauki

Article


Title: Propozycja mieszanego przetwarzania półstrukturalnego modelu opisu zdarzeń z akcji ratowniczo-gaśniczych Państwowej Straży Pożarnej PSP3
Proposition of hybrid process model semi structured description of event from fire services rescues operation
Authors: Mirończuk, M.
Maciak, T.
Links: https://bibliotekanauki.pl/articles/373949.pdf
Publication date: 2013
Publisher: Centrum Naukowo-Badawcze Ochrony Przeciwpożarowej im. Józefa Tuliszkowskiego
Subjects: Bayes classifier
case-based reasoning
naive Bayes classifier
ontology for rescue service
representation of reports
representation of event cases
text mining
text representation
Description: The paper presents the knowledge representations and event description methods currently being developed for a case-based reasoning system covering incidents handled by the rescue services of the State Fire Service (Państwowa Straż Pożarna, PSP). It also proposes a method for processing them, which is based on the classification and retrieval of event descriptions.
Source: Bezpieczeństwo i Technika Pożarnicza; 2013, 1; 95-106
1895-8443
Appears in: Bezpieczeństwo i Technika Pożarnicza
Content provider: Biblioteka Nauki

Article


Title: New algorithm for determining the number of features for the effective sentiment-classification of text documents
Nowy algorytm ustalania liczby zmiennych potrzebnych do klasyfikacji dokumentów tekstowych ze względu na ich wydźwięk emocjonalny
Authors: Idczak, Adam
Korzeniewski, Jerzy
Links: https://bibliotekanauki.pl/articles/18105028.pdf
Publication date: 2023-05-31
Publisher: Główny Urząd Statystyczny
Subjects: sentiment analysis
document sentiment classification
text mining
logistic regression
naive Bayes classifier
feature selection
correlation
Description: Sentiment analysis of text documents is a very important part of contemporary text mining. The purpose of this article is to present a new technique of text sentiment analysis which can be used with any type of document sentiment classification method. The proposed technique involves feature selection independent of the classifier, which reduces the size of the feature space. Its advantages include intuitiveness and computational simplicity. The most important element of the proposed technique is a novel algorithm for determining the number of features that is sufficient for effective classification. The algorithm is based on the analysis of the correlation between single features and document labels. A statistical approach, featuring a naive Bayes classifier and logistic regression, was employed to verify the usefulness of the proposed technique. Both methods were applied to three document sets composed of 1,169 opinions of bank clients, obtained in 2020 from a Poland-based bank. The documents were written in Polish. The research demonstrated that reducing the number of terms more than 10-fold by means of the proposed algorithm in most cases improves the effectiveness of classification.
Source: Wiadomości Statystyczne. The Polish Statistician; 2023, 68, 5; 40-57
0043-518X
Appears in: Wiadomości Statystyczne. The Polish Statistician
Content provider: Biblioteka Nauki

Article
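
The general idea of classifier-independent feature selection by correlation with document labels can be sketched as below. This is explicitly not the authors' algorithm: their contribution is a rule for deciding how many features to keep, which the record does not spell out, so the cut-off here (`ranked[:2]`) is fixed by hand. The `pearson` and `rank_terms` helpers and the toy opinions are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    # A constant feature carries no information about the label.
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)


def rank_terms(docs, labels):
    """Rank terms by |correlation| between term presence and the label."""
    vocab = sorted({term for doc in docs for term in doc})
    scored = []
    for term in vocab:
        presence = [1 if term in doc else 0 for doc in docs]
        scored.append((abs(pearson(presence, labels)), term))
    # Highest absolute correlation first; ties broken alphabetically.
    return [term for _, term in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical bank-client opinions as token sets; 1 = positive, 0 = negative.
opinions = [
    {"helpful", "fast", "bank"},
    {"helpful", "bank", "app"},
    {"rude", "slow", "bank"},
    {"rude", "bank", "app"},
]
labels = [1, 1, 0, 0]
ranked = rank_terms(opinions, labels)
selected = ranked[:2]   # hand-picked cut-off, an assumption of this sketch
print(ranked)
print(selected)
```

Because the ranking uses only term presence and labels, the selected terms can be passed to any classifier, such as naive Bayes or logistic regression.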

