Searching for the phrase "dataset" by criterion: Subject


Title:
Phylogenetic Characters in the Humerus and Tarsometatarsus of Penguins
Authors:
Hoffmeister, Martín Chávez
Links:
https://bibliotekanauki.pl/articles/2051142.pdf
Publication date:
2014
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
Sphenisciformes
limb bones
phylogenetic analysis
parsimony method
revised dataset
Source:
Polish Polar Research; 2014, 3; 469-496
0138-0338
2081-8262
Appears in:
Polish Polar Research
Content provider:
Biblioteka Nauki
Article
Title:
Novel diabetes classification approach based on CNN-LSTM: enhanced performance and accuracy
Authors:
Ayat, Yassine
Benzekri, Wiame
El Moussati, Ali
Mir, Ismail
Benzaouia, Mohammed
El Aouni, Abdelaziz
Links:
https://bibliotekanauki.pl/articles/31341646.pdf
Publication date:
2024
Publisher:
Polska Akademia Nauk. Polskie Towarzystwo Diagnostyki Technicznej PAN
Subjects:
diabetes
diabetes classification
dataset balancing
combined model
personalized healthcare
Description:
This paper deals with the development of an approach for diabetes classification that combines a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) model. The proposed method harnesses the strengths of the LSTM and CNN architectures to effectively capture sequential patterns and extract meaningful features from the input data. A comprehensive dataset containing relevant features for diabetes patients is used to train and evaluate the classifiers. Evaluation metrics such as kappa score, F1-score, accuracy, precision, and recall are employed in order to assess the performance of each model. The results demonstrate that the CNN-LSTM model outperforms other models, including Logistic Regression, Random Forest, SVM, and KNN, achieving an impressive accuracy of 97%. These findings shed light on the effectiveness of the proposed approach in accurately classifying diabetes, resulting in significant advancement in diabetes diagnosis and treatment and opening up exciting possibilities for personalized healthcare.
Source:
Diagnostyka; 2024, 25, 1; art. no. 2024112
1641-6414
2449-5220
Appears in:
Diagnostyka
Content provider:
Biblioteka Nauki
Article
Title:
An Empirical Study on the Factors Affecting Software Development Productivity
Authors:
Lavazza, L.
Morasca, S.
Tosi, D.
Links:
https://bibliotekanauki.pl/articles/384131.pdf
Publication date:
2018
Publisher:
Politechnika Wrocławska. Oficyna Wydawnicza Politechniki Wrocławskiej
Subjects:
effort
function point
empirical study
ISBSG dataset
factors
development
productivity
Description:
Background: Software development productivity is widely investigated in the Software Engineering literature. However, continuously updated evidence on productivity is constantly needed, due to the rapid evolution of software development techniques and methods, and also the regular improvement in the use of the existing ones. Objectives: The main goal of this paper is to investigate which factors affect productivity. It was also investigated whether economies or diseconomies of scale exist and whether they may be influenced by productivity factors. Method: An empirical investigation was carried out using a dataset available at the software project repository ISBSG. The major focus was on factors that may affect productivity from a functional point of view. The conducted analysis was compared with the productivity data provided by Capers Jones in 1996 and 2013 and with an investigation on open-source software by Delorey et al. Results: This empirical study led to the discovery of interesting models that show how the different factors do (or do not) affect productivity. It was also found that some factors appear to allow for economies of scale, while others appear to cause diseconomies of scale. Conclusions: This paper provides some more evidence about how four factors, i.e., programming languages, business areas, architectural types, and the usage of CASE tools, influence productivity and highlights some interesting divergences in comparison with the results reported by Capers Jones and Delorey et al.
Source:
e-Informatica Software Engineering Journal; 2018, 12, 1; 27-49
1897-7979
Appears in:
e-Informatica Software Engineering Journal
Content provider:
Biblioteka Nauki
Article
Title:
Prediction of Missing Values in Adult Data Set of UCI Machine Learning: A Case of Study
Authors:
Luna, Alejandra
Bello, Mario
Hernandez, Ana
Bonilla, Edmundo
Links:
https://bibliotekanauki.pl/articles/1397481.pdf
Publication date:
2020
Publisher:
Warszawska Wyższa Szkoła Informatyki
Subjects:
Shannon theory
entropy
missing attributes
adult dataset
UCI Machine Learning
Description:
These days, incomplete data of any kind can be a big problem for organizations when making decisions. In this article, we propose to use Shannon entropy and information gain to predict and impute missing categorical data in any data set. An example details how entropy is applied to determine the level of uncertainty of each attribute value. The missing attributes in the Adult data set of UCI Machine Learning are also imputed with other imputation techniques to show the advantages offered by the proposed methodology.
Source:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki; 2020, 14, 22; 7-21
1896-396X
2082-8349
Appears in:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki
Content provider:
Biblioteka Nauki
Article
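The entropy-based imputation summarized in the record above can be illustrated with a short Python sketch. It is not the authors' implementation: the function names, the toy rows, and the imputation rule (pick the attribute with the highest information gain, then take the most frequent target value among complete rows that share that attribute value) are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a list of categorical values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, attr, target):
    """Drop in entropy of `target` after splitting `rows` on `attr`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def impute(rows, target, missing=None):
    """Fill missing `target` values using the most informative attribute.

    Hypothetical rule: the predictor is the attribute with the highest
    information gain on the complete rows.
    """
    complete = [r for r in rows if r[target] is not missing]
    attrs = [a for a in complete[0] if a != target]
    best = max(attrs, key=lambda a: information_gain(complete, a, target))
    overall_mode = Counter(r[target] for r in complete).most_common(1)[0][0]
    filled = []
    for r in rows:
        if r[target] is missing:
            matches = [c[target] for c in complete if c[best] == r[best]]
            value = Counter(matches).most_common(1)[0][0] if matches else overall_mode
            r = {**r, target: value}
        filled.append(r)
    return best, filled


if __name__ == "__main__":
    # Toy rows loosely modeled on Adult-style attributes (made-up values).
    data = [
        {"education": "Bachelors", "sex": "F", "occupation": "Tech"},
        {"education": "Bachelors", "sex": "M", "occupation": "Tech"},
        {"education": "HS-grad", "sex": "F", "occupation": "Sales"},
        {"education": "HS-grad", "sex": "M", "occupation": "Sales"},
        {"education": "Bachelors", "sex": "F", "occupation": None},
    ]
    best_attr, completed = impute(data, "occupation")
    print(best_attr, completed[-1]["occupation"])
```

In the toy data, "education" fully determines "occupation" while "sex" carries no information, so the sketch selects "education" as the predictor.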
Title:
Forest species mapping using airborne hyperspectral APEX data
Authors:
Tagliabue, Giulia
Panigada, Cinzia
Colombo, Roberto
Fava, Francesco
Cilia, Chiara
Baret, Frédéric
Vreys, Kristin
Meuleman, Koen
Rossini, Micol
Links:
https://bibliotekanauki.pl/articles/1035947.pdf
Publication date:
2016
Publisher:
Uniwersytet Warszawski. Wydział Geografii i Studiów Regionalnych
Subjects:
Vegetation map
Hyperspectral
Aerial
Supervised classification
Multi-temporal dataset
Forest ecosystem
Description:
The accurate mapping of forest species is a very important task in relation to the increasing need to better understand the role of the forest ecosystem within environmental dynamics. The objective of this paper is the investigation of the potential of a multi-temporal hyperspectral dataset for the production of a thematic map of the dominant species in the Forêt de Hardt (France). Hyperspectral data were collected in June and September 2013 using the Airborne Prism EXperiment (APEX) sensor, covering the visible, near-infrared and shortwave infrared spectral regions with a spatial resolution of 3 m by 3 m. The map was realized by means of a maximum likelihood supervised classification. The classification was first performed separately on images from June and September and then on the two images together. Class discrimination was performed using as input 3 spectral indices computed as ratios between red edge bands and a blue band for each image. The map was validated using a testing set selected on the basis of a random stratified sampling scheme. Results showed that the algorithm performances improved from an overall accuracy of 59.5% and 48% (for the June and September images, respectively) to an overall accuracy of 74.4%, with the producer’s accuracy ranging from 60% to 86% and user’s accuracy ranging from 61% to 90%, when both images (June and September) were combined. This study demonstrates that the use of multi-temporal high-resolution images acquired in two different vegetation development stages (i.e., 17 June 2013 and 4 September 2013) allows accurate (overall accuracy 74.4%) local-scale thematic products to be obtained in an operational way.
Source:
Miscellanea Geographica. Regional Studies on Development; 2016, 20, 1; 28-33
0867-6046
2084-6118
Appears in:
Miscellanea Geographica. Regional Studies on Development
Content provider:
Biblioteka Nauki
Article
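The overall, producer's, and user's accuracies reported in the abstract above are standard summaries of a classification confusion matrix. The sketch below shows how they are commonly derived. The matrix orientation (rows are mapped classes, columns are reference classes), the function name, and the sample counts are assumptions and are not taken from the paper.

```python
def accuracy_report(matrix):
    """Return (overall, producers, users) accuracy for a square confusion matrix.

    Assumed orientation: matrix[i][j] counts samples mapped as class i
    whose reference (ground-truth) class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]  # samples mapped as each class
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # reference samples
    overall = correct / total
    # Producer's accuracy: share of reference samples of a class mapped correctly.
    producers = [matrix[i][i] / col_totals[i] for i in range(n)]
    # User's accuracy: share of samples mapped as a class that truly belong to it.
    users = [matrix[i][i] / row_totals[i] for i in range(n)]
    return overall, producers, users


if __name__ == "__main__":
    # Made-up two-class counts, for illustration only.
    overall, producers, users = accuracy_report([[8, 2], [1, 9]])
    print(overall, producers, users)
```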
Title:
Intrusion Detection in Software Defined Networks with Self-organized Maps
Authors:
Jankowski, D.
Amanowicz, M.
Links:
https://bibliotekanauki.pl/articles/308109.pdf
Publication date:
2015
Publisher:
Instytut Łączności - Państwowy Instytut Badawczy
Subjects:
IDS dataset
machine learning
metasploit
network security
network simulation
OpenFlow
virtualization
Description:
The Software Defined Network (SDN) architecture provides new opportunities to implement security mechanisms in terms of unauthorized activities detection. At the same time, there are certain risks associated with this technology. The presented approach covers a conception of the measurement method, virtual testbed and classification mechanism for SDNs. The paper presents a measurement method which allows collecting network traffic flow parameters, generated by a virtual SDN environment. The collected dataset can be used in machine learning methods to detect unauthorized activities.
Source:
Journal of Telecommunications and Information Technology; 2015, 4; 3-9
1509-4553
1899-8852
Appears in:
Journal of Telecommunications and Information Technology
Content provider:
Biblioteka Nauki
Article
Title:
Person re-identification accuracy improvement by training a CNN with the new large joint dataset and re-rank
Authors:
Bohush, Rykhard
Ihnatsyeva, Sviatlana
Ablameyko, Sergey
Links:
https://bibliotekanauki.pl/articles/2201263.pdf
Publication date:
2022
Publisher:
Szkoła Główna Gospodarstwa Wiejskiego w Warszawie. Instytut Informatyki Technicznej
Subjects:
convolution neural network
PolReID
re-identification
large-scale dataset
re-rank
Description:
The paper aims to improve person re-identification accuracy in distributed video surveillance systems by constructing a large joint image dataset of people for training convolutional neural networks (CNN). To this end, an analysis of existing datasets is provided. Then, a new large joint dataset for the person re-identification task is constructed that includes the existing public datasets CUHK02, CUHK03, Market, Duke, MSMT17 and PolReID. Testing for re-identification is performed for such frequently cited CNNs as ResNet-50, DenseNet121 and PCB. Re-identification accuracy is evaluated using the main metrics Rank, mAP and mINP. The use of the new large joint dataset makes it possible to improve Rank1, mAP and mINP on all test sets. Re-ranking is used to further increase the re-identification accuracy. The presented results confirm the effectiveness of the proposed approach.
Source:
Machine Graphics & Vision; 2022, 31, 1/4; 93-109
1230-0535
2720-250X
Appears in:
Machine Graphics & Vision
Content provider:
Biblioteka Nauki
Article
Title:
The influence of the North Atlantic Oscillation on the potential distribution areas of Bursaphelenchus xylophilus in Europe based on climatological reanalysis data
Authors:
Somfalvi-Toth, K.
Keszthelyi, S.
Links:
https://bibliotekanauki.pl/articles/2082798.pdf
Publication date:
2020
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
Bursaphelenchus xylophilus
distribution
ECMWF ERA5 reanalysis dataset
NAO
pine wood nematode
temperature
Description:
Pine wood nematode (Bursaphelenchus xylophilus) (Aphelenchida: Parasitaphelencidae) is one of the most harmful agents in coniferous forests. The most important vectors of pine wood nematode are considered to be some Monochamus species (Col.: Cerambycidae), which had been forest insects with secondary importance before the appearance of B. xylophilus. However, the continuous spreading of the nematode has changed this status and necessitated detailed biological and climatological investigation of the main European vector, Monochamus galloprovincialis. The potential distribution area of M. galloprovincialis involves those areas where the risk of the appearance of pine wood nematode B. xylophilus is significant. The main objective of our analysis was to obtain information about the influencing effects of North Atlantic Oscillation (NAO) on the potential European range of B. xylophilus and its vector species M. galloprovincialis based on the connection between the mean temperature of July in Europe, the distribution of day-degrees of the vector and the NAO index. Our assessment was based on fundamental biological constants of the nematode and the cerambycid pest as well as the ECMWF ERA5 Global Atmospheric Reanalysis dataset. Our hypothesis was built on the fact that the monthly mean temperature had to exceed 20°C in the interest of an efficient expansion of the nematode. In addition, the threshold temperature of the vector involved in the calculations was 12.17°C, while the accumulated day-degree (DD) had to exceed the annual and biennial 370.57°DD for univoltine and semivoltine development, respectively. Our finding that a connection could be found between a mean temperature in July above 20°C and NAO as well as between the accumulated day-degrees and NAO can be the basis for further investigations for a reliable method to forecast the expansion of pine wood nematode and its vector species in a given year.
Source:
Journal of Plant Protection Research; 2020, 60, 2; 215-219
1427-4345
Appears in:
Journal of Plant Protection Research
Content provider:
Biblioteka Nauki
Article
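The temperature thresholds quoted in the abstract above (a 12.17°C development threshold for the vector, 370.57 accumulated day-degrees, and a July mean above 20°C) can be sketched as a simple check. The abstract does not state how day-degrees were computed, so the daily-mean formula below is an assumption, as are all function names and the sample temperatures.

```python
BASE_TEMP_C = 12.17        # vector threshold temperature quoted in the abstract
REQUIRED_DD = 370.57       # accumulated day-degrees quoted in the abstract
JULY_MEAN_LIMIT_C = 20.0   # July mean temperature limit quoted in the abstract


def accumulated_degree_days(daily_mean_temps_c, base=BASE_TEMP_C):
    """Sum the daily excess of mean temperature over the base threshold.

    Assumed simple-average method: days at or below the base add nothing.
    """
    return sum(max(0.0, t - base) for t in daily_mean_temps_c)


def vector_development_possible(daily_mean_temps_c, required=REQUIRED_DD):
    """True if the accumulated day-degrees reach the required total."""
    return accumulated_degree_days(daily_mean_temps_c) >= required


def july_suitable_for_nematode(july_mean_temp_c, limit=JULY_MEAN_LIMIT_C):
    """True if the July mean temperature exceeds the 20°C limit."""
    return july_mean_temp_c > limit


if __name__ == "__main__":
    # Hypothetical season: 120 days with a constant daily mean of 16°C.
    season = [16.0] * 120
    print(accumulated_degree_days(season), vector_development_possible(season))
```

Applying the check over one season or over two consecutive seasons would correspond to the annual and biennial cases mentioned in the abstract; that mapping is likewise an assumption.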
Title:
Deep Learning Can Improve Early Skin Cancer Detection
Authors:
Mohamed, Abeer
Mohamed, Wael A.
Zekry, Abdel Halim
Links:
https://bibliotekanauki.pl/articles/963798.pdf
Publication date:
2019
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects:
technology
dermoscopic lesions
convolutional
neural network
ISIC dataset
deep learning
neural networks
Description:
Skin cancer is the most common form of cancer affecting humans. Melanoma is the most dangerous type of skin cancer, and early diagnosis is vital in curing the disease. So far, human knowledge in this field is very limited; thus, developing a mechanism capable of identifying the disease early on can save lives, reduce intervention and cut unnecessary costs. In this paper, the researchers developed a new learning technique to classify skin lesions, with the purpose of observing and identifying the presence of melanoma. This new technique is based on a convolutional neural network solution with multiple configurations, for which the researchers employed an International Skin Imaging Collaboration (ISIC) dataset. Optimal results are achieved through a convolutional neural network composed of 14 layers. This proposed system can successfully and reliably predict the correct classification of dermoscopic lesions with 97.78% accuracy.
Source:
International Journal of Electronics and Telecommunications; 2019, 65, 3; 507-512
2300-1933
Appears in:
International Journal of Electronics and Telecommunications
Content provider:
Biblioteka Nauki
Article
Title:
A cross-linguistic database of phonetic transcription systems
Authors:
Anderson, Cormac
Tresoldi, Tiago
Chacon, Thiago
Fehn, Anne-Maria
Walworth, Mary
Forkel, Robert
List, Johann-Mattis
Links:
https://bibliotekanauki.pl/articles/2134801.pdf
Publication date:
2018-12-01
Publisher:
Uniwersytet im. Adama Mickiewicza w Poznaniu
Subjects:
phonetic transcription
phoneme inventory databases
cross-linguistically linked data
reference catalog
dataset
Description:
Contrary to what non-practitioners might expect, the systems of phonetic notation used by linguists are highly idiosyncratic. Not only do various linguistic subfields disagree on the specific symbols they use to denote the speech sounds of languages, but also in large databases of sound inventories considerable variation can be found. Inspired by recent efforts to link cross-linguistic data with help of reference catalogues (Glottolog, Concepticon) across different resources, we present initial efforts to link different phonetic notation systems to a catalogue of speech sounds. This is achieved with the help of a database accompanied by a software framework that uses a limited but easily extendable set of non-binary feature values to allow for quick and convenient registration of different transcription systems, while at the same time linking to additional datasets with restricted inventories. Linking different transcription systems enables us to conveniently translate between different phonetic transcription systems, while linking sounds to databases allows users quick access to various kinds of metadata, including feature values, statistics on phoneme inventories, and information on prosody and sound classes. In order to prove the feasibility of this enterprise, we supplement an initial version of our cross-linguistic database of phonetic transcription systems (CLTS), which currently registers five transcription systems and links to fifteen datasets, as well as a web application, which permits users to conveniently test the power of the automatic translation across transcription systems.
Source:
Yearbook of the Poznań Linguistic Meeting; 2018, 4, 1; 21-53
2449-7525
Appears in:
Yearbook of the Poznań Linguistic Meeting
Content provider:
Biblioteka Nauki
Article
Title:
Ensemble-based Method of Fraud Detection at Self-checkouts in Retail
Authors:
Vitynskyi, P.
Tkachenko, R.
Izonin, I.
Links:
https://bibliotekanauki.pl/articles/410756.pdf
Publication date:
2019
Publisher:
Polska Akademia Nauk. Oddział w Lublinie PAN
Subjects:
classification
Ensemble-based method
Random Forest
fraud detection
retail
Ito decomposition
imbalanced dataset
Description:
The authors consider the problem of fraud detection at self-checkouts in retail under the condition of an unbalanced data set. A new ensemble-based method is proposed for its effective solution. The developed method involves two main steps: application of the preprocessing procedures and the Random Forest algorithm. The preprocessing stage sequentially applies the following procedures to the input data: scaling by the maximal element in a column with row-wise scaling by Euclidean norm, weighting by correlation, and applying polynomial extension. For polynomial extension, Ito decomposition of the second degree is used. The simulation of the method was carried out on real data. Performance evaluation was based on a cost matrix. The experimental comparison of the developed ensemble-based method with a number of existing methods (single and ensemble) demonstrates the best performance of the developed method. Experimental studies of changing the parameters of the Random Forest, both for the basic algorithm and for the developed method, demonstrate a significant improvement in the investigated efficiency measures of the latter. This improvement results from applying all steps of the preprocessing stage of the developed method.
Source:
ECONTECHMOD : An International Quarterly Journal on Economics of Technology and Modelling Processes; 2019, 8, 2; 3-8
2084-5715
Appears in:
ECONTECHMOD : An International Quarterly Journal on Economics of Technology and Modelling Processes
Content provider:
Biblioteka Nauki
Article
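Part of the preprocessing chain named in the abstract above (column scaling by the maximal element, row-wise scaling by Euclidean norm, and a second-degree polynomial extension) can be sketched in plain Python. This is a rough illustration, not the authors' code: the function names are hypothetical, the polynomial extension is assumed to mean appending all pairwise products of the features, and the correlation weighting and Random Forest steps are omitted because the abstract gives no details on them.

```python
from itertools import combinations_with_replacement
from math import sqrt


def scale_columns_by_max(X):
    """Divide every column by its largest absolute value (1.0 if the column is all zeros)."""
    maxima = [max(abs(v) for v in col) or 1.0 for col in zip(*X)]
    return [[v / m for v, m in zip(row, maxima)] for row in X]


def normalize_rows(X):
    """Divide every row by its Euclidean norm (rows of zeros are left unchanged)."""
    out = []
    for row in X:
        norm = sqrt(sum(v * v for v in row)) or 1.0
        out.append([v / norm for v in row])
    return out


def quadratic_expansion(X):
    """Append all products x_i * x_j with i <= j to each row.

    Assumed reading of a second-degree polynomial extension.
    """
    out = []
    for row in X:
        products = [a * b for a, b in combinations_with_replacement(row, 2)]
        out.append(list(row) + products)
    return out


def preprocess(X):
    """Apply the three sketched steps in the order listed in the abstract."""
    return quadratic_expansion(normalize_rows(scale_columns_by_max(X)))


if __name__ == "__main__":
    # Made-up feature rows, for illustration only.
    print(preprocess([[2.0, 10.0], [4.0, 5.0]]))
```

With n input features, the expansion yields n + n(n + 1)/2 features per row, which would then be passed to a classifier.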
Title:
Bearing fault detection and diagnosis based on densely connected convolutional networks
Authors:
Niyongabo, Julius
Zhang, Yingjie
Ndikumagenge, Jérémie
Links:
https://bibliotekanauki.pl/articles/2105995.pdf
Publication date:
2022
Publisher:
Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects:
bearing
deep learning
machine learning
transfer learning
fault detection
fault diagnosis
CWRU dataset
Description:
Rotating machines are widely used in today’s world. As these machines perform the biggest tasks in industries, faults are naturally observed on their components. For most rotating machines, such as wind turbines, the bearing is one of the critical components. To reduce the failure rate and increase the working life of rotating machinery, it is important to detect and diagnose early faults in this most vulnerable part. In the recent past, technologies based on computational intelligence, including machine learning (ML) and deep learning (DL), have been efficiently used for detection and diagnosis of bearing faults. However, DL algorithms are being increasingly favoured day by day because of their advantages of automatically extracting features from training data. Despite this, in DL, adding neural layers reduces the training accuracy and the vanishing gradient problem arises. DL algorithms based on convolutional neural networks (CNN) such as DenseNet have proved to be quite efficient in solving this kind of problem. In this paper, a transfer learning approach consisting of fine-tuning DenseNet-121 top layers is proposed to make this classifier more robust and efficient. Then, a new intelligent model inspired by DenseNet-121 is designed and used for detecting and diagnosing bearing faults. Continuous wavelet transform is applied to enhance the dataset. Experimental results obtained from analyses employing the Case Western Reserve University (CWRU) bearing dataset show that the proposed model has higher diagnostic performance, with 98% average accuracy and less complexity.
Source:
Acta Mechanica et Automatica; 2022, 16, 2; 130-135
1898-4088
2300-5319
Appears in:
Acta Mechanica et Automatica
Content provider:
Biblioteka Nauki
Article
Title:
Wykrywanie zagrożenia upadłością jako problem klasyfikacji danych niezbalansowanych
Bankruptcy prediction as imbalanced classification problem
Authors:
Paliński, Andrzej
Links:
https://bibliotekanauki.pl/articles/2041253.pdf
Publication date:
2020
Publisher:
Uniwersytet Ekonomiczny w Katowicach
Subjects:
Bankruptcy
Classification
Imbalanced dataset
Machine learning
Preprocessing
Description:
This article applies selected machine learning algorithms and data preprocessing techniques used for classification on imbalanced sets to assess their effectiveness in predicting bankruptcy from data containing the financial ratios of business entities. The accuracy of bankruptcy forecasts on the original imbalanced data set, in which operating entities greatly outnumber bankrupt ones, was close to zero. The bankruptcy prediction accuracy of classifiers built on balanced sets was inversely proportional to the overall classification accuracy and ranged from 10% (at an overall accuracy of 93%) to 77% (at an overall accuracy of 49%). The gradient boosting and classification tree algorithms achieved better classification results than the artificial neural network. Classification on imbalanced sets exhibited a trade-off: either the accuracy of bankruptcy classification can be increased at the expense of an excess of objects classified as bankrupt, or the overall classification accuracy of the algorithm can be increased at the expense of lower accuracy in classifying bankruptcy itself.
Source:
Studia Ekonomiczne; 2020, 395; 66-79
2083-8611
Appears in:
Studia Ekonomiczne
Content provider:
Biblioteka Nauki
Article
Title:
An approach to reliability analysis of aircraft systems for a small dataset
Authors:
Okoro, Onyedikachi Chioma
Zaliskyi, Maksym
Serhii, Dmytriiev
Abule, Ibinabo
Links:
https://bibliotekanauki.pl/articles/27311335.pdf
Publication date:
2023
Publisher:
Politechnika Śląska. Wydawnictwo Politechniki Śląskiej
Subjects:
aircraft maintenance
reliability
small dataset
aircraft systems
Description:
A data-driven predictive aircraft maintenance approach typically results in lower maintenance costs by avoiding unnecessary preventive maintenance actions and reducing unexpected failures. Information provided by a reliability analysis of aircraft components and systems can improve an existing maintenance strategy and ensure an optimal maintenance task interval. For reliability work, the exponential distribution is typically used; however, this approach requires substantial amounts of data, which often may not be generated by aviation operations. Therefore, this study proposes a method for reliability analysis given a small dataset. Real-life historical data of an aircraft operating in Nigeria validate the proposed approach and prove its applicability.
Source:
Zeszyty Naukowe. Transport / Politechnika Śląska; 2023, 118; 207-217
0209-3324
2450-1549
Appears in:
Zeszyty Naukowe. Transport / Politechnika Śląska
Content provider:
Biblioteka Nauki
Article
Title:
Named-entity recognition for Hindi language using context pattern-based maximum entropy
Authors:
Jain, Arti
Yadav, Divakar
Arora, Anuja
Tayal, Devendra K.
Links:
https://bibliotekanauki.pl/articles/27312839.pdf
Publication date:
2022
Publisher:
Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie. Wydawnictwo AGH
Subjects:
context patterns
gazetteer lists
Hindi language
Kaggle dataset
maximum entropy
named-entity recognition
feature extension
Description:
This paper describes a named-entity-recognition (NER) system for the Hindi language that uses two methodologies: an existing baseline maximum entropy-based named-entity (BL-MENE) model, and the proposed context pattern-based MENE (CP-MENE) framework. BL-MENE utilizes several baseline features for the NER task but suffers from inaccurate named-entity (NE) boundary detection, misclassification errors, and the partial recognition of NEs due to certain missing essentials. However, the CP-MENE-based NER task incorporates extensive features and patterns that are set to overcome these problems. In fact, CP-MENE’s features include right-boundary, left-boundary, part-of-speech, synonym, gazetteer and relative pronoun features. CP-MENE formulates a kind of recursive relationship for extracting highly ranked NE patterns that are generated through regular expressions via Python code. Since the web content of the Hindi language is growing nowadays (especially in health care applications), this work is conducted on the Hindi health data (HHD) corpus (which is readily available from the Kaggle dataset). Our experiments were conducted on four NE categories; namely, Person (PER), Disease (DIS), Consumable (CNS), and Symptom (SMP).
Source:
Computer Science; 2022, 23 (1); 81-115
1508-2806
2300-7036
Appears in:
Computer Science
Content provider:
Biblioteka Nauki
Article