Search results for the phrase "k-means ++", criterion: All fields


Title:
K-means is probabilistically poor
Authors:
Kłopotek, Mieczysław
Links:
https://bibliotekanauki.pl/articles/2201613.pdf
Publication date:
2022
Publisher:
Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Subjects:
k-means
clustering
probabilistic k-richness
Abstract:
Kleinberg introduced the concept of k-richness as a requirement for an algorithm to be a clustering algorithm. The most popular algorithm, k-means, does not fit this definition because of its probabilistic nature. Hence Ackerman et al. proposed the notion of probabilistic k-richness, claiming without proof that k-means has this property. This paper proves, by example, that the version of k-means with random initialization does not have the probabilistic k-richness property, thereby refuting Ackerman's claim.
Source:
Studia Informatica : systems and information technology; 2022, 2(27); 5-26
1731-2264
Appears in:
Studia Informatica : systems and information technology
Content provider:
Biblioteka Nauki
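The sensitivity to random initialization that the entry above turns on can be seen in a minimal sketch of Lloyd's k-means. This is an illustration only, not the paper's counterexample: the function names, the optional `init` argument, and the four-point data set are assumptions made for the example. Two different starting pairs lead the same algorithm to two different final partitions of the same data.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def kmeans(points, k, seed=0, init=None, max_iter=100):
    """Lloyd's k-means. Without `init`, k distinct data points are drawn at random."""
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move either
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = centroid(members)
    return labels, centers


def partition(labels):
    """Label-independent view of a clustering: a set of sets of point indices."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(i)
    return frozenset(frozenset(g) for g in groups.values())


# Corners of a 4 x 1 rectangle (hypothetical data).
pts = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
split_a = partition(kmeans(pts, 2, init=[pts[0], pts[1]])[0])  # both seeds on the left edge
split_b = partition(kmeans(pts, 2, init=[pts[0], pts[2]])[0])  # one seed on each short edge
print(split_a == split_b)
```

The two runs converge to different partitions (top/bottom versus left/right), so the comparison prints `False`. With random seeding, which of the two partitions is returned depends on the draw.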
Article
Title:
Inicjalizacja segmentacji k-means uwzględniająca rozkład gęstości pikseli
[K-means segmentation initialization that takes the pixel density distribution into account]
Authors:
Świta, R.
Suszyński, Z.
Links:
https://bibliotekanauki.pl/articles/118366.pdf
Publication date:
2014
Publisher:
Politechnika Koszalińska. Wydawnictwo Uczelniane
Subjects:
FA
KKZ
k-means
kmeans++
segmentacja
k-means ++
segmentation
High Density
Abstract:
This article presents a modification of the KKZ initialization of the k-means segmentation algorithm which, in addition to the mutual distances of the segment centers, takes into account the pixel density distribution. The density function of a pixel is the sum of the inverses of its distances to the other pixels, and it is estimated from the pixel's distance to the mean and from the variance of the pixel values. In the experiments, four different sequences of thermal images obtained by active thermography were segmented. Despite the additional calculations during initialization, the method showed faster convergence of the algorithm, with processing times very similar to those of the KKZ initialization but with a lower final segmentation error.
Source:
Zeszyty Naukowe Wydziału Elektroniki i Informatyki Politechniki Koszalińskiej; 2014, 6; 89-98
1897-7421
Appears in:
Zeszyty Naukowe Wydziału Elektroniki i Informatyki Politechniki Koszalińskiej
Content provider:
Biblioteka Nauki
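The two ingredients named in the entry above, KKZ seeding and an inverse-distance density, can be sketched separately as follows. This is a minimal sketch under stated assumptions: the function names and the `eps` guard are hypothetical, and the abstract does not say how the paper combines density with distance, so no combination is attempted here.

```python
import math


def kkz_init(points, k):
    """KKZ seeding: start from the point with the largest norm, then repeatedly
    add the point whose distance to its nearest chosen center is largest.
    Returns the indices of the chosen points."""
    n = len(points)
    first = max(range(n), key=lambda i: math.hypot(*points[i]))
    chosen = [first]
    # Distance from every point to its nearest chosen center so far.
    min_d = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        nxt = max(range(n), key=lambda i: min_d[i])
        chosen.append(nxt)
        min_d = [min(d, math.dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return chosen


def inverse_distance_density(points, eps=1e-9):
    """Density of each point as the sum of inverse distances to all other points.
    `eps` avoids division by zero for coincident points (an assumption of this sketch)."""
    return [
        sum(1.0 / (math.dist(p, q) + eps) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


# Hypothetical 2-D data: two tight groups and one point in between.
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (10, 11), (5, 5)]
seeds = kkz_init(pts, 3)
density = inverse_distance_density([(0.0,), (1.0,), (3.0,)])
```

For the data above, `kkz_init` picks index 4 first (largest norm), then index 0 (farthest from it), then index 5. Because KKZ always prefers the most distant point, it tends to select outliers, which is the usual motivation for weighting the choice by density.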
Article
Title:
Extending k-means with the description comes first approach
Authors:
Stefanowski, J.
Weiss, D.
Links:
https://bibliotekanauki.pl/articles/970926.pdf
Publication date:
2007
Publisher:
Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:
document clustering
cluster labels
k-means algorithm
information retrieval
Abstract:
This paper describes a technique for clustering large collections of short and medium length text documents such as press articles, news stories and the like. The technique called description comes first (DCF) consists of identification of related document clusters, selection of salient phrases relevant to these clusters and reallocation of documents matching the selected phrases to form final document groups. The advantages of this technique include more comprehensive cluster labels and clearer (more transparent) relationship between cluster labels and their content. We demonstrate the DCF by taking a standard k-means algorithm as a baseline and weaving DCF elements into it; the outcome is the descriptive k-means (DKM) algorithm. The paper goes through technical background explaining how to implement DKM efficiently and ends with the description of an experiment measuring clustering quality on a benchmark document collection 20-newsgroups. Short fragments of this paper appeared at the poster session of the RIAO 2007 conference, Pittsburgh, PA, USA (electronic proceedings only).
Source:
Control and Cybernetics; 2007, 36, 4; 1009-1035
0324-8569
Appears in:
Control and Cybernetics
Content provider:
Biblioteka Nauki
Article
Title:
K-Means and Fuzzy based Hybrid Clustering Algorithm for WSN
Authors:
Angadi, Basavaraj M.
Kakkasageri, Mahabaleshwar S.
Links:
https://bibliotekanauki.pl/articles/27311955.pdf
Publication date:
2023
Publisher:
Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects:
wireless sensor networks
cluster
K-Means algorithm
fuzzy logic
Abstract:
Wireless Sensor Networks (WSN) have attracted a lot of attention due to their widespread use in monitoring hostile environments and in critical surveillance and security applications. In these applications, the use of wireless terminals has also grown significantly. Grouping Sensor Nodes (SN) is called clustering. These sensor nodes are burdened by the exchange of messages caused by successive and recurring re-clustering, which results in power loss. Since most SNs are fitted with non-rechargeable batteries, researchers are currently concentrating their efforts on extending the longevity of these nodes. For battery-constrained WSNs, the clustering mechanism has emerged as a desirable subject since it is particularly good at conserving resources, especially energy, for network activities. This work addresses the problem of load balancing and Cluster Head (CH) selection within a cluster with minimum energy expenditure. We propose a hybrid method in which cluster formation is done using the unsupervised machine learning based k-means algorithm and CH selection uses a fuzzy-logic approach.
Source:
International Journal of Electronics and Telecommunications; 2023, 69, 4; 793-801
2300-1933
Appears in:
International Journal of Electronics and Telecommunications
Content provider:
Biblioteka Nauki
Article
Title:
DSMK-means “density-based split-and-Merge K-means clustering algorithm”
Authors:
Aldahdooh, R. T.
Ashour, W.
Links:
https://bibliotekanauki.pl/articles/91719.pdf
Publication date:
2013
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:
clustering
K-means
Density-based Split
Merge K-means clustering Algorithm
DSMK-means
clustering algorithm
Abstract:
Clustering is widely used to explore and understand large collections of data. The K-means clustering method is one of the most popular approaches due to its ease of use and simplicity of implementation. This paper introduces the Density-based Split-and-Merge K-means clustering Algorithm (DSMK-means), which is developed to address the stability problems of the standard K-means clustering algorithm and to improve clustering performance on datasets that contain clusters with different complex shapes and noise or outliers. Based on a set of many experiments, the paper concludes that the developed “DSMK-means” algorithms are more capable of finding high-accuracy results than other algorithms, especially as they can process datasets containing clusters with different shapes or densities, or those with outliers and noise.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2013, 3, 1; 51-71
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
A feasible k-means kernel trick under non-Euclidean feature space
Authors:
Kłopotek, Robert
Kłopotek, Mieczysław
Wierzchoń, Sławomir
Links:
https://bibliotekanauki.pl/articles/1838163.pdf
Publication date:
2020
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
kernel method
k-means
non-Euclidean feature space
Gower and Legendre theorem
Abstract:
This paper poses the question of whether or not the usage of the kernel trick is justified. We investigate it for the special case of its usage in the kernel k-means algorithm. Kernel-k-means is a clustering algorithm, allowing clustering data in a similar way to k-means when an embedding of data points into Euclidean space is not provided and instead a matrix of “distances” (dissimilarities) or similarities is available. The kernel trick allows us to by-pass the need of finding an embedding into Euclidean space. We show that the algorithm returns wrong results if the embedding actually does not exist. This means that the embedding must be found prior to the usage of the algorithm. If it is found, then the kernel trick is pointless. If it is not found, the distance matrix needs to be repaired. But the reparation methods require the construction of an embedding, which first makes the kernel trick pointless, because it is not needed, and second, the kernel-k-means may return different clusterings prior to repairing and after repairing so that the value of the clustering is questioned. In the paper, we identify a distance repairing method that produces the same clustering prior to its application and afterwards and does not need to be performed explicitly, so that the embedding does not need to be constructed explicitly. This renders the kernel trick applicable for kernel-k-means.
Source:
International Journal of Applied Mathematics and Computer Science; 2020, 30, 4; 703-715
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
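The way a k-means-style algorithm can work from a matrix of dissimilarities alone, as the entry above describes, rests on a standard identity: when `D` holds squared Euclidean distances, the squared distance from object i to the centroid of a cluster C equals the mean of `D[i][j]` over j in C minus the sum of `D[j][l]` over all j, l in C divided by 2|C|². The sketch below is an illustration of that identity, not the paper's repair method; the function name and both small matrices are assumptions made for the example.

```python
def sq_dist_to_centroid(D, i, cluster):
    """Squared distance from object i to the centroid of `cluster`, computed only
    from a matrix D of squared pairwise distances (no coordinates needed)."""
    m = len(cluster)
    to_members = sum(D[i][j] for j in cluster) / m
    within = sum(D[j][l] for j in cluster for l in cluster) / (2 * m * m)
    return to_members - within


# Squared distances between the points (0,0), (2,0), (0,2): an embedding exists.
D_euclid = [
    [0.0, 4.0, 4.0],
    [4.0, 0.0, 8.0],
    [4.0, 8.0, 0.0],
]
# Squared "distances" 1, 1, 16 (distances 1, 1, 4) violate the triangle
# inequality, so no Euclidean embedding exists.
D_broken = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 16.0],
    [1.0, 16.0, 0.0],
]

ok = sq_dist_to_centroid(D_euclid, 0, [1, 2])
bad = sq_dist_to_centroid(D_broken, 0, [1, 2])
```

In the Euclidean case the result is 2.0, which matches the squared distance from (0,0) to the centroid (1,1). For the second matrix the same formula returns -3.0, a negative "squared distance", which gives one concrete picture of why results become unreliable when no embedding exists.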
Article
Title:
Kernel K-Means clustering algorithm for identification of glaucoma in ophthalmology
Authors:
Stapor, K.
Bruckner, A.
Links:
https://bibliotekanauki.pl/articles/333803.pdf
Publication date:
2005
Publisher:
Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects:
grupowanie
segmentacja obrazu
clustering
image segmentation
kernel-based learning
Abstract:
This paper presents an improved version of the classification system for supporting glaucoma diagnosis in ophthalmology proposed in [4]. We propose a new segmentation step based on the kernel K-Means clustering algorithm, which enables better classification performance.
Source:
Journal of Medical Informatics & Technologies; 2005, 9; 167-172
1642-6037
Appears in:
Journal of Medical Informatics & Technologies
Content provider:
Biblioteka Nauki
Article
Title:
Segmentacja sekwencji obrazów metodą korelacyjną
Segmentation of the image sequence using the correlation method
Authors:
Świta, R.
Suszyński, Z.
Links:
https://bibliotekanauki.pl/articles/152568.pdf
Publication date:
2013
Publisher:
Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:
segmentacja
obrazy termiczne
korelacja
K-means
FCM
segmentation
thermal images
correlation
k-means
Abstract:
This paper presents a new method for the segmentation of thermal image sequences. Its aim is to divide the sequence into segments with different thermal properties. The described algorithm is based on measurements of the position and shape correlation of the segments in successive frames of the sequence. It is composed of several stages. The first stage consists of segmenting consecutive frames of the sequence (Fig. 2). The second stage is an analysis of the similarity of each segment in each frame with respect to all other segments of all frames, followed by synthesis of the intermediate segments (Fig. 4). The intermediate segments form the segmented output image using the depth buffer technique to resolve multiple pixel-to-segment assignments (Fig. 6). This method is a basis for the thermal analysis of solids, which results in discovering depth profiles of thermal properties for each area. The segmentation reduces the number of analyzed areas by a factor of up to several thousand, which creates real opportunities for the practical application of thermal tomography. The new algorithm has been compared with the K-means algorithm [2] and with FCM [6], which minimizes the sum of pixel value deviations from the centers of the segments they are assigned to, for all frames of the sequence (Tab. 1). The advantage of the correlation method is the automatic determination of the number of output segments in the image and a constant segmentation error as the number of processed frames increases.
Source:
Pomiary Automatyka Kontrola; 2013, R. 59, nr 7, 7; 680-683
0032-4140
Appears in:
Pomiary Automatyka Kontrola
Content provider:
Biblioteka Nauki
Article
Title:
Clustering of data represented by pairwise comparisons
Authors:
Dvoenko, Sergey
Links:
https://bibliotekanauki.pl/articles/2183479.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects:
clustering
k-means
distance
similarity
Abstract:
In this paper, experimental data, given in the form of pairwise comparisons, such as distances or similarities, are considered. Clustering algorithms for processing such data are developed based on the well-known k-means procedure. Relations to factor analysis are shown. The problems of improving clustering quality and of finding the proper number of clusters in the case of pairwise comparisons are considered. Illustrative examples are provided.
Source:
Control and Cybernetics; 2022, 51, 3; 343-387
0324-8569
Appears in:
Control and Cybernetics
Content provider:
Biblioteka Nauki
Article
Title:
An alternative extension of the k-means algorithm for clustering categorical data
Authors:
San, O. M.
Huynh, V. N.
Nakamori, Y.
Links:
https://bibliotekanauki.pl/articles/907406.pdf
Publication date:
2004
Publisher:
Uniwersytet Zielonogórski. Oficyna Wydawnicza
Subjects:
analiza skupień
dane kategoryczne
eksploracja danych
cluster analysis
categorical data
data mining
Abstract:
Most of the earlier work on clustering has focused on numerical data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. Recently, the problem of clustering categorical data has started drawing interest. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the same time, the fact that it works only on numerical data prohibits it from being used for clustering categorical data. The main contribution of this paper is to show how to apply the notion of "cluster centers" to a dataset of categorical objects and how to use this notion to formulate the clustering problem of categorical objects as a partitioning problem. Finally, a k-means-like algorithm for clustering categorical data is introduced. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely the soybean disease and nursery databases.
Source:
International Journal of Applied Mathematics and Computer Science; 2004, 14, 2; 241-247
1641-876X
2083-8492
Appears in:
International Journal of Applied Mathematics and Computer Science
Content provider:
Biblioteka Nauki
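One way to give a "cluster center" meaning for categorical objects, as the entry above discusses, is to store the relative frequency of every category per attribute and to score an object by how rarely its values occur in the cluster. The sketch below shows that idea under stated assumptions: the abstract does not define the paper's center or dissimilarity, so the function names, the frequency-based center, and the scoring rule are all hypothetical illustrations rather than the authors' definitions.

```python
from collections import Counter


def frequency_center(rows):
    """For each attribute, map every observed category to its relative frequency."""
    n = len(rows)
    return [{value: count / n for value, count in Counter(column).items()}
            for column in zip(*rows)]


def dissimilarity(row, center):
    """Sum over attributes of (1 - relative frequency of the row's category).
    A category never seen in the cluster contributes the maximum of 1."""
    return sum(1.0 - freqs.get(value, 0.0) for value, freqs in zip(row, center))


# Hypothetical cluster of three objects with two categorical attributes.
cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
center = frequency_center(cluster)
typical = dissimilarity(("red", "small"), center)
atypical = dissimilarity(("green", "large"), center)
```

Here `typical` is 2/3 (both values are the majority category) and `atypical` is 5/3 (an unseen colour plus a minority size). Plugging such a center and dissimilarity into the usual assign-then-update loop gives a k-means-like procedure for categorical data.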
Article
Title:
Przegląd technik grupowania danych i obszary zastosowań
[A review of data clustering techniques and their areas of application]
Authors:
Sala, Karolina
Links:
https://bibliotekanauki.pl/articles/2157869.pdf
Publication date:
2017
Publisher:
Instytut Studiów Międzynarodowych i Edukacji Humanum
Subjects:
cluster analysis
hierarchical clustering
k-means
Abstract:
The paper presents an overview of various clustering techniques used in data mining. Clustering is an unsupervised learning problem that is used to identify groups in a set of unlabeled data. Data is grouped by probability so that objects of the same group / cluster have similar properties / characteristics [1]. This article aims at exploring and comparing different clustering algorithms. Grouping is used in many areas, including machine learning, pattern recognition, image analysis, information retrieval.
Source:
Społeczeństwo i Edukacja. Międzynarodowe Studia Humanistyczne; 2017, 2(25); 141-145
1898-0171
Appears in:
Społeczeństwo i Edukacja. Międzynarodowe Studia Humanistyczne
Content provider:
Biblioteka Nauki
Article
Title:
Alarm Correlation in Mobile Telecommunications Networks based on k-means Cluster Analysis Method
Authors:
Maździarz, A.
Links:
https://bibliotekanauki.pl/articles/308715.pdf
Publication date:
2018
Publisher:
Instytut Łączności - Państwowy Instytut Badawczy
Subjects:
alarm correlation
alarm patterns
cluster analysis
mobile telecommunication network
root cause analysis
Abstract:
Event correlation and root cause analysis play a fundamental role in the process of troubleshooting all technical faults and malfunctions. An in-depth, complicated multiprotocol analysis can be greatly supported or even replaced by a troubleshooting methodology based on data analysis approaches. The mobile telecommunications domain has been experiencing rapid development recently. The introduction of new technologies and services, as well as a multivendor environment distributed across the same geographical area, creates a lot of challenges in network operation routines. Maintenance tasks have recently become more and more complicated and time-consuming, and require big data analyses to be performed. Most network maintenance activities are completed manually by experts using raw network management information available in the network management system via multiple applications and direct database queries. Under these circumstances, identification of network failures is a very difficult, if not impossible, task. This explains why effective yet simple tools and methods providing network operators with carefully selected, essential information are needed. Hence, this paper proposes an efficient approximate alarm correlation algorithm based on the k-means cluster analysis method.
Source:
Journal of Telecommunications and Information Technology; 2018, 2; 95-102
1509-4553
1899-8852
Appears in:
Journal of Telecommunications and Information Technology
Content provider:
Biblioteka Nauki
Article
Title:
Anomaly detection in a cutting tool by k-means clustering and support vector machines
Authors:
Lahrache, A.
Cocconcelli, M.
Rubini, R.
Links:
https://bibliotekanauki.pl/articles/328445.pdf
Publication date:
2017
Publisher:
Polska Akademia Nauk. Polskie Towarzystwo Diagnostyki Technicznej PAN
Subjects:
knife diagnostics
k-means
hierarchical clustering
support vector machines
diagnostyka
grupowanie hierarchiczne
Abstract:
This paper concerns the analysis of experimental data, verifying the applicability of signal analysis techniques for condition monitoring of a packaging machine. In particular, the activity focuses on the cutting process that divides a continuous flow of packaging paper into single packages. The cut is made by a steel knife driven by a hydraulic system. Currently, the knives are replaced frequently, causing frequent machine stops and consequent lost-production costs. The aim of this paper is to develop a diagnostic procedure to assess the wear condition of the blades, reducing the stops for maintenance. The packaging machine was equipped with a pressure sensor that monitors the hydraulic system driving the blade. Processing the pressure data comprises three main steps. First, scalar quantities that could be indicative of the condition of the knife are selected. Second, a clustering analysis is used to set up a threshold between unfaulted and faulted knives. Finally, a Support Vector Machine (SVM) model is applied to classify the technical condition of the knife during its lifetime.
Source:
Diagnostyka; 2017, 18, 3; 21-29
1641-6414
2449-5220
Appears in:
Diagnostyka
Content provider:
Biblioteka Nauki
Article
Title:
The number of clusters in hybrid predictive models: does it really matter?
Authors:
Łapczyński, Mariusz
Jefmański, Bartłomiej
Links:
https://bibliotekanauki.pl/articles/1046637.pdf
Publication date:
2020
Publisher:
Główny Urząd Statystyczny
Subjects:
hybrid predictive model
k-means algorithm
decision trees
Abstract:
For quite a long time, research studies have attempted to combine various analytical tools to build predictive models. It is possible to combine tools of the same type (ensemble models, committees) or tools of different types (hybrid models). Hybrid models are used in such areas as customer relationship management (CRM), web usage mining, medical sciences, petroleum geology and anomaly detection in computer networks. Our hybrid model was created as a sequential combination of a cluster analysis and decision trees. In the first step of the procedure, objects were grouped into clusters using the k-means algorithm. The second step involved building a decision tree model with a new independent variable that indicated which cluster the objects belonged to. The analysis was based on 14 data sets collected from publicly accessible repositories. The performance of the models was assessed with the use of measures derived from the confusion matrix, including the accuracy, precision, recall, F-measure, and the lift in the first and second decile. We tried to find a relationship between the number of clusters and the quality of hybrid predictive models. To our knowledge, similar studies have not been conducted yet. Our research demonstrates that in some cases building hybrid models can improve the performance of predictive models. It turned out that the models with the highest performance measures require building a relatively large number of clusters (from 9 to 15).
Source:
Przegląd Statystyczny; 2019, 66, 3; 228-238
0033-2372
Appears in:
Przegląd Statystyczny
Content provider:
Biblioteka Nauki
Article
Title:
Porównanie wydajności algorytmu k-means zaimplementowanego w języku X10 i środowisku C++/MPI
Performance comparison of the k-means algorithm implemented in the X10 programming language and the C++/MPI environment
Authors:
Wyrzykowski, R.
Karoń, T.
Links:
https://bibliotekanauki.pl/articles/91405.pdf
Publication date:
2016
Publisher:
Warszawska Wyższa Szkoła Informatyki
Subjects:
algorytm k-średnich
język programowania X10
środowisko C++/MPI
porównanie
k-means algorithm
X10 programming language
C++/MPI environment
comparison
Abstract:
This work describes the k-means algorithm and its implementation in the X10 programming language. The results are compared with an implementation of the same algorithm in the C++11 programming language using the MPI standard. The X10 implementation was found to be faster than the C++/MPI implementation when a larger number of processors is used. Additionally, the X10 code is about 59% shorter than the code for the C++/MPI combination.
Source:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki; 2016, 10, 14; 7-35
1896-396X
2082-8349
Appears in:
Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki
Content provider:
Biblioteka Nauki
Article
Title:
Initial Results of Nonhierarchical Cluster Methods Use for Low Flow Grouping
Authors:
Cupak, A.
Links:
https://bibliotekanauki.pl/articles/123583.pdf
Publication date:
2017
Publisher:
Polskie Towarzystwo Inżynierii Ekologicznej
Subjects:
low flow
K-means method
nonhierarchical cluster analysis
Abstract:
The paper examines the possibility of using a statistical method of data agglomeration, namely nonhierarchical cluster analysis, for low flow grouping. The study material comprised daily flows from the multi-year period 1963–1983 collected for 19 catchments located in the upper Vistula basin. Regions with the same flow were determined using nonhierarchical cluster analysis (K-means). The groups were characterized by low flow and by selected physiographic and meteorological features of the catchments. The procedure of assigning catchments to clusters started with two clusters and finished at five. With five clusters, the subsequent reassignment of catchments produced a cluster containing only one catchment. Further delineation of objects did not give objective results, so it was difficult to determine a clear criterion for assigning catchments to the clusters. The last step involved developing models reflecting correlation and regression relationships. The identified clusters comprised catchments similar in terms of unit runoff, watercourse length, mean precipitation, median altitude, mean catchment slope, watercourse staff gauge zero, area covered by coniferous forests, arable lands, and soils.
Source:
Journal of Ecological Engineering; 2017, 18, 2; 44-50
2299-8993
Appears in:
Journal of Ecological Engineering
Content provider:
Biblioteka Nauki
Article
Title:
Geodesic distances for clustering linked text data
Authors:
Tekir, S.
Mansmann, F.
Keimer, D.
Links:
https://bibliotekanauki.pl/articles/91737.pdf
Publication date:
2012
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:
clustering
geodesic distance
text data
k-means algorithm
cosine distance
k-harmonic means
microprecision values
Abstract:
The quality of a clustering depends not only on the chosen algorithm and its parameters, but also on the definition of the similarity of two respective objects in a dataset. Applications such as clustering of web documents are traditionally built either on textual similarity measures or on link information. Due to the incompatibility of these two information spaces, combining these two information sources in one distance measure is a challenging issue. In this paper, we thus propose a geodesic distance function that combines traditional similarity measures with link information. In particular, we test the effectiveness of geodesic distances as similarity measures under the space assumption of spherical geometry in a 0-sphere. Our proposed distance measure is thus a combination of the cosine distance of the term-document matrix and some curvature values in the geodesic distance formula. To estimate these curvature values, we calculate clustering coefficient values for every document from the link graph of the data set and increase their distinctiveness by means of a heuristic, as these clustering coefficient values are rough estimates of the curvatures. To evaluate our work, we perform clustering tests with the k-means algorithm on a subset of the English Wikipedia hyperlinked data set with both the traditional cosine distance and our proposed geodesic distance. Additionally, taking inspiration from the unified view of the performance functions of k-means and k-harmonic means, the min and the harmonic average of the cosine and geodesic distances are taken in order to construct alternate distance forms. The effectiveness of our approach is measured by computing microprecision values of the clusters based on the provided categorical information of each article.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2012, 2, 3; 247-258
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
A proposal of a new method of choosing starting points for k-means grouping
Propozycja nowej metody wyboru punktów startowych do grupowania metodą k-średnich
Authors:
Korzeniewski, Jerzy
Links:
https://bibliotekanauki.pl/articles/907035.pdf
Publication date:
2008
Publisher:
Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Subjects:
cluster analysis
starting points
silhouette indices
k-means method
Abstract:
When one groups the elements of a set with k-means, it is crucial to choose the starting points properly. If they are chosen incorrectly, one may arrive at badly grouped elements. In the paper a new method of choosing starting points is proposed. It is based on the distance matrix only. Starting points are chosen so as to improve on the classical method of choosing points which are as far from one another as possible. The quality of grouping is assessed by means of silhouette indices and compared with the quality of grouping obtained with randomly chosen starting points and with the classical maximum-distance method. Sets from Euclidean spaces are generated with the CLUSTGEN software written by J. Milligan.
Source:
Acta Universitatis Lodziensis. Folia Oeconomica; 2008, 216
0208-6018
2353-7663
Appears in:
Acta Universitatis Lodziensis. Folia Oeconomica
Content provider:
Biblioteka Nauki
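The silhouette index used in the entry above to judge grouping quality needs only a distance matrix and the cluster labels. The following is a minimal sketch of the standard definition, s(i) = (b - a) / max(a, b), where a is the mean distance from object i to the other members of its cluster and b is the smallest mean distance to any other cluster. The function name, the singleton convention, and the toy data are assumptions of this example, and at least two clusters are required.

```python
def silhouette_values(D, labels):
    """Silhouette value of every object, given a distance matrix D and labels.
    Requires at least two distinct labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            values.append(0.0)  # common convention for a one-element cluster
            continue
        a = sum(D[i][j] for j in own) / len(own)
        b = min(
            sum(D[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical 1-D data with two obvious groups.
xs = [0.0, 1.0, 10.0, 11.0]
D = [[abs(u - v) for v in xs] for u in xs]

good = silhouette_values(D, [0, 0, 1, 1])  # groups {0, 1} and {10, 11}
bad = silhouette_values(D, [0, 1, 0, 1])   # groups {0, 10} and {1, 11}
mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
```

The natural grouping yields a mean silhouette close to 0.9, while the mismatched grouping yields a negative mean, which is how the index separates a good set of starting points from a poor one once k-means has run.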
Article
Title:
Decision-making enhancement in a big data environment : application of the K-means algorithm to mixed data
Authors:
Koren, Oded
Hallin, Carina Antonia
Perel, Nir
Bendet, Dror
Links:
https://bibliotekanauki.pl/articles/91712.pdf
Publication date:
2019
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects:
big data
mixed data
hadoop
K-means
decision making
Abstract:
Big data research has become an important discipline in information systems research. However, the flood of data being generated on the Internet is increasingly unstructured and non-numeric in the form of images and texts. Thus, research indicates that there is an increasing need to develop more efficient algorithms for treating mixed data in big data for effective decision making. In this paper, we apply the classical K-means algorithm to both numeric and categorical attributes in big data platforms. We first present an algorithm that handles the problem of mixed data. We then use big data platforms to implement the algorithm, demonstrating its functionalities by applying the algorithm in a detailed case study. This provides us with a solid basis for performing more targeted profiling for decision making and research using big data. Consequently, the decision makers will be able to treat mixed data, numerical and categorical data, to explain and predict phenomena in the big data ecosystem. Our research includes a detailed end-to-end case study that presents an implementation of the suggested procedure. This demonstrates its capabilities and the advantages that allow it to improve the decision-making process by targeting organizations’ business requirements to a specific cluster[s]/profiles[s] based on the enhancement outcomes.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2019, 9, 4; 293-302
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
Data Mining Application in Air Transportation – the Case of Turkish Airlines
Authors:
Pisarek, Renata
Akpinar, Musab Talha
Hızıroglu, Abdulkadir
Links:
https://bibliotekanauki.pl/articles/504638.pdf
Publication date:
2017
Publisher:
Międzynarodowa Wyższa Szkoła Logistyki i Transportu
Topics:
data mining
K-means
airlines
air transport
Turkish Airlines
Description:
The paper presents an example of data mining techniques in the aviation industry, based on Turkish Airlines. The purpose of the paper is to present an application of data mining to selected operational data on international flight passenger baggage from 2015. The differences in passenger and flight profiles have been examined. First, a two-step approach was used to define the number of clusters. Second, K-means clustering was applied to divide the data into that number of clusters, representing the different areas of consumption. The results can contribute to higher efficiency in decision making regarding destination offer and fleet management.
Source:
Logistics and Transport; 2017, 36, 4; 79-88
1734-2015
Appears in:
Logistics and Transport
Content provider:
Biblioteka Nauki
Article
Title:
An Efficient Controller Placement Algorithm using Clustering in Software Defined Networks
Authors:
Jacob, Joshua
Shinde, Sumedha
Narayan, D. G.
Links:
https://bibliotekanauki.pl/articles/27312951.pdf
Publication date:
2023
Publisher:
Instytut Łączności - Państwowy Instytut Badawczy
Topics:
clustering
controller placement
PAM
K-means++
silhouette score
SDN
Description:
Software defined networking (SDN) is an emerging network paradigm that separates the control plane from data plane and ensures programmable network management. In SDN, the control plane is responsible for decision-making, while packet forwarding is handled by the data plane based on flow entries defined by the control plane. The placement of controllers is an important research issue that significantly impacts the performance of SDN. In this work, we utilize clustering techniques to group networks into multiple clusters and propose an algorithm for optimal controller placement within each cluster. The evaluation involves the use of the Mininet emulator with POX as the SDN controller. By employing the silhouette score, we determine the optimal number of controllers for various topologies. Additionally, to enhance network performance, we employ the meeting point algorithm to calculate the best location for placing the controller within each cluster. The proposed approach is compared with existing works in terms of throughput, delay, and jitter using six topologies from the Internet Zoo dataset.
Source:
Journal of Telecommunications and Information Technology; 2023, 4; 9-17
1509-4553
1899-8852
Appears in:
Journal of Telecommunications and Information Technology
Content provider:
Biblioteka Nauki
Article
Title:
Supporting investment decisions using data mining methods
Authors:
Sysiak, W.
Trajer, J.
Janaszek, M.
Links:
https://bibliotekanauki.pl/articles/93017.pdf
Publication date:
2009
Publisher:
Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Topics:
data mining
decision support
k-means clustering
neural networks
Description:
This paper presents an application of k-means clustering in the preliminary data analysis that preceded the choice of input variables for a system supporting decisions about stock purchase or sale on capital markets. The model forecasting prices of shares issued by companies in the food-processing sector listed on the Warsaw Stock Exchange was created in STATISTICA 7.1. It was based on neural modeling; it allowed the direction of change in securities values (increase, decrease) to be assessed and generated a quantitative forecast of their future price.
Source:
Studia Informatica : systems and information technology; 2009, 1(12); 67-78
1731-2264
Appears in:
Studia Informatica : systems and information technology
Content provider:
Biblioteka Nauki
Article
Title:
Implementation of Big Data Concept for Variability Mapping Control of Financing Assessment of Informal Sector Workers in Bogor City
Authors:
Salmah, Salmah
Andria, Fredi
Wahyudin, Irfan
Links:
https://bibliotekanauki.pl/articles/1065325.pdf
Publication date:
2019
Publisher:
Przedsiębiorstwo Wydawnictw Naukowych Darwin / Scientific Publishing House DARWIN
Topics:
Big Data
Cluster
Informal Worker Sector
K-Means Clustering
Description:
At present, health protection for the community is subject to risks and uncertainties. This requires a national health insurance program that can guarantee health care costs. One group of program participants is residents who work in the informal sector. This group is vulnerable, yet it also holds potential for the implementation of health insurance programs. However, the level of participation of informal sector workers is still low, so an analysis of the constraints affecting it is needed. This study aims to identify categories of informal sector workers and analyze the obstacles they face in becoming health insurance participants in the city of Bogor. The method used is the concept of big data with K-means clustering data mining techniques to group informal sector workers along with the constraints that exist in each of these groups. The results showed that there were 3 clusters with very low Social Security Administrator (BPJS) health ownership, namely cluster 1, cluster 3, and cluster 5. Each cluster had different constraints. Cluster 1 is constrained by the number of dependents, Cluster 3 is constrained on the gender side, being dominated by women, while Cluster 5 is constrained by low income. Each cluster has a different recommendation for resolving its obstacles, namely for cluster 1 by registering workers as JKN contribution recipient (PBI) participants, for cluster 2 by providing outreach to women, where outreach has so far focused only on men, and for cluster 5 by involving the community as a forum for the empowerment of informal sector workers.
Source:
World Scientific News; 2019, 135; 261-282
2392-2192
Appears in:
World Scientific News
Content provider:
Biblioteka Nauki
Article
Title:
Employment and economic entities in the Polish financial sector from 2005-2016
Zatrudnienie i podmioty ekonomiczne w polskim sektorze finansowym w latach 2005-2016
Authors:
Grzywińska-Rąpca, Małgorzata
Markowski, Lesław
Links:
https://bibliotekanauki.pl/articles/425058.pdf
Publication date:
2018
Publisher:
Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Topics:
financial sector
unemployment
economic entities
k-means method
trends
Description:
The article analyzes employment in the financial sector and entities conducting financial, insurance or other activities. The aim of this study is to examine employment in the financial sector at the level of provinces and registered entities of this sector using multidimensional methods of statistical analysis. The results of the classification indicate the geographical division of the country in terms of the number of financial and insurance companies. However, the high slope of the directional coefficient means a very strong, growing tendency for the Mazowieckie voivodship, characterized by a much slower trend for the Dolnośląskie, Pomorskie and Śląskie voivodships. In fact, for most of the provinces, trends indicate a statistically significant, negative development trend for the analyzed phenomenon from 2005-2016.
Source:
Econometrics. Ekonometria. Advances in Applied Data Analytics; 2018, 22, 1; 79-93
1507-3866
Appears in:
Econometrics. Ekonometria. Advances in Applied Data Analytics
Content provider:
Biblioteka Nauki
Article
Title:
Wielowymiarowa analiza porównawcza jako narzędzie oceny spółek deweloperskich notowanych na GPW
Multivariate comparative analysis as a tool to evaluate the development of companies listed on the Warsaw Stock Exchange
Authors:
Chrzanowska, Mariola
Zielińska-Sitkiewicz, Monika
Links:
https://bibliotekanauki.pl/articles/425133.pdf
Publication date:
2013
Publisher:
Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Topics:
Ward’s method
k-means method
Polish developer companies
Description:
The diversity and multiplicity of information associated with investment in the stock market can cause problems with the proper understanding of the analyzed phenomena. In particular it refers to small investors who invest directly in stocks. Therefore, evaluating the financial condition of listed companies is very important, hence the need to use methods that will simplify and thus make stock market analysis easier. This paper presents an attempt to apply the selected financial ratios for the classification of 17 real estate companies listed on the Warsaw Stock Exchange into groups characterized by a similar economic condition. In the study multidimensional comparative analysis was used, i.e. Ward’s method and the method of k-means. The analysis was carried out in the period 2010-2012. In the experiment it was proved that using Ward’s method could identify companies with the weakest condition.
Source:
Econometrics. Ekonometria. Advances in Applied Data Analytics; 2013, 4(42); 60-71
1507-3866
Appears in:
Econometrics. Ekonometria. Advances in Applied Data Analytics
Content provider:
Biblioteka Nauki
Article
