Search results for the phrase "k-means ++" by criterion: Topic


Title:
Inicjalizacja segmentacji k-means uwzględniająca rozkład gęstości pikseli
[K-means segmentation initialization taking pixel density distribution into account]
Authors:
Świta, R.
Suszyński, Z.
Links:
https://bibliotekanauki.pl/articles/118366.pdf
Publication date:
2014
Publisher:
Politechnika Koszalińska. Wydawnictwo Uczelniane
Topics:
FA
KKZ
k-means
kmeans++
k-means ++
segmentation
High Density
Description:
This article presents a modification of the KKZ initialization of the k-means segmentation algorithm which, in addition to the mutual distances between segment centers, takes into account the pixel density distribution. The pixel density function is the sum of the inverses of a pixel's distances to the other pixels, and it is estimated from the pixel's distance to the mean and from the variance of the pixel values. In the experiments, four different sequences of thermal images obtained by active thermography were segmented. Despite the additional calculations during initialization, the method showed faster convergence of the algorithm, with processing times very similar to those of the KKZ initialization but with a lower final segmentation error.
Source:
Zeszyty Naukowe Wydziału Elektroniki i Informatyki Politechniki Koszalińskiej; 2014, 6; 89-98
1897-7421
Appears in:
Zeszyty Naukowe Wydziału Elektroniki i Informatyki Politechniki Koszalińskiej
Content provider:
Biblioteka Nauki
Article
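The idea in the Świta and Suszyński abstract above (KKZ initialization extended with a pixel density term) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how distance and density are combined, so the product used here is a guess; the smoothing constant `eps`, the 1-D pixel values, and the function names `pixel_density` and `kkz_density_init` are likewise hypothetical. The estimation of density from the mean and variance mentioned in the abstract is not reproduced.

```python
def pixel_density(values, eps=1.0):
    """Density of each value: sum of inverse distances to all other values.

    eps keeps the terms finite for duplicate values (an assumption of this sketch).
    """
    return [
        sum(1.0 / (abs(v - w) + eps) for j, w in enumerate(values) if j != i)
        for i, v in enumerate(values)
    ]


def kkz_density_init(values, k, eps=1.0):
    """Pick k initial centers from 1-D pixel values."""
    dens = pixel_density(values, eps)
    # Plain KKZ starts from the point with the largest norm.
    centers = [max(values, key=abs)]
    while len(centers) < k:
        best_value, best_score = None, -1.0
        for v, rho in zip(values, dens):
            # Plain KKZ score: distance to the nearest center chosen so far.
            nearest = min(abs(v - c) for c in centers)
            # Assumption: weight that distance by the density of the candidate.
            score = nearest * rho
            if score > best_score:
                best_value, best_score = v, score
        centers.append(best_value)
    return centers


values = [0.0, 1.0, 2.0, 50.0, 51.0, 52.0, 100.0, 101.0, 102.0]
print(kkz_density_init(values, 3))  # [102.0, 1.0, 51.0]
```

On this toy data the density weighting pulls the second and third centers toward the middle of each group (1.0 and 51.0) rather than toward the extreme values that a purely distance-based choice would favor.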
Title:
Segmentacja sekwencji obrazów metodą korelacyjną
Segmentation of the image sequence using the correlation method
Authors:
Świta, R.
Suszyński, Z.
Links:
https://bibliotekanauki.pl/articles/152568.pdf
Publication date:
2013
Publisher:
Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Topics:
segmentation
thermal images
correlation
K-means
FCM
Description:
This paper presents a new method for segmenting thermal image sequences. Its aim is to divide the sequence into segments with different thermal properties. The described algorithm is based on measuring the correlation of the position and shape of the segments in successive frames of the sequence. It is composed of several stages. The first stage consists of segmenting consecutive frames of the sequence (Fig. 2). The second stage analyzes the similarity of each segment in each frame with respect to all other segments of all frames and synthesizes the intermediate segments (Fig. 4). The intermediate segments form the segmented output image, using the depth buffer technique to resolve multiple pixel-to-segment assignments (Fig. 6). This method is a basis for the thermal analysis of solids, which yields depth profiles of thermal properties for each area. The segmentation reduces the number of analyzed areas by a factor of up to several thousand, which creates real opportunities for the practical application of thermal tomography. The new algorithm has been compared with the K-means algorithm [2] and FCM [6], which minimize the sum of pixel value deviations from the centers of the segments they are assigned to, for all frames of the sequence (Tab. 1). The advantages of the correlation method are automatic determination of the number of output segments in the image and a constant segmentation error as the number of processed frames increases.
Source:
Pomiary Automatyka Kontrola; 2013, Vol. 59, No. 7, 7; 680-683
0032-4140
Appears in:
Pomiary Automatyka Kontrola
Content provider:
Biblioteka Nauki
Article
Title:
DSMK-means “Density-based Split-and-Merge K-means clustering algorithm”
Authors:
Aldahdooh, R. T.
Ashour, W.
Links:
https://bibliotekanauki.pl/articles/91719.pdf
Publication date:
2013
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Topics:
clustering
K-means
Density-based Split-and-Merge K-means clustering Algorithm
DSMK-means
clustering algorithm
Description:
Clustering is widely used to explore and understand large collections of data. The K-means clustering method is one of the most popular approaches because it is easy to use and simple to implement. This paper introduces the Density-based Split-and-Merge K-means clustering Algorithm (DSMK-means), which was developed to address the stability problems of the standard K-means clustering algorithm and to improve clustering performance on datasets that contain clusters with different complex shapes, noise, or outliers. Based on a set of many experiments, the paper concludes that the developed “DSMK-means” algorithms are better able to achieve high-accuracy results than other algorithms, especially because they can process datasets containing clusters with different shapes or densities, or with outliers and noise.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2013, 3, 1; 51-71
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
Clustering of data represented by pairwise comparisons
Authors:
Dvoenko, Sergey
Links:
https://bibliotekanauki.pl/articles/2183479.pdf
Publication date:
2022
Publisher:
Polska Akademia Nauk. Instytut Badań Systemowych PAN
Topics:
clustering
k-means
distance
similarity
Description:
In this paper, experimental data, given in the form of pairwise comparisons, such as distances or similarities, are considered. Clustering algorithms for processing such data are developed based on the well-known k-means procedure. Relations to factor analysis are shown. The problems of improving clustering quality and of finding the proper number of clusters in the case of pairwise comparisons are considered. Illustrative examples are provided.
Source:
Control and Cybernetics; 2022, 51, 3; 343-387
0324-8569
Appears in:
Control and Cybernetics
Content provider:
Biblioteka Nauki
Article
Title:
K-means is probabilistically poor
Authors:
Kłopotek, Mieczysław
Links:
https://bibliotekanauki.pl/articles/2201613.pdf
Publication date:
2022
Publisher:
Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Topics:
k-means
clustering
probabilistic k-richness
Description:
Kleinberg introduced the concept of k-richness as a requirement for an algorithm to be a clustering algorithm. The most popular algorithm, k-means, does not fit this definition because of its probabilistic nature. Hence Ackerman et al. proposed the notion of probabilistic k-richness, claiming without proof that k-means has this property. This paper proves, by example, that the version of k-means with random initialization does not have the property of probabilistic k-richness, thus refuting Ackerman's claim.
Source:
Studia Informatica : systems and information technology; 2022, 2(27); 5-26
1731-2264
Appears in:
Studia Informatica : systems and information technology
Content provider:
Biblioteka Nauki
Article
Title:
Przegląd technik grupowania danych i obszary zastosowań
[Overview of data clustering techniques and areas of application]
Authors:
Sala, Karolina
Links:
https://bibliotekanauki.pl/articles/2157869.pdf
Publication date:
2017
Publisher:
Instytut Studiów Międzynarodowych i Edukacji Humanum
Topics:
cluster analysis
hierarchical clustering
k-means
Description:
The paper presents an overview of various clustering techniques used in data mining. Clustering is an unsupervised learning problem that is used to identify groups in a set of unlabeled data. Data is grouped by probability so that objects of the same group / cluster have similar properties / characteristics [1]. This article aims at exploring and comparing different clustering algorithms. Grouping is used in many areas, including machine learning, pattern recognition, image analysis, information retrieval.
Source:
Społeczeństwo i Edukacja. Międzynarodowe Studia Humanistyczne; 2017, 2(25); 141-145
1898-0171
Appears in:
Społeczeństwo i Edukacja. Międzynarodowe Studia Humanistyczne
Content provider:
Biblioteka Nauki
Article
Title:
Geodesic distances for clustering linked text data
Authors:
Tekir, S.
Mansmann, F.
Keimer, D.
Links:
https://bibliotekanauki.pl/articles/91737.pdf
Publication date:
2012
Publisher:
Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Topics:
clustering
geodesic distance
text data
k-means algorithm
cosine distance
k-harmonic means
microprecision values
Description:
The quality of a clustering depends not only on the chosen algorithm and its parameters, but also on how the similarity of two objects in a dataset is defined. Applications such as the clustering of web documents are traditionally built either on textual similarity measures or on link information. Because these two information spaces are incompatible, combining the two information sources in one distance measure is a challenging issue. In this paper, we therefore propose a geodesic distance function that combines traditional similarity measures with link information. In particular, we test the effectiveness of geodesic distances as similarity measures under the space assumption of spherical geometry in a 0-sphere. Our proposed distance measure is thus a combination of the cosine distance of the term-document matrix and some curvature values in the geodesic distance formula. To estimate these curvature values, we calculate clustering coefficient values for every document from the link graph of the data set and increase their distinctiveness by means of a heuristic, as these clustering coefficient values are rough estimates of the curvatures. To evaluate our work, we perform clustering tests with the k-means algorithm on a subset of the English Wikipedia hyperlinked data set with both the traditional cosine distance and our proposed geodesic distance. Additionally, taking inspiration from the unified view of the performance functions of k-means and k-harmonic means, the minimum and the harmonic average of the cosine and geodesic distances are taken in order to construct alternate distance forms. The effectiveness of our approach is measured by computing microprecision values of the clusters based on the provided categorical information of each article.
Source:
Journal of Artificial Intelligence and Soft Computing Research; 2012, 2, 3; 247-258
2083-2567
2449-6499
Appears in:
Journal of Artificial Intelligence and Soft Computing Research
Content provider:
Biblioteka Nauki
Article
Title:
The number of clusters in hybrid predictive models: does it really matter?
Authors:
Łapczyński, Mariusz
Jefmański, Bartłomiej
Links:
https://bibliotekanauki.pl/articles/1046637.pdf
Publication date:
2020
Publisher:
Główny Urząd Statystyczny
Topics:
hybrid predictive model
k-means algorithm
decision trees
Description:
For quite a long time, research studies have attempted to combine various analytical tools to build predictive models. It is possible to combine tools of the same type (ensemble models, committees) or tools of different types (hybrid models). Hybrid models are used in such areas as customer relationship management (CRM), web usage mining, medical sciences, petroleum geology and anomaly detection in computer networks. Our hybrid model was created as a sequential combination of a cluster analysis and decision trees. In the first step of the procedure, objects were grouped into clusters using the k-means algorithm. The second step involved building a decision tree model with a new independent variable that indicated which cluster the objects belonged to. The analysis was based on 14 data sets collected from publicly accessible repositories. The performance of the models was assessed with the use of measures derived from the confusion matrix, including the accuracy, precision, recall, F-measure, and the lift in the first and second decile. We tried to find a relationship between the number of clusters and the quality of hybrid predictive models. To our knowledge, similar studies have not been conducted yet. Our research demonstrates that in some cases building hybrid models can improve the performance of predictive models. It turned out that the models with the highest performance measures require building a relatively large number of clusters (from 9 to 15).
Source:
Przegląd Statystyczny; 2019, 66, 3; 228-238
0033-2372
Appears in:
Przegląd Statystyczny
Content provider:
Biblioteka Nauki
Article
Title:
Initial Results of Nonhierarchical Cluster Methods Use for Low Flow Grouping
Authors:
Cupak, A.
Links:
https://bibliotekanauki.pl/articles/123583.pdf
Publication date:
2017
Publisher:
Polskie Towarzystwo Inżynierii Ekologicznej
Topics:
low flow
K-means method
nonhierarchical cluster analysis
Description:
The paper examines the possibility of using a statistical data agglomeration method, nonhierarchical cluster analysis, for grouping low flows. The study material comprised daily flows from the multi-year period 1963–1983 collected for 19 catchments located in the upper Vistula basin. Regions with the same flow were determined using nonhierarchical cluster analysis (K-means). Groups were characterized by low flow and by selected physiographic and meteorological features of the catchments. The assignment of catchments to clusters started with two clusters and ended with five. With five clusters, further moving and assigning of catchments produced a cluster containing only one catchment. Further delineation of objects did not yield objective results, so it was difficult to determine a clear criterion for assigning catchments to clusters. The last step involved developing models reflecting correlation and regression relationships. The identified clusters comprised catchments similar in terms of unit runoff, watercourse length, mean precipitation, median altitude, mean catchment slope, watercourse staff gauge zero, area covered by coniferous forests, arable lands, and soils.
Source:
Journal of Ecological Engineering; 2017, 18, 2; 44-50
2299-8993
Appears in:
Journal of Ecological Engineering
Content provider:
Biblioteka Nauki
Article
Title:
Methods for imputation of missing values and their influence on the results of segmentation research
Metody uzupełniania braków danych i ich wpływ na wyniki badań segmentacyjnych.
Authors:
Gąsior, Marcin
Skowron, Łukasz
Links:
https://bibliotekanauki.pl/articles/425241.pdf
Publication date:
2016
Publisher:
Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Topics:
missing values
cluster analysis
k-means algorithm
k-medoids algorithm
Description:
Missing answers are a common problem in all types of research, especially in the social sciences. A number of solutions have therefore been developed, including complete-case analysis and imputation, which replaces a missing value with a value calculated by one of several algorithms. This paper evaluates how the method adopted for handling missing answers influences the result of segmentation conducted with cluster analysis. To this end, we used a data set from an actual consumer study in which the cases with missing values were either deleted or filled in using various methods. Cluster analyses were then performed on those data sets, assuming both ordinal and ratio levels of measurement, and the grouping quality, as expressed by different indicators, was evaluated. The research demonstrated the advantage of imputation over complete-case analysis, and it also confirmed the validity of using more complex approaches than simple replacement with a mean or median value.
Source:
Econometrics. Ekonometria. Advances in Applied Data Analytics; 2016, 4 (54); 61-71
1507-3866
Appears in:
Econometrics. Ekonometria. Advances in Applied Data Analytics
Content provider:
Biblioteka Nauki
Article
Title:
Development of Data-mining Technique for Seismic Vulnerability Assessment
Authors:
Wojcik, Waldemar
Karmenova, Markhaba
Smailova, Saule
Tlebaldinova, Aizhan
Belbeubaev, Alisher
Links:
https://bibliotekanauki.pl/articles/1844631.pdf
Publication date:
2021
Publisher:
Polska Akademia Nauk. Czytelnia Czasopism PAN
Topics:
data analysis
seismic assessment
clustering
h-means
k-means
random forest
Description:
Assessing the seismic vulnerability of urban infrastructure is a pressing problem, since the damage caused by earthquakes is significant. Despite the complexity of such tasks, today’s machine learning methods allow the use of “fast” methods for assessing seismic vulnerability. The article proposes a methodology, based on classification and clustering methods, for assessing the characteristics of typical urban objects that affect their seismic resistance. For the analysis, we use the kmeans and hkmeans clustering methods, with the Euclidean distance as the measure of proximity. The optimal number of clusters is determined using the Elbow method. A decision-making model for the seismic resistance of an urban object is presented, and the variables with the greatest impact on that resistance are identified. The study shows that the clustering results coincide with expert estimates, and that the characteristics of typical urban objects can be determined by data modeling with clustering algorithms.
Source:
International Journal of Electronics and Telecommunications; 2021, 67, 2; 261-266
2300-1933
Appears in:
International Journal of Electronics and Telecommunications
Content provider:
Biblioteka Nauki
Article
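The Elbow method mentioned in the abstract above can be illustrated with a small sketch: run k-means for increasing k, record the within-cluster sum of squared Euclidean distances (WCSS), and look for the k after which the curve flattens. Everything below is an assumption made for illustration, not the setup of the seismic study: the 1-D toy data, the deterministic quantile-based starting centers, and the helper name `kmeans_1d`.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on 1-D data; returns (centers, WCSS)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start from evenly spaced quantiles.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to the nearest center (squared Euclidean distance).
            nearest = min(range(k), key=lambda c: (v - centers[c]) ** 2)
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    wcss = sum(min((v - c) ** 2 for c in centers) for v in values)
    return centers, wcss


values = [0.0, 1.0, 2.0, 50.0, 51.0, 52.0, 100.0, 101.0, 102.0]
for k in range(1, 5):
    _, wcss = kmeans_1d(values, k)
    print(k, wcss)
```

On this toy data the WCSS falls from 15006.0 (k = 1) to 3756.0 (k = 2) and 6.0 (k = 3), then only to 4.5 for k = 4, so the elbow sits at k = 3. Reading the elbow off the curve remains a judgment call; the sketch does not automate it.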
Title:
Data Mining Application in Air Transportation – the Case of Turkish Airlines
Authors:
Pisarek, Renata
Akpinar, Musab Talha
Hızıroglu, Abdulkadir
Links:
https://bibliotekanauki.pl/articles/504638.pdf
Publication date:
2017
Publisher:
Międzynarodowa Wyższa Szkoła Logistyki i Transportu
Topics:
data mining
K-means
airlines
air transport
Turkish Airlines
Description:
The paper illustrates data mining techniques in the aviation industry using the example of Turkish Airlines. Its purpose is to present an application of data mining to selected operational data on international flight passenger baggage from 2015. Differences in passenger and flight profiles were examined. First, a two-step approach was used to determine the number of clusters. Second, K-means clustering was applied to divide the data into that number of clusters, representing the different areas of consumption. The results can contribute to more efficient decision making regarding destination offers and fleet management.
Source:
Logistics and Transport; 2017, 36, 4; 79-88
1734-2015
Appears in:
Logistics and Transport
Content provider:
Biblioteka Nauki
Article
Title:
A proposal of a new method of choosing starting points for k-means grouping
Propozycja nowej metody wyboru punktów startowych do grupowania metodą k-średnich
Authors:
Korzeniewski, Jerzy
Links:
https://bibliotekanauki.pl/articles/907035.pdf
Publication date:
2008
Publisher:
Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Topics:
cluster analysis
starting points
silhouette indices
k-means method
Description:
When elements of a set are grouped by the k-means method, choosing the starting points properly is crucial. If they are chosen poorly, the resulting grouping may be poor. This paper proposes a new method of choosing starting points that is based on the distance matrix only. Starting points are chosen so as to improve on the classical method, which chooses points that are as far from one another as possible. The quality of grouping is assessed by means of silhouette indices and compared with the quality of grouping obtained with randomly chosen starting points and with the classical maximum-distance method. Sets from Euclidean spaces are generated with the CLUSTGEN software written by J. Milligan.
Source:
Acta Universitatis Lodziensis. Folia Oeconomica; 2008, 216
0208-6018
2353-7663
Appears in:
Acta Universitatis Lodziensis. Folia Oeconomica
Content provider:
Biblioteka Nauki
Article
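The Korzeniewski abstract above assesses grouping quality with silhouette indices and works from the distance matrix only. A minimal sketch of the standard mean silhouette index computed from a precomputed distance matrix is given below. It is not taken from the paper: the function name `mean_silhouette`, the toy points, and the convention of scoring singleton clusters as 0 are assumptions of this illustration, and at least two clusters are required.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette index from a square distance matrix and cluster labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        total += (b - a) / max(a, b)
    return total / len(labels)


points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
print(mean_silhouette(dist, [0, 0, 1, 1]))  # natural grouping, close to 0.9
print(mean_silhouette(dist, [0, 1, 0, 1]))  # poor grouping, negative
```

A higher value indicates tighter, better separated groups, which is why the index can be used to compare groupings obtained from different starting points.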
Title:
Implementation of Big Data Concept for Variability Mapping Control of Financing Assessment of Informal Sector Workers in Bogor City
Authors:
Salmah, Salmah
Andria, Fredi
Wahyudin, Irfan
Links:
https://bibliotekanauki.pl/articles/1065325.pdf
Publication date:
2019
Publisher:
Przedsiębiorstwo Wydawnictw Naukowych Darwin / Scientific Publishing House DARWIN
Topics:
Big Data
Cluster
Informal Worker Sector
K-Means Clustering
Description:
At present, risks and uncertainties arise in protecting the health of the community. This requires a national health insurance program that can guarantee health care costs. One group of program participants is residents who work in the informal sector. This group is vulnerable, yet it also holds potential for the implementation of health insurance programs. However, the level of participation of informal sector workers is still low, so an analysis of the constraints affecting it is needed. This study aims to identify categories of informal sector workers and to analyze the obstacles they face in becoming health insurance participants in the city of Bogor. The method used is the concept of big data with K-means clustering data mining techniques to group informal sector workers along with the constraints that exist in each of these groups. The results showed that there were 3 clusters with very low Social Security Administrator (BPJS) health ownership, namely cluster 1, cluster 3, and cluster 5. Each cluster had different constraints. Cluster 1 has constraints on the number of dependents it has, Cluster 3 has constraints on the gender side, being dominated by women, while Cluster 5 has constraints on the low-income side. Each cluster has a different recommendation for resolving its obstacles, namely for cluster 1 by registering workers as JKN contribution recipient (PBI) participants, for cluster 2 by extending to women the outreach that has so far focused only on men, and for cluster 5 by involving the community as a forum for the empowerment of informal sector workers.
Source:
World Scientific News; 2019, 135; 261-282
2392-2192
Appears in:
World Scientific News
Content provider:
Biblioteka Nauki
Article
Title:
Extending k-means with the description comes first approach
Authors:
Stefanowski, J.
Weiss, D.
Links:
https://bibliotekanauki.pl/articles/970926.pdf
Publication date:
2007
Publisher:
Polska Akademia Nauk. Instytut Badań Systemowych PAN
Topics:
document clustering
cluster labels
k-means algorithm
information retrieval
Description:
This paper describes a technique for clustering large collections of short and medium length text documents such as press articles, news stories and the like. The technique called description comes first (DCF) consists of identification of related document clusters, selection of salient phrases relevant to these clusters and reallocation of documents matching the selected phrases to form final document groups. The advantages of this technique include more comprehensive cluster labels and clearer (more transparent) relationship between cluster labels and their content. We demonstrate the DCF by taking a standard k-means algorithm as a baseline and weaving DCF elements into it; the outcome is the descriptive k-means (DKM) algorithm. The paper goes through technical background explaining how to implement DKM efficiently and ends with the description of an experiment measuring clustering quality on a benchmark document collection 20-newsgroups. Short fragments of this paper appeared at the poster session of the RIAO 2007 conference, Pittsburgh, PA, USA (electronic proceedings only).
Source:
Control and Cybernetics; 2007, 36, 4; 1009-1035
0324-8569
Appears in:
Control and Cybernetics
Content provider:
Biblioteka Nauki
Article
