Subject: K-means method - OPAC catalogue of collections

Title:: A proposal of a new method of choosing starting points for k-means grouping
Propozycja nowej metody wyboru punktów startowych do grupowania metodą k-średnich
Authors:: Korzeniewski, Jerzy
Links:: https://bibliotekanauki.pl/articles/907035.pdf
Publication date:: 2008
Publisher:: Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Subjects:: cluster analysis
starting points
silhouette indices
k-means method
Description:: When grouping the elements of a set with the k-means method, it is crucial to choose the starting points properly; if they are chosen badly, the resulting grouping may be poor. The paper proposes a new method of choosing starting points that is based on the distance matrix only. Starting points are chosen so as to improve on the classical method of choosing points that are as far from one another as possible. The quality of grouping is assessed by means of silhouette indices and compared with the quality of grouping obtained with randomly chosen starting points and with the classical method (the maximum distance interval method). Sets from Euclidean spaces are generated with the CLUSTGEN software written by J. Milligan.
Source:: Acta Universitatis Lodziensis. Folia Oeconomica; 2008, 216
0208-6018
2353-7663
Appears in:: Acta Universitatis Lodziensis. Folia Oeconomica
Content provider:: Biblioteka Nauki
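
The abstract above refers to a classical way of choosing starting points that are as far from one another as possible, using the distance matrix only. The following is a minimal sketch of one common variant of that idea (a greedy "maximin" selection), not the new method proposed in the paper. The function name `farthest_point_starts` and the sample data are hypothetical and serve only as an illustration.

```python
def farthest_point_starts(dist, k):
    """Pick k mutually distant starting points from a square distance matrix.

    Greedy maximin heuristic (an assumption, not the paper's method):
    start with the two points that are farthest apart, then repeatedly add
    the point whose smallest distance to the chosen points is largest.
    Returns the indices of the chosen points.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    if n == 1:
        return [0]
    # The pair of points with the largest distance between them.
    i, j = max(
        ((a, b) for a in range(n) for b in range(a + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    chosen = [i, j][:k]
    while len(chosen) < k:
        candidates = [p for p in range(n) if p not in chosen]
        # Add the candidate that lies farthest from its nearest chosen point.
        chosen.append(max(candidates, key=lambda p: min(dist[p][c] for c in chosen)))
    return chosen


# Hypothetical one-dimensional data; only the distance matrix is passed on.
points = [0, 1, 2, 10, 11, 20]
dist = [[abs(a - b) for b in points] for a in points]
starts = farthest_point_starts(dist, 3)
print(starts)  # [0, 5, 3]
```

The selected indices would then serve as the initial cluster centres for k-means. Because the heuristic favours extreme points, it tends to pick outliers, which is one reason an improved selection rule may be worth investigating.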

Article

Title:: Spatial diversity of effectiveness of forms of professional activisation in Poland in years 2008-2014 by poviats
Authors:: Bieszk-Stolorz, Beata
Dmytrów, Krzysztof
Links:: https://bibliotekanauki.pl/articles/19090896.pdf
Publication date:: 2019
Publisher:: Instytut Badań Gospodarczych
Subjects:: registered unemployment
forms of professional activisation
cost and employment effectiveness
cluster analysis
k-means method
Description:: Research background: Because active labour market policy requires substantial resources, it is important to analyse the effectiveness of its instruments. With regard to unemployment, it is essential to identify the groups of persons threatened by long-term unemployment, to assess the impact of programmes on exit from unemployment, and to monitor the disbursement of funds. Purpose of the article: The goal of the article was to identify clusters of poviats in Poland with respect to the cost and employment effectiveness of basic forms of professional activisation in the years 2008-2014. Methods: The poviats were clustered by means of the k-means method. Variables were standardised and the number of clusters was determined by means of v-fold cross-validation. Findings & Value added: The analysis did not make it possible to identify unambiguously the areas in Poland that made better use of the funds allocated to activisation programmes. The poviats in central-eastern Poland were generally characterised by worse values of effectiveness, although the unemployment rate in these areas was relatively low. In contrast, the poviats in north-eastern Poland had a high unemployment rate and used the funds effectively. Assessing the effectiveness of forms of professional activisation is very important because the activities of poviat labour offices influence efforts to counteract unemployment.
Source:: Oeconomia Copernicana; 2019, 10, 1; 113-130
2083-1277
Appears in:: Oeconomia Copernicana
Content provider:: Biblioteka Nauki
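
The Methods part of the abstract above mentions standardising the variables before k-means clustering. The sketch below illustrates why that step matters when variables are on very different scales. It assumes z-score standardisation and a plain Lloyd-style k-means with fixed initial centres. The data, the variable meanings and the function names are hypothetical, and the v-fold cross-validation used in the article to choose the number of clusters is not reproduced here.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Column-wise z-scores: (x - column mean) / column standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero for constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def kmeans(rows, centres, max_iter=100):
    """Plain Lloyd iterations from the given initial centres."""
    for _ in range(max_iter):
        # Assign every row to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(len(centres)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centres[c])))
            for row in rows
        ]
        # Recompute each centre as the mean of its members.
        new_centres = []
        for c in range(len(centres)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            new_centres.append([mean(col) for col in zip(*members)] if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical data: [cost per participant, employment effectiveness in %].
data = [[9000, 78], [9500, 80], [10000, 82], [15000, 55], [15500, 57], [16000, 53]]
z = standardise(data)
labels, centres = kmeans(z, [z[0], z[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Without standardisation, the cost column (in the thousands) would dominate the Euclidean distances, and the percentage column would have almost no influence on the clusters.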

Article

Title:: Examining Similarities in Time Allocation Amongst European Countries
Authors:: Hozer-Koćmiel, Marta
Lis, Christian
Links:: https://bibliotekanauki.pl/articles/465764.pdf
Publication date:: 2016
Publisher:: Główny Urząd Statystyczny
Subjects:: time allocation
cluster analysis
k-means method
generalised distance measure GDM
interval taxonomic method TMI
HETUS survey
Description:: The aim of the article is to analyse the similarities between selected European countries in terms of time allocation. Time allocation is defined as the daily distribution of time among various activities. From an economic perspective, the most important categories are professional work time, domestic work time and leisure time. It has been shown that there are coherent groups of countries with a similar structure of time allocation. The taxonomic methods used to verify this thesis included cluster analysis, the k-means method, the generalised distance measure (GDM) and the interval taxonomic method (TMI). The analysis was performed on the basis of HETUS data.
Source:: Statistics in Transition new series; 2016, 17, 2; 217-330
1234-7655
Appears in:: Statistics in Transition new series
Content provider:: Biblioteka Nauki

Article

Title:: Use of R programming language for statistical analysis and cluster analysis of geochemical data
Wykorzystanie języka R do statystycznej analizy oraz analizy skupień dla danych geochemicznych
Authors:: Janiga, Marek
Links:: https://bibliotekanauki.pl/articles/31348311.pdf
Publication date:: 2023
Publisher:: Instytut Nafty i Gazu - Państwowy Instytut Badawczy
Subjects:: cluster analysis
k-means method
hierarchical method
natural gas composition
Description:: In petroleum geology, statistical methods are widely used in petrography, petrophysics, geochemistry, geomechanics, well log analysis and seismics, and cluster analysis is important for rock classification, that is, for delineating zones with certain properties, e.g. source or reservoir rocks. This paper presents the use of the R language for statistical analysis, including cluster analysis, of large sets of diverse geochemical data. Literature data on the chemical and isotopic composition of natural gases were used for the statistical analyses. Unsupervised machine learning algorithms were used to perform the cluster analysis, with two methods: k-means and hierarchical clustering. The results of k-means clustering can be illustrated with a two-dimensional plot (the fviz_cluster function in R). The dimensions of the plot are the result of principal component analysis (PCA), and each is a linear combination of the features (columns in the table). The result of hierarchical clustering is a plot called a dendrogram. The paper additionally presents box plots and histograms as well as a correlation matrix of Pearson correlation coefficients. All work was carried out in the R programming language. Used with the RStudio software, R is a very convenient and fast tool for statistical data analysis, and obtaining the above-mentioned plots, tables and data with it is quick and relatively easy. The results of the gas composition analyses appear to show little variation. Nevertheless, the k-means and hierarchical algorithms made it possible to group the geochemical data into clearly separable clusters. Both the isotopic composition values and the chemical composition make it possible to delineate groups that would not otherwise be noticeable.
Source:: Nafta-Gaz; 2023, 79, 9; 576-583
0867-8871
Appears in:: Nafta-Gaz
Content provider:: Biblioteka Nauki

Article

Title:: Analysis of data quoted on the Day-Ahead Market of TGE S.A. using Statistics and Machine Learning Toolbox
Authors:: Tchórzewski, Jerzy
Longota, Bartłomiej
Links:: https://bibliotekanauki.pl/articles/2201615.pdf
Publication date:: 2022
Publisher:: Uniwersytet Przyrodniczo-Humanistyczny w Siedlcach
Subjects:: artificial neural network
cluster analysis
Day-Ahead Market
k-means method
Matlab and Simulink environment
Statistics and Machine Learning Toolbox
Ward’s method
Description:: The publication presents the results of cluster analysis research carried out on data quoted on the Day-Ahead Market of TGE S.A. Two methods were used in the analysis: one hierarchical, known as Ward's method, and one non-hierarchical, the k-means method. The results are illustrated, among other things, by dendrograms, silhouette plots and cluster plots. Data on the volume and the volume-weighted average price of electricity were examined for various types of quotations: fixing 1, fixing 2 and continuous quotations. The research was carried out in the MATLAB and Simulink environment using the Statistics and Machine Learning Toolbox library. Selected results were interpreted.
Source:: Studia Informatica : systems and information technology; 2022, 2(27); 49-74
1731-2264
Appears in:: Studia Informatica : systems and information technology
Content provider:: Biblioteka Nauki
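
The silhouette graphs mentioned in the abstract above plot one silhouette value per observation. As a rough illustration of how such values are computed, here is a minimal Python sketch (the study itself used MATLAB). It assumes the standard definition s = (b - a) / max(a, b), where a is the mean distance to the other members of the observation's own cluster and b is the smallest mean distance to any other cluster. The function name and the data are hypothetical.

```python
def silhouette_values(dist, labels):
    """Per-observation silhouette values from a distance matrix and cluster labels.

    Assumes at least two clusters; observations in singleton clusters get 0.0
    by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette values need at least two clusters")

    values = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical one-dimensional data with two obvious groups.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
good = silhouette_values(dist, [0, 0, 1, 1])
bad = silhouette_values(dist, [0, 1, 0, 1])
print([round(v, 3) for v in good])  # [0.905, 0.895, 0.895, 0.905]
```

Values close to 1 indicate observations that sit well inside their cluster, while negative values (as in the deliberately poor labelling `bad`) suggest an observation would fit better in a neighbouring cluster.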

Article

Title:: Effectiveness of forms of professional activisation by voivodships
Efektywność form aktywizacji zawodowej w przekroju wojewódzkim
Authors:: Bieszk-Stolorz, Beata
Dmytrów, Krzysztof
Links:: https://bibliotekanauki.pl/articles/543962.pdf
Publication date:: 2018-12-28
Publisher:: Główny Urząd Statystyczny
Subjects:: registered unemployment
forms of professional activisation
cost and employment effectiveness
cluster analysis
k-means method
Description:: The objective of the article is to assess the diversity of voivodships with respect to the cost and employment effectiveness of basic forms of professional activisation implemented by powiat labour offices in the years 2008-2016. The data come from publications of the Ministry of Family, Labour and Social Policy. The k-means method was used for clustering. In the analysed period, the coefficients of cost effectiveness (except for a substantial decline in 2011) and of employment effectiveness showed an increasing trend. Three homogeneous groups of voivodships were obtained: the first consisted of voivodships with the most advantageous values of effectiveness, the second of those with average values, and the third of those with the least advantageous values.
Source:: Wiadomości Statystyczne. The Polish Statistician; 2018, 63, 12; 57-74
0043-518X
Appears in:: Wiadomości Statystyczne. The Polish Statistician
Content provider:: Biblioteka Nauki

Article
