Subject: K-means algorithm - OPAC collections catalog

Item 1.

Title: Methods for imputation of missing values and their influence on the results of segmentation research
Metody uzupełniania braków danych i ich wpływ na wyniki badań segmentacyjnych.
Authors: Gąsior, Marcin
Skowron, Łukasz
Links: https://bibliotekanauki.pl/articles/425241.pdf
Publication date: 2016
Publisher: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Subjects: missing values
cluster analysis
k-means algorithm
k-medoids algorithm
Description: Missing answers are a common problem in all types of research, especially in the social sciences. Hence a number of solutions have been developed, including the analysis of complete cases and imputation, which replaces a missing value with a value calculated according to one of several algorithms. This paper evaluates how the method adopted for handling missing answers influences the result of a segmentation conducted with cluster analysis. To this end we used a data set from an actual consumer study in which the cases with missing values were either deleted or filled in using various methods. Cluster analyses were then performed on those data sets, assuming both ordinal and ratio levels of measurement, and the grouping quality, as expressed by different indicators, was evaluated. The research showed the advantage of imputation over the analysis of complete cases; it also showed the validity of using more complex approaches than simple replacement with a mean or median value.
Source: Econometrics. Ekonometria. Advances in Applied Data Analytics; 2016, 4 (54); 61-71
1507-3866
Appears in: Econometrics. Ekonometria. Advances in Applied Data Analytics
Content provider: Biblioteka Nauki

Article
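
The contrast drawn in the description above, between analysing complete cases only and filling gaps with a mean or median, can be sketched as follows. This is a minimal illustration and not the authors' procedure; the sample answers and the function names are hypothetical.

```python
from statistics import mean, median

# Hypothetical survey answers on a 1-5 scale; None marks a missing answer.
answers = [4, None, 5, 3, None, 1, 5]


def complete_cases(values):
    """Complete-case analysis: drop every missing answer."""
    return [v for v in values if v is not None]


def impute(values, strategy):
    """Replace each missing answer with a statistic of the observed ones."""
    fill = strategy(complete_cases(values))
    return [fill if v is None else v for v in values]


observed = complete_cases(answers)        # five observed answers remain
mean_filled = impute(answers, mean)       # gaps filled with the mean, 3.6
median_filled = impute(answers, median)   # gaps filled with the median, 4
```

Complete-case analysis shrinks the sample from seven to five answers, while both imputation variants keep all seven positions available for a later clustering step.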

Item 2.

Title: The number of clusters in hybrid predictive models: does it really matter?
Authors: Łapczyński, Mariusz
Jefmański, Bartłomiej
Links: https://bibliotekanauki.pl/articles/1046637.pdf
Publication date: 2020
Publisher: Główny Urząd Statystyczny
Subjects: hybrid predictive model
k-means algorithm
decision trees
Description: For quite a long time, research studies have attempted to combine various analytical tools to build predictive models. It is possible to combine tools of the same type (ensemble models, committees) or tools of different types (hybrid models). Hybrid models are used in such areas as customer relationship management (CRM), web usage mining, medical sciences, petroleum geology and anomaly detection in computer networks. Our hybrid model was created as a sequential combination of a cluster analysis and decision trees. In the first step of the procedure, objects were grouped into clusters using the k-means algorithm. The second step involved building a decision tree model with a new independent variable that indicated which cluster the objects belonged to. The analysis was based on 14 data sets collected from publicly accessible repositories. The performance of the models was assessed with the use of measures derived from the confusion matrix, including accuracy, precision, recall, F-measure, and the lift in the first and second decile. We tried to find a relationship between the number of clusters and the quality of hybrid predictive models. To our knowledge, similar studies have not been conducted yet. Our research demonstrates that in some cases building hybrid models can improve the performance of predictive models. It turned out that the models with the highest performance measures require building a relatively large number of clusters (from 9 to 15).
Source: Przegląd Statystyczny; 2019, 66, 3; 228-238
0033-2372
Appears in: Przegląd Statystyczny
Content provider: Biblioteka Nauki

Article
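
The first step of the hybrid procedure described above, clustering with k-means and then adding the cluster membership as a new independent variable, can be sketched as follows. This is a minimal sketch under stated assumptions: the data points, the starting centroids and all function names are hypothetical, and the decision tree that would consume the augmented rows is left out.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd-style k-means started from the given centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign(points, centroids)


# Hypothetical two-feature objects forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
labels = kmeans(points, [points[0], points[2]])

# The cluster index becomes one extra independent variable per object.
augmented = [p + (lab,) for p, lab in zip(points, labels)]
```

The rows in `augmented` are what a decision tree learner would receive in the second step; the number of initial centroids is the parameter whose effect the article studies.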

Item 3.

Title: Extending k-means with the description comes first approach
Authors: Stefanowski, J.
Weiss, D.
Links: https://bibliotekanauki.pl/articles/970926.pdf
Publication date: 2007
Publisher: Polska Akademia Nauk. Instytut Badań Systemowych PAN
Subjects: document clustering
cluster labels
k-means algorithm
information retrieval
Description: This paper describes a technique for clustering large collections of short and medium-length text documents such as press articles, news stories and the like. The technique, called description comes first (DCF), consists of identifying related document clusters, selecting salient phrases relevant to these clusters, and reallocating documents that match the selected phrases to form the final document groups. The advantages of this technique include more comprehensive cluster labels and a clearer (more transparent) relationship between cluster labels and their content. We demonstrate DCF by taking a standard k-means algorithm as a baseline and weaving DCF elements into it; the outcome is the descriptive k-means (DKM) algorithm. The paper goes through the technical background, explaining how to implement DKM efficiently, and ends with the description of an experiment measuring clustering quality on the 20-newsgroups benchmark document collection. Short fragments of this paper appeared at the poster session of the RIAO 2007 conference, Pittsburgh, PA, USA (electronic proceedings only).
Source: Control and Cybernetics; 2007, 36, 4; 1009-1035
0324-8569
Appears in: Control and Cybernetics
Content provider: Biblioteka Nauki

Article

Item 4.

Title: K-Means and Fuzzy based Hybrid Clustering Algorithm for WSN
Authors: Angadi, Basavaraj M.
Kakkasageri, Mahabaleshwar S.
Links: https://bibliotekanauki.pl/articles/27311955.pdf
Publication date: 2023
Publisher: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects: wireless sensor networks
cluster
K-Means algorithm
fuzzy logic
Description: Wireless Sensor Networks (WSN) have attracted a lot of attention due to their widespread use in monitoring hostile environments and in critical surveillance and security applications. In these applications, the use of wireless terminals has also grown significantly. The grouping of Sensor Nodes (SN) is called clustering; these sensor nodes are burdened by the exchange of messages caused by successive and recurring re-clustering, which results in power loss. Since most SNs are fitted with non-rechargeable batteries, researchers are currently concentrating their efforts on extending the lifetime of these nodes. For battery-constrained WSNs, the clustering mechanism has emerged as a desirable subject since it is particularly good at conserving resources, especially energy, for network activities. The proposed work addresses the problem of load balancing and Cluster Head (CH) selection in a cluster with minimum energy expenditure. We propose a hybrid method in which cluster formation is done using the unsupervised machine-learning k-means algorithm and CH selection uses a fuzzy-logic approach.
Source: International Journal of Electronics and Telecommunications; 2023, 69, 4; 793-801
2300-1933
Appears in: International Journal of Electronics and Telecommunications
Content provider: Biblioteka Nauki

Article

Item 5.

Title: Geodesic distances for clustering linked text data
Authors: Tekir, S.
Mansmann, F.
Keimer, D.
Links: https://bibliotekanauki.pl/articles/91737.pdf
Publication date: 2012
Publisher: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects: clustering
geodesic distance
text data
k-means algorithm
cosine distance
k-harmonic means
microprecision values
Description: The quality of a clustering depends not only on the chosen algorithm and its parameters, but also on the definition of the similarity of two objects in a dataset. Applications such as the clustering of web documents are traditionally built either on textual similarity measures or on link information. Due to the incompatibility of these two information spaces, combining the two information sources in one distance measure is a challenging issue. In this paper, we therefore propose a geodesic distance function that combines traditional similarity measures with link information. In particular, we test the effectiveness of geodesic distances as similarity measures under the space assumption of spherical geometry in a 0-sphere. Our proposed distance measure is thus a combination of the cosine distance of the term-document matrix and some curvature values in the geodesic distance formula. To estimate these curvature values, we calculate clustering coefficient values for every document from the link graph of the data set and increase their distinctiveness by means of a heuristic, as these clustering coefficient values are rough estimates of the curvatures. To evaluate our work, we perform clustering tests with the k-means algorithm on a subset of the English Wikipedia hyperlinked data set with both the traditional cosine distance and our proposed geodesic distance. Additionally, taking inspiration from the unified view of the performance functions of k-means and k-harmonic means, the minimum and the harmonic average of the cosine and geodesic distances are taken in order to construct alternative distance forms. The effectiveness of our approach is measured by computing microprecision values of the clusters based on the provided categorical information of each article.
Source: Journal of Artificial Intelligence and Soft Computing Research; 2012, 2, 3; 247-258
2083-2567
2449-6499
Appears in: Journal of Artificial Intelligence and Soft Computing Research
Content provider: Biblioteka Nauki

Article
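
The description above mentions the cosine distance and two ways of combining it with a second distance: the minimum and the harmonic average. A minimal sketch of those three pieces follows. The geodesic distance formula itself is not given in the catalog record, so the second distance appears here only as a hypothetical number; the function names are likewise assumptions.

```python
import math


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def min_combination(d1, d2):
    """Combined distance taken as the smaller of the two distances."""
    return min(d1, d2)


def harmonic_combination(d1, d2):
    """Harmonic average of two non-negative distances."""
    if d1 + d2 == 0:
        return 0.0  # both distances are zero, so the combination is zero
    return 2.0 * d1 * d2 / (d1 + d2)


# Two hypothetical documents sharing no terms: cosine distance is 1.
d_cos = cosine_distance((1.0, 0.0), (0.0, 1.0))

# Placeholder for a link-based distance; not computed by the paper's formula.
d_other = 0.5

combined_min = min_combination(d_cos, d_other)
combined_harmonic = harmonic_combination(d_cos, d_other)
```

Both combinations are dominated by the smaller input, which is why they lend themselves to blending a text-based and a link-based distance.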

Item 6.

Title: Wykrywanie defektów z wykorzystaniem termografii aktywnej i algorytmu k-średnich
Detection of Defects Using Active Thermography and k-Means Algorithm
Authors: Dudzik, Sebastian
Links: https://bibliotekanauki.pl/articles/275938.pdf
Publication date: 2019
Publisher: Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów
Subjects: k-means algorithm
defect detection
active thermography
Description: The paper presents a new method of detecting material defects using active thermography. In order to increase the thermal contrast, the recorded sequence of thermograms was preprocessed using mathematical morphology methods. The k-means algorithm was used to detect defects. The work examined the impact of the distance measure used in the algorithm and of the selection of input data on the effectiveness of the method. The experiment was carried out on a sample made of carbon fiber reinforced composite (CFRP). The study found that the smallest defect detection errors with the described method are obtained for the squared Euclidean distance.
Source: Pomiary Automatyka Robotyka; 2019, 23, 3; 11-15
1427-9126
Appears in: Pomiary Automatyka Robotyka
Content provider: Biblioteka Nauki

Article

Item 7.

Title: The PROMETHEE II method in multi-criteria evaluation of cryptocurrency exchanges
Metoda PROMETHEE II w wielokryterialnej ocenie giełd kryptowalut
Authors: Kądziołka, K.
Links: https://bibliotekanauki.pl/articles/2048732.pdf
Publication date: 2021
Publisher: Akademia Bialska Nauk Stosowanych im. Jana Pawła II w Białej Podlaskiej
Subjects: k-means algorithm
hierarchical clustering
cryptocurrency exchanges
composite indicator
weighting scheme
PROMETHEE II
Description: Subject and purpose of work: The aim of this work is to present the application possibilities of the PROMETHEE II method for creating a ranking of cryptocurrency exchanges and to compare the results of multi-criteria and multi-dimensional analysis. A simulation method for determining the weights of criteria is proposed, which maximizes the similarity of the final ranking to the other ones. Materials and methods: The PROMETHEE II method and a taxonomic measure were used to create rankings of exchanges. Hierarchical clustering combined with the k-means algorithm was used to identify groups of exchanges with a similar level of net flow values. Publicly available data published on the Internet were analysed. Results: There was a high consistency in the ordering of exchanges between the multi-criteria and the multi-dimensional approach. Four groups of exchanges with a similar level of net flow values were identified. Exchanges in group one were characterized by the highest average net flows. Conclusions: The multi-criteria approach can be used as an alternative to the multi-dimensional assessment of cryptocurrency exchanges. The proposed simulation method for determining the weights of criteria can be helpful when the researcher has no information about the importance of the criteria.
Source: Economic and Regional Studies; 2021, 14, 2; 131-145
2083-3725
2451-182X
Appears in: Economic and Regional Studies
Content provider: Biblioteka Nauki

Article

Item 8.

Title: DSMK-means “density-based split-and-Merge K-means clustering algorithm”
Authors: Aldahdooh, R. T.
Ashour, W.
Links: https://bibliotekanauki.pl/articles/91719.pdf
Publication date: 2013
Publisher: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Subjects: clustering
K-means
Density-based Split-and-Merge K-means clustering Algorithm
DSMK-means
clustering algorithm
Description: Clustering is widely used to explore and understand large collections of data. The K-means clustering method is one of the most popular approaches because it is easy to use and simple to implement. This paper introduces the Density-based Split-and-Merge K-means clustering Algorithm (DSMK-means), which was developed to address the stability problems of the standard K-means clustering algorithm and to improve clustering performance on datasets that contain clusters with different complex shapes and noise or outliers. Based on a set of many experiments, the paper concludes that the developed algorithm, DSMK-means, achieves more accurate results than other algorithms, especially as it can process datasets containing clusters with different shapes or densities, or with outliers and noise.
Source: Journal of Artificial Intelligence and Soft Computing Research; 2013, 3, 1; 51-71
2083-2567
2449-6499
Appears in: Journal of Artificial Intelligence and Soft Computing Research
Content provider: Biblioteka Nauki

Article

Item 9.

Title: Recognition of pathological states in arterial blood by distance based techniques
Authors: Sokołowska, B.
Jóźwik, A.
Links: https://bibliotekanauki.pl/articles/333231.pdf
Publication date: 2003
Publisher: Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects: arterial blood gasometry
paralysis of diaphragm
k-NN rule
classifiers
k-means algorithm
Description: The paper presents the application of some distance-based pattern recognition algorithms to the recognition of pathological states in the respiratory system on the basis of arterial blood gasometry (features pH, pCO2, pO2). In our biological model two experimental situations were considered: 1) intact animals and 2) animals with the main inspiratory muscles paralyzed (after acute bilateral phrenicotomy). The comparison of the three features in the two conditions was the main goal of the present study. The analyzed biological data set contained 38 measurements in class 1 (muscle function preserved) and 36 in class 2 (after diaphragm paralysis). It was found that a significant part of the measurements could be correctly recognized as coming from the first or the second class on the basis of the gasometric measurements.
Source: Journal of Medical Informatics & Technologies; 2003, 5; MI23-30
1642-6037
Appears in: Journal of Medical Informatics & Technologies
Content provider: Biblioteka Nauki

Article

Item 10.

Title: Porównanie wydajności algorytmu k-means zaimplementowanego w języku X10 i środowisku C++/MPI
Performance comparison of the k-means algorithm implemented in the X10 programming language and the C++/MPI environment
Authors: Wyrzykowski, R.
Karoń, T.
Links: https://bibliotekanauki.pl/articles/91405.pdf
Publication date: 2016
Publisher: Warszawska Wyższa Szkoła Informatyki
Subjects: k-means algorithm
X10 programming language
C++/MPI environment
comparison
Description: This work describes the k-means algorithm and its implementation in the X10 programming language. The results are compared with an implementation of the same algorithm in the C++11 programming language using the MPI standard. It was found that the X10 implementation is faster on a large number of processors than the implementation in the C++/MPI environment. Additionally, the X10 code is about 59% shorter than the code for the C++/MPI combination.
Source: Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki; 2016, 10, 14; 7-35
1896-396X
2082-8349
Appears in: Zeszyty Naukowe Warszawskiej Wyższej Szkoły Informatyki
Content provider: Biblioteka Nauki

Article

Item 11.

Title: Novel Colour Clustering Method for Interlaced Multi-colored Dyed Yarn Woven Fabrics
Nowa metoda określania łączenia kolorów dla tkanin wykonanych z przeplatanych przędz barwionych
Authors: Zhang, J.
Xin, B.
Shen, C.
Fang, H.
Cao, Y.
Links: https://bibliotekanauki.pl/articles/232909.pdf
Publication date: 2015
Publisher: Sieć Badawcza Łukasiewicz - Instytut Biopolimerów i Włókien Chemicznych
Subjects: colour clustering
Lab colour space
K-means algorithm
dyed yarn woven fabrics
image analysis
interlaced dyed yarn
clustering algorithm
Description: In this paper, a novel colour clustering method based on the K-means clustering algorithm is developed for interlaced multi-coloured dyed yarn woven fabrics; it can be used to sort the colours of the dyed yarn for the development of a quick-response fabric system. First, fabric images captured by a flatbed scanner are decomposed into three sub-images in the red, green and blue channels. Second, median filters with different template sizes are selected to process the sub-images in the three colour channels separately. Third, the filtered images in the RGB colour space, reconstructed from the three sub-images, are converted into the Lab colour format. Finally, the colour segmentation and classification results are obtained in the Lab colour space using the improved K-means clustering algorithm. Our experimental results indicate that the proposed method works better, in terms of both accuracy and robustness, than the conventional method based on subjective manual operations with the aid of simple tools.
Source: Fibres & Textiles in Eastern Europe; 2015, 3 (111); 107-114
1230-3666
2300-7354
Appears in: Fibres & Textiles in Eastern Europe
Content provider: Biblioteka Nauki

Article

Item 12.

Title: Zastosowanie wybranych metod taksonomicznych i prospektywnych w polityce oraz strategicznym zarządzaniu publicznym
The use of selected taxonomic and foresight methods in policy making and strategic public management
Authors: Baron, Marcin
Ochojski, Artur
Polko, Adam
Warzecha, Katarzyna
Links: https://bibliotekanauki.pl/articles/593014.pdf
Publication date: 2015
Publisher: Uniwersytet Ekonomiczny w Katowicach
Subjects: Delphi method
K-means algorithm
Local development
Policy making
Public services
Regional development
Strategic management
Ward’s method
Description: The paper presents a methodological approach to complex analysis in support of policy making and strategic public management. Taxonomic methods are used to identify similar regions that can benefit from mutual learning and improve their performance in public service delivery. A Delphi study is proposed to better understand public service futures and to increase adaptation capacities at different levels of governance. The methods are illustrated with examples from the ADAPT2DC project (“New innovative solutions to adapt governance and management of public infrastructures to demographic change”) and with reflections on their application.
Source: Studia Ekonomiczne; 2015, 233; 56-72
2083-8611
Appears in: Studia Ekonomiczne
Content provider: Biblioteka Nauki

Article

Item 13.

Title: On a book Algorithms for data science by Brian Steele, John Chandler and Swarn Reddy
Authors: Szajowski, Krzysztof J.
Links: https://bibliotekanauki.pl/articles/747695.pdf
Publication date: 2017
Publisher: Polskie Towarzystwo Matematyczne
Subjects: histogram
centroid algorithm
Algorithms
Associative Statistics
Computation
Computing Similarity
Cluster Analysis
Correlation
Data Reduction
Data Mapping
Data Dictionary
Data Visualization
Forecasting
Hadoop
k-Means Algorithm
k-Nearest Neighbor Prediction
measures of dependence
data transformation
k-nearest neighbors algorithm
Description: The publication presented here is a comprehensive introduction to the most important fundamental principles, algorithms and data, together with the structures to which these principles and algorithms apply. The topics presented are an introduction to considerations in the field of computer science. However, it is algorithms that are the foundation of data analytics and the focal point of this textbook. Extracting knowledge from data requires methods and results from at least three fields: mathematics, statistics and computer science. The book contains clear and intuitive mathematical and statistical explanations of the individual topics, which makes the algorithms natural and transparent. The practice of data analysis, however, requires more than good scientific foundations, mathematical rigor and the perspective of statistical methodology. The problems that generate data vary enormously, and only the most basic algorithms can be applied to them without adaptation. Programming fluency and experience with real problems are essential. The reader is guided through algorithmic topics using Python and R, on the basis of real problems and analyses of the data those problems generate. Much of the material in the book is also accessible to readers without knowledge of advanced methodology. This makes the book suitable as a guide for a one- or two-semester course in data analytics for upper-level students of mathematics, statistics and computer science. Because the required background is not extensive, students who have taken a course in probability or statistics, know the basics of algebra and calculus, and have completed a programming course will have no difficulty; the text is also well suited to self-study by graduates of science programs. The core material is well illustrated with extensive examples drawn from real problems. The website associated with the book supports the reader with the data used in the book and with presentations of selected parts of the material. I am convinced that the subject of the book is a new field of science.
The book under review gives a comprehensive presentation of data science algorithms; in practical data analytics this means uniting fundamental principles, algorithms, and data. Algorithms are the keystone of data analytics and the focal point of this textbook. Data science, as the authors claim, has been a discipline since 2001. Informally, however, it was practiced before that date (cf. Cleveland (2001)). A crucial role was played by the graphical presentation of data as a visualization of the knowledge hidden in the data. The discipline covers data mining as a tool or an important topic. The escalating demand for insights into big data requires a fundamentally new approach to architecture, tools, and practices. This is why the term data science is useful. It underscores the centrality of data in the investigation, because data are a store of potential value in the field of action. The label science invokes certain very real concepts, such as the notion of public knowledge and peer review. From this point of view, data science is not a new idea. It is part of a continuum of serious thinking that dates back hundreds of years. A good example of a result of data science is Benford's law (see Arno Berger and Theodore P. Hill (2015, 2017)). In an effort to identify some of the best-known algorithms widely used in the data mining community, the IEEE International Conference on Data Mining (ICDM) identified the top 10 algorithms in data mining for presentation at ICDM '06 in Hong Kong. The panel was to announce the top 10 algorithms and discuss the impact of, and further research on, each of them in 2006. In the present book, clear and intuitive explanations of the mathematical and statistical foundations make the algorithms transparent. Most of the algorithms announced by IEEE in 2006 are included. But practical data analytics requires more than just the foundations. Problems and data are enormously variable, and only the most elementary of algorithms can be used without modification. Programming fluency and experience with real and challenging data are indispensable, and so the reader is immersed in Python and R and in real data analysis. By the end of the book, the reader will have gained the ability to adapt algorithms to new problems and carry out innovative analysis.
Source: Mathematica Applicanda; 2017, 45, 2
1730-2668
2299-4009
Appears in: Mathematica Applicanda
Content provider: Biblioteka Nauki

Article

Item 14.

Title: Geometric characteristics of Iraq’s raster topographic maps used for automatic updating the road network
Authors: Abdallah, R.
Links: https://bibliotekanauki.pl/articles/100558.pdf
Publication date: 2015
Publisher: Uniwersytet Rolniczy im. Hugona Kołłątaja w Krakowie
Subjects: maps update
road networks segmentation
topographic raster map
tracking algorithm
scanning algorithm
methods of image binarization
k-means method
map
topographic map
update
Description: This paper is devoted to the problem of road network extraction from raster images. The task of road network extraction is formulated in general form. An approach to road map extraction is proposed that can be applied to topographic map updating; it is based on image clustering by the k-means method and on a scanning algorithm for extracting road network fragments. The road map description is formed as a set of linear fragments with known parameters. These linear fragments are created by merging smaller parts. Experiments were carried out on maps of 10 Iraqi cities. The results show an average extraction precision of 86% (in comparison with a human expert).
Source: Geomatics, Landmanagement and Landscape; 2015, 3; 7-18
2300-1496
Appears in: Geomatics, Landmanagement and Landscape
Content provider: Biblioteka Nauki

Article

Item 15.

Title: Lung cancer detection using an integration of fuzzy K-Means clustering and deep learning techniques for CT lung images
Authors: Prasad, J. Maruthi Nagendra
Chakravarty, S.
Krishna, M. Vamsi
Links: https://bibliotekanauki.pl/articles/2173683.pdf
Publication date: 2022
Publisher: Polska Akademia Nauk. Czytelnia Czasopism PAN
Subjects: fuzzy K-means
artificial neural networks
SVM
support vector machine
crow search optimization algorithm
Description: Computer-aided detection systems are used to provide a second opinion during lung cancer diagnosis. For early-stage detection and treatment, the false-positive reduction stage also plays a vital role. The main aim of this research is to propose a method for lung cancer segmentation. In recent years, lung cancer detection and tumor segmentation have been considered among the most important steps in surgical planning and medication preparation. It is very difficult for researchers to detect the tumor area in CT (computed tomography) images. The proposed system segments the lungs and classifies the images as normal or abnormal, and consists of two phases. The first phase is made up of several stages: pre-processing, feature extraction, feature selection, classification and, finally, segmentation of the tumor. The input CT image passes through the pre-processing stage, where noise is removed; texture features are then extracted from the pre-processed image; in the next stage features are selected using the crow search optimization algorithm; an artificial neural network is then used to separate normal lung images from abnormal ones. Finally, abnormal images are processed with the fuzzy K-means algorithm to segment the tumors. In the second phase, an SVM classifier is used to reduce false positives. The proposed system delivers an accuracy of 96%, a specificity of 100% and a sensitivity of 99%, and it reduces false positives. Experimental results show that the system outperforms many other systems in the literature in terms of sensitivity, specificity, and accuracy. There is a great tradeoff between effectiveness and efficiency, and the proposed system also saves computation time. The work shows that the proposed system, formed by integrating fuzzy K-means clustering with deep learning techniques, is simple yet powerful: it is effective in reducing false positives, segments tumors, performs classification, delivers better performance than other strategies in the literature, and gives accurate decisions when compared with a human doctor’s decision.
Source: Bulletin of the Polish Academy of Sciences. Technical Sciences; 2022, 70, 3; art. no. e139006
0239-7528
Appears in: Bulletin of the Polish Academy of Sciences. Technical Sciences
Content provider: Biblioteka Nauki

Article
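
The description above reports accuracy, specificity and sensitivity. For readers unfamiliar with these measures, a minimal sketch of their standard definitions from confusion-matrix counts follows. The counts below are hypothetical and are not taken from the article.

```python
def sensitivity(tp, fn):
    """Share of actual positives that were detected (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that were correctly rejected."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for 100 images: 50 abnormal and 50 normal.
tp, fn = 45, 5    # abnormal images found / missed
tn, fp = 40, 10   # normal images passed / wrongly flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
acc = accuracy(tp, tn, fp, fn)
```

Reducing false positives, the goal of the second phase described above, raises specificity by lowering `fp` for a fixed number of normal images.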

Information

You are searching for the phrase "K-means algorithm" by criterion: Subject
