Subject: Imputation - OPAC collections catalog

Item 1.

Title:: Podstawy matematyczne technik imputacyjnych
Basic mathematical imputation techniques
Математические основы импутационных методов
Authors:: Wesołowski, Jacek
Tarczyński, Jakub
Links:: https://bibliotekanauki.pl/articles/542245.pdf
Publication date:: 2016-09
Publisher:: Główny Urząd Statystyczny
Subjects:: imputation
multiple imputation
imputation estimator
Rubin estimator
mean imputation
hot-deck imputation
regression imputation
Description:: The article presents the basics of imputation methodology (including multiple imputation), focusing on its mathematical background. We analyze the case in which the observations in the original sample are independent, identically distributed random variables and nonresponse is generated by a random mechanism that is independent of the observations. In particular, we point out problems that arise when the standard Rubin estimator of the variance of the multiple imputation estimator is used, and we indicate a possible improvement of this popular estimator. The starting point of the analysis is the case in which nonresponse is governed by a deterministic mechanism.
Source:: Wiadomości Statystyczne. The Polish Statistician; 2016, 9; 7-54
0043-518X
Appears in:: Wiadomości Statystyczne. The Polish Statistician
Content provider:: Biblioteka Nauki

Article
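
The description above refers to the standard Rubin estimator of the variance of a multiple imputation estimator. As a rough illustration of that standard rule only (not the improved estimator the article proposes, which the catalog entry does not specify), the sketch below pools m completed-data estimates. The function name `rubin_pool` and the input numbers are hypothetical.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool m completed-data results with Rubin's standard combining rules."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # Rubin's total variance
    return q_bar, t


# Hypothetical results from m = 3 imputed data sets (illustrative numbers only).
q_bar, t = rubin_pool([10.0, 12.0, 11.0], [4.0, 5.0, 3.0])
```

The total variance adds the average within-imputation variance to the between-imputation variance inflated by the factor 1 + 1/m, which accounts for using a finite number of imputations.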

Item 2.

Title:: Bias Reduction of Finite Population Imputation by Kernel Methods
Authors:: Pettersson, Nicklas
Links:: https://bibliotekanauki.pl/articles/465881.pdf
Publication date:: 2013
Publisher:: Główny Urząd Statystyczny
Subjects:: Bayesian bootstrap
boundary and nonresponse bias
missing data
multiple imputation
Pólya urn models
real donor imputation
Description:: Missing data is a nuisance in statistics. Real donor imputation can be used with item nonresponse. A pool of donor units with similar values on auxiliary variables is matched to each unit with missing values. The missing value is then replaced by a copy of the corresponding observed value from a randomly drawn donor. Such methods can to some extent protect against nonresponse bias, but the bias also depends on the estimator and the nature of the data. We adopt techniques from kernel estimation to combat this bias. Motivated by Pólya urn sampling, we sequentially update the set of potential donors with units already imputed, and use multiple imputations via the Bayesian bootstrap to account for imputation uncertainty. Simulations with a single auxiliary variable show that our imputation method performs almost as well as competing methods with linear data, but better when the data are nonlinear, especially with large samples.
Source:: Statistics in Transition new series; 2013, 14, 1; 139-160
1234-7655
Appears in:: Statistics in Transition new series
Content provider:: Biblioteka Nauki

Article
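
The description above outlines basic real donor imputation: match a pool of donors with similar auxiliary values to each unit with a missing value, then copy the observed value of a randomly drawn donor. A minimal sketch of that basic step follows, assuming a single auxiliary variable and a nearest-neighbour donor pool. It does not implement the kernel weighting, Pólya urn updating or Bayesian bootstrap from the paper; the function name `donor_impute`, the pool size `k` and the data are hypothetical.

```python
import random


def donor_impute(x, y, k=3, seed=0):
    """Fill None entries of y with values copied from randomly drawn donors.

    x: auxiliary values, observed for every unit
    y: study values, None where the item is missing
    k: size of the donor pool (hypothetical tuning parameter)
    """
    rng = random.Random(seed)
    donors = [i for i, v in enumerate(y) if v is not None]
    if not donors:
        raise ValueError("no observed values to donate")
    completed = list(y)
    for i, v in enumerate(y):
        if v is None:
            # Donor pool: the k observed units closest to unit i on x.
            pool = sorted(donors, key=lambda j: abs(x[j] - x[i]))[:k]
            completed[i] = y[rng.choice(pool)]
    return completed


# Illustrative data: two clusters on x, one missing y value in each.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [5.0, None, 6.0, 20.0, None, 22.0]
completed = donor_impute(x, y, k=2)
```

Because every imputed value is a copy of an actually observed value, the completed data never contain values outside the observed range, which is the defining property of real donor methods.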

Item 3.

Title:: Review of methods for data sets with missing values and practical applications
Authors:: Korczyński, Adam
Links:: https://bibliotekanauki.pl/articles/433946.pdf
Publication date:: 2014
Publisher:: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
Subjects:: missing data pattern
missing data mechanism
complete-case analysis
available-case analysis
single imputation
likelihood-based methods
multiple imputation
weighting methods
Description:: The aim of this paper is to review the traditional methods (complete-case analysis, available-case analysis, single imputation) and current methods (likelihood-based methods, multiple imputation, weighting methods) for handling missing data and to assess their usefulness in statistical research. The paper provides the terminology and a description of the traditional and current methods and algorithms used in the analysis of incomplete data sets. The methods are assessed in terms of the statistical properties of their estimators. An example is provided for the multiple imputation method. The review indicates that current methods outperform traditional ones in terms of bias reduction, precision and efficiency of estimation.
Source:: Śląski Przegląd Statystyczny; 2014, 12(18); 83-104
1644-6739
Appears in:: Śląski Przegląd Statystyczny
Content provider:: Biblioteka Nauki

Article

Item 4.

Title:: Hybrid multiple imputation in a large scale complex survey
Authors:: Razzak, Humera
Heumann, Christian
Links:: https://bibliotekanauki.pl/articles/1186925.pdf
Publication date:: 2019-12-10
Publisher:: Główny Urząd Statystyczny
Subjects:: complex surveys
high-dimensional data
missing data
multiple imputation
Description:: Large-scale complex surveys typically contain a large number of variables measured on an even larger number of respondents. Missing data is a common problem in such surveys. Since most of the variables in a survey are usually categorical, multiple imputation requires robust methods for modelling high-dimensional categorical data distributions. This paper introduces the 3-stage Hybrid Multiple Imputation (HMI) approach, computationally efficient and easy to implement, to impute complex survey data sets that contain both continuous and categorical variables. The proposed HMI approach applies sequential regression MI techniques to impute the continuous variables, using information from the categorical variables, which have already been imputed by a non-parametric Bayesian MI approach. The proposed approach seems to be a good alternative to the existing approaches, frequently yielding lower root mean square errors, empirical standard errors and standard errors than the others. The HMI method has proven to be markedly superior to the existing MI methods in terms of computational efficiency. The authors illustrate repeated sampling properties of the hybrid approach using simulated data. The results are also illustrated with child data from the Multiple Indicator Cluster Survey (MICS) in Punjab 2014.
Source:: Statistics in Transition new series; 2019, 20, 4; 33-58
1234-7655
Appears in:: Statistics in Transition new series
Content provider:: Biblioteka Nauki

Article

Item 5.

Title:: Comparison of Selected Multiple Imputation Methods for Continuous Variables – Preliminary Simulation Study Results
Porównanie wybranych metod imputacji wielokrotnej dla zmiennych ilościowych – wstępne wyniki badań symulacyjnych
Authors:: Misztal, Małgorzata Aleksandra
Links:: https://bibliotekanauki.pl/articles/656755.pdf
Publication date:: 2018
Publisher:: Uniwersytet Łódzki. Wydawnictwo Uniwersytetu Łódzkiego
Subjects:: incomplete data
multiple imputation
principal component analysis
missForest
Description:: The problem of incomplete data and its implications for drawing valid conclusions from statistical analyses is not tied to any particular scientific domain; it arises in economics, sociology, education, behavioural sciences and medicine. Almost all standard statistical methods presume that every object has information on every variable included in the analysis, and the typical approach to missing data is simply to delete them. However, this leads to unreliable and biased results and is not recommended in the literature. The state-of-the-art technique for handling missing data is multiple imputation. The paper considers several selected multiple imputation methods, with special attention paid to principal component analysis (PCA) as an imputation method. The goal of the study was to assess the quality of PCA-based imputation compared with two other multiple imputation techniques: multivariate imputation by chained equations (MICE) and missForest. The comparison was made by artificially generating missing data, with different proportions (10–50%) and mechanisms, in 10 complete data sets from the UCI repository of machine learning databases. The missing values were then imputed with MICE, missForest and the PCA-based method (MIPCA). The normalised root mean square error (NRMSE) was used as the measure of imputation accuracy. On the basis of the conducted analyses, missForest can be recommended as the multiple imputation method providing the lowest imputation error for all types of missingness. PCA-based imputation does not perform well in terms of accuracy.
Source:: Acta Universitatis Lodziensis. Folia Oeconomica; 2018, 6, 339; 73-98
0208-6018
2353-7663
Appears in:: Acta Universitatis Lodziensis. Folia Oeconomica
Content provider:: Biblioteka Nauki

Article
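
The description above uses the normalised root mean square error (NRMSE) to score imputation accuracy in a simulation where the true values are known. The catalog entry does not state the exact normalisation, so the sketch below assumes one common convention: the mean squared error over the artificially masked entries divided by the sample variance of the true values at those entries. The function name `nrmse` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def nrmse(true_values, imputed_values):
    """NRMSE over masked entries; 0 means the imputations are exact."""
    if len(true_values) != len(imputed_values) or len(true_values) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mse = mean((t - i) ** 2 for t, i in zip(true_values, imputed_values))
    # Assumed normalisation: sample variance of the true masked values.
    return sqrt(mse / variance(true_values))


# Illustrative values at four artificially masked positions.
truth = [1.0, 2.0, 3.0, 4.0]
score = nrmse(truth, [1.0, 2.0, 3.0, 6.0])
perfect = nrmse(truth, truth)
```

Under this convention a score near 1 means the imputations are about as inaccurate as replacing each value with noise of the same spread as the data, so lower values indicate better imputation.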
