- Title:
-
The use of data mining models in solving the problem of imbalanced classes based on the example of an online marketing campaign
Wykorzystanie modeli data mining w rozwiązywaniu problemu niezrównoważonych klas na przykładzie kampanii marketingowych w Internecie
- Authors:
-
Łapczyński, Mariusz
Surma, Jerzy
- Links:
- https://bibliotekanauki.pl/articles/424980.pdf
- Publication date:
- 2015
- Publisher:
- Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu
- Topics:
-
C&RT
Random Forest
imbalanced class problem
online social network
banner ad campaign
- Description:
- While building predictive models in analytical CRM, researchers often encounter the problem of imbalanced classes (skewed distributions of the dependent variable), in which the number of observations belonging to one category of the dependent variable is much lower than the number belonging to the other category. This problem arises in areas such as churn analysis, customer acquisition models, and cross-selling and up-selling models. The purpose of the paper is to present a predictive model built to predict the response of Internet users to banner advertising. The dataset used in the study came from an online social network that offers advertisers banner campaigns targeting its users. The advertising campaign of a cosmetics company was carried out in the autumn of 2010 and was mainly targeted at young women. Each user of the service was described by 115 independent variables: 3 demographic variables (sex, age, education) and 112 variables describing the user's online activity. The problem of imbalanced classes arose during model building because few users clicked on the banner ad: of 81,000 cases, only 207 were positive reactions to the banner, or approximately 0.25% of all cases. Two popular data mining tools were used in the study: C&RT decision trees and Random Forest. The second goal of the paper is to compare the performance of the predictive models built with these two tools.
- Source:
-
Econometrics. Ekonometria. Advances in Applied Data Analytics; 2015, 3 (49); 9-19
1507-3866
- Appears in:
- Econometrics. Ekonometria. Advances in Applied Data Analytics
- Content provider:
- Biblioteka Nauki