Wealth inequality remains a significant challenge in Kenya, exacerbated by the limitations of traditional wealth measurement methods. This study develops and evaluates an ensemble wealth index model that combines Random Forest (RF) and Multilayer Perceptron (MLP) algorithms to improve prediction accuracy, using socio-economic data from the 2019 Kenya Population and Housing Census (KPHC). The models were assessed on accuracy, balanced accuracy, precision, sensitivity (recall), specificity and ROC-AUC. The Random Forest model, configured with 500 trees and two variables tried at each split, achieved 34.3% accuracy, 58.54% balanced accuracy and a 65.98% out-of-bag error, and performed best on the “Poorest” class, with a 13.7% class error. It further recorded 41.13% precision, 34.27% recall, 83.17% specificity and a 67.31% AUC. The MLP model, using sigmoid activation in the hidden layers and softmax in the output layer, achieved 33.4% accuracy, 57.7% balanced accuracy, 30.6% precision, 32.54% recall, 82.86% specificity and an AUC of 69.12%. The RF-MLP ensemble model marginally outperformed the individual models on accuracy (34.4%) and specificity (83.2%), with 37.42% precision, 34.05% recall and an AUC of 68.55%. Despite modest overall accuracies, the ensemble model showed enhanced balanced accuracy and specificity, particularly in the extreme wealth categories. However, classification of the middle and poor wealth levels remains challenging due to feature overlap and class imbalance.
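The abstract does not say how the RF and MLP outputs are combined or how the multiclass metrics are averaged. As a minimal sketch only, the Python below assumes equal-weight soft voting (averaging the two models' class probabilities) and macro-averaged one-vs-rest metrics, with balanced accuracy taken as the mean of recall and specificity. The function names, the three-class toy labels and the probability values are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def soft_vote(probs_a: Sequence[Sequence[float]],
              probs_b: Sequence[Sequence[float]],
              weight_a: float = 0.5) -> List[int]:
    """Blend two models' class probabilities row by row and return the argmax class."""
    preds = []
    for row_a, row_b in zip(probs_a, probs_b):
        blended = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(row_a, row_b)]
        preds.append(max(range(len(blended)), key=blended.__getitem__))
    return preds


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged one-vs-rest precision, recall and specificity, plus accuracy."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precision, recall, specificity = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true class otherwise
        tn = total - tp - fn - fp
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
        specificity.append(tn / (tn + fp) if tn + fp else 0.0)
    out = {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "precision": _mean(precision),
        "recall": _mean(recall),
        "specificity": _mean(specificity),
    }
    # Assumed definition: balanced accuracy = (recall + specificity) / 2.
    out["balanced_accuracy"] = (out["recall"] + out["specificity"]) / 2
    return out


# Hypothetical toy data: 6 households, 3 wealth classes coded 0, 1, 2.
y_true = [0, 0, 1, 1, 2, 2]
rf_probs = [[0.6, 0.3, 0.1], [0.4, 0.5, 0.1], [0.2, 0.5, 0.3],
            [0.3, 0.6, 0.1], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
mlp_probs = [[0.5, 0.3, 0.2], [0.7, 0.2, 0.1], [0.1, 0.3, 0.6],
             [0.2, 0.5, 0.3], [0.2, 0.2, 0.6], [0.3, 0.3, 0.4]]

preds = soft_vote(rf_probs, mlp_probs)
cm = confusion_matrix(y_true, preds, n_classes=3)
metrics = macro_metrics(cm)
print(metrics)
```

Under this averaging, specificity is driven mostly by true negatives, which dominate every one-vs-rest split. That is one reason specificity can be high while overall accuracy stays low when there are several classes.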
Published in: International Journal of Data Science and Analysis (Volume 11, Issue 4)
DOI: 10.11648/j.ijdsa.20251104.12
Page(s): 114-124
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2025. Published by Science Publishing Group
Keywords: Wealth Index, Ensemble Model, Random Forest, Multilayer Perceptron, Machine Learning
APA Style
Odipo, P. A., Waititu, A., Imboga, H., & Mwelu, S. (2025). Predicting Wealth Index Using an Ensemble Model of Random Forest and Multilayer Perceptron. International Journal of Data Science and Analysis, 11(4), 114-124. https://doi.org/10.11648/j.ijdsa.20251104.12
ACS Style
Odipo, P. A.; Waititu, A.; Imboga, H.; Mwelu, S. Predicting Wealth Index Using an Ensemble Model of Random Forest and Multilayer Perceptron. Int. J. Data Sci. Anal. 2025, 11(4), 114-124. doi: 10.11648/j.ijdsa.20251104.12
@article{10.11648/j.ijdsa.20251104.12,
  author = {Pinkie Akinyi Odipo and Anthony Waititu and Herbert Imboga and Susan Mwelu},
  title = {Predicting Wealth Index Using an Ensemble Model of Random Forest and Multilayer Perceptron},
  journal = {International Journal of Data Science and Analysis},
  volume = {11},
  number = {4},
  pages = {114-124},
  doi = {10.11648/j.ijdsa.20251104.12},
  url = {https://doi.org/10.11648/j.ijdsa.20251104.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdsa.20251104.12},
  year = {2025}
}
TY - JOUR
T1 - Predicting Wealth Index Using an Ensemble Model of Random Forest and Multilayer Perceptron
AU - Pinkie Akinyi Odipo
AU - Anthony Waititu
AU - Herbert Imboga
AU - Susan Mwelu
Y1 - 2025/09/12
PY - 2025
N1 - https://doi.org/10.11648/j.ijdsa.20251104.12
DO - 10.11648/j.ijdsa.20251104.12
T2 - International Journal of Data Science and Analysis
JF - International Journal of Data Science and Analysis
JO - International Journal of Data Science and Analysis
SP - 114
EP - 124
PB - Science Publishing Group
SN - 2575-1891
UR - https://doi.org/10.11648/j.ijdsa.20251104.12
VL - 11
IS - 4
ER -