Abstract
This article presents an in-depth analysis of theoretical approaches to detecting and imputing missing (NaN) values in information systems.
The quality of a dataset directly affects the reliability and efficiency of intelligent data analysis. For this reason, algorithms for handling NaN values have become a pressing issue in modern information technology. The article examines seven principal methods from a theoretical standpoint: mean, median, and mode imputation, regression, K-nearest neighbors (KNN), interpolation, and imputation using artificial neural networks. Each method is evaluated by computational complexity, effect on variance, sensitivity to the model, predictive accuracy, and adaptability to different data types. The study analyzes the strengths and weaknesses of each method and argues that the optimal approach must be chosen for each context. It also outlines prospects for developing comprehensive approaches based on combined (hybrid) algorithms. The article offers a theoretical basis for improving the quality of intelligent analysis in information systems, ensuring the accuracy of artificial intelligence models, and organizing digital transformation processes effectively.
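To make the simplest of these methods and the variance criterion concrete, the following is a minimal sketch using only the Python standard library. The function names `impute_constant` and `interpolate_linear` and the sample series are hypothetical, chosen for illustration and not taken from the article. It fills NaN entries with the mean, median, or mode of the observed values, fills interior gaps by linear interpolation, and compares the variance before and after mean imputation.

```python
import math
import statistics


def impute_constant(values, fill):
    """Replace every NaN entry with a single constant (mean, median, or mode)."""
    return [fill if math.isnan(v) else v for v in values]


def interpolate_linear(values):
    """Fill interior NaN gaps on a straight line between the nearest observed
    neighbours. Leading and trailing NaNs are left unchanged."""
    out = list(values)
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = values[a] + t * (values[b] - values[a])
    return out


# Hypothetical sample series with two missing entries.
data = [2.0, math.nan, 4.0, 4.0, math.nan, 10.0]
observed = [v for v in data if not math.isnan(v)]

mean_filled = impute_constant(data, statistics.mean(observed))
median_filled = impute_constant(data, statistics.median(observed))
mode_filled = impute_constant(data, statistics.mode(observed))
interp_filled = interpolate_linear(data)

# Population variance of the observed values versus the mean-imputed series.
var_observed = statistics.pvariance(observed)
var_mean_filled = statistics.pvariance(mean_filled)

print(mean_filled)
print(interp_filled)
print(var_observed, var_mean_filled)
```

In this toy series, mean imputation leaves the mean unchanged but lowers the variance, because every filled value sits exactly at the centre of the distribution. This is one reason the effect on variance is a useful criterion when comparing imputation methods.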