Machine learning-based prediction and identification of determinants of teenage pregnancy in ten East African countries.
| Title: | Machine learning-based prediction and identification of determinants of teenage pregnancy in ten East African countries. |
|---|---|
| Authors: | Baykemagn ND; Department of Health Informatics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia. nebebe2@gmail.com.; Gebiru AM; Departments of Health informatics, Teda Health Science College, Gondar, Ethiopia.; Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Getnet M; Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Department of Human Physiology, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Mengistie BA; Department of General Midwifery, School of Midwifery, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Getahun AB; Department of Anesthesia, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Asmamaw DB; Department of Reproductive Health, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Tassew WC; Department of Medical Nursing, Teda Health Science College, Gondar, Ethiopia.; Tilahun MM; Department of Human Physiology, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Bizuneh YB; Department of Anesthesia, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Bitew DA; Department of Reproductive Health, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Negash HK; Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Department of Human Anatomy, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia.; Melese M; Department of Human Physiology, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia. |
| Source: | Scientific reports [Sci Rep] 2026 Mar 11; Vol. 16 (1). Date of Electronic Publication: 2026 Mar 11. |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Nature Publishing Group Country of Publication: England NLM ID: 101563288 Publication Model: Electronic Cited Medium: Internet ISSN: 2045-2322 (Electronic) Linking ISSN: 20452322 NLM ISO Abbreviation: Sci Rep Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: London : Nature Publishing Group, copyright 2011- |
| MeSH Terms: | Pregnancy in Adolescence*/statistics & numerical data ; Machine Learning*; Africa, Eastern/epidemiology ; Adolescent ; Pregnancy ; Female ; Humans ; Young Adult ; East African People |
| Abstract: | About 21 million teenagers become pregnant each year worldwide. Teenage pregnancy is a serious issue in Sub-Saharan Africa, with East Africa reporting the highest rates. In public health, machine learning has become an invaluable tool because it can process large, complex datasets and identify trends. This study uses machine learning to predict teenage pregnancy and identify its key determinants in East Africa, using Demographic and Health Survey (DHS) data. A supervised machine learning approach, specifically the Random Forest algorithm, was applied to analyze relationships between predictors and teenage pregnancy outcomes. Data preprocessing included handling missing values, feature scaling, and addressing class imbalance using Tomek Links and SMOTE. Model performance was evaluated using metrics such as accuracy, confusion matrix, and ROC AUC. The final model was validated on a separate test set to assess generalizability and predictive accuracy. Random Forest demonstrated superior performance, with an AUC of 94.6, an accuracy of 89.1%, an F1 score of 89%, a recall of 88%, and a precision of 90%. Kenya had the highest rate of teenage pregnancy at 19.1%, with a 95% confidence interval of [18.12%, 20.08%]. Key predictors of teenage pregnancy in East Africa include maternal education, marital status, age at first sexual intercourse, wealth status, place of residence, distance to health facilities, and social media usage. These findings suggest that expanding reproductive health services in rural areas, with strengthened youth-friendly services; promoting education about teenage pregnancy through social media; and integrating reproductive health education into school curricula may decrease teenage pregnancy in East Africa.; (© 2026. The Author(s).) |
| Competing Interests: | Declarations. Competing interests: The authors declare no competing interests. Ethical approval: This study was based on secondary data analysis using publicly available datasets from the Demographic and Health Surveys (DHS) Program. Access to the data was formally requested and granted through the DHS Program’s official data request platform ( https://dhsprogram.com/data/Access-Instructions.cfm ). All datasets used were fully anonymized and contained no personally identifiable information, in accordance with the DHS Program’s data protection protocols. Given the use of de-identified, publicly accessible data, Institutional Review Board (IRB) approval was not required, as per DHS guidelines and our institutional research ethics policies. Consent for publication: Not applicable. Patient and public involvement: Patients and the public were not involved in the design, conduct, reporting, or dissemination plans of this research. Patient consent for publication: Not required. Provenance and peer review: Not commissioned; externally peer reviewed. Clinical trial: Not applicable. |
| Contributed Indexing: | Keywords: Africa; Data science; Digital health; Machine learning; Teenage pregnancy |
| Entry Date(s): | Date Created: 20260312 Date Completed: 20260421 Latest Revision: 20260424 |
| Update Code: | 20260424 |
| PubMed Central ID: | PMC13100203 |
| DOI: | 10.1038/s41598-026-43004-x |
| PMID: | 41813796 |
| Database: | MEDLINE |
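
The abstract reports accuracy, precision, recall, and F1 derived from a confusion matrix. The sketch below shows how those quantities relate, using only the Python standard library. The confusion-matrix counts are hypothetical values chosen for illustration and are not taken from the study; the function names are likewise illustrative. The last lines check that the reported precision (90%) and recall (88%) are consistent with the reported F1 score of about 89%.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive standard metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the study), for illustration only.
metrics = classification_metrics(tp=88, fp=10, fn=12, tn=90)
print({name: round(value, 3) for name, value in metrics.items()})

# Consistency check on the values reported in the abstract.
reported_f1 = f1_from_precision_recall(precision=0.90, recall=0.88)
print(round(reported_f1, 3))
```
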
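
The abstract gives a 95% confidence interval of [18.12%, 20.08%] around Kenya's 19.1% prevalence but does not state the sample size or the interval method. As a rough sketch, a simple Wald (normal-approximation) interval for a proportion can be computed as below. The sample size `HYPOTHETICAL_N` is an assumption chosen only because it yields an interval of similar width; it is not a figure from the study, and DHS analyses normally use survey weights and design-based standard errors that this sketch ignores.

```python
import math


def wald_ci(p, n, z=1.96):
    """Wald confidence interval for a proportion p from n observations."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error


# Hypothetical sample size (NOT reported in the abstract).
HYPOTHETICAL_N = 6181

low, high = wald_ci(p=0.191, n=HYPOTHETICAL_N)
print(f"{low:.2%} to {high:.2%}")
```
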