Abstract: | HIV continues to be a global health concern that necessitates cutting-edge methods of diagnosis
and treatment. Owing to the intricate nature of the HIV pandemic, specific strategies are needed
to pinpoint vulnerable people. This study tackles the challenge of precise identification within
specific HIV target groups, namely Adolescent Girls and Young Women (AGYW), High-Risk
Men (HRM), and Female Sex Workers (FSW). Leveraging four machine learning algorithms (Support
Vector Machine, XGBoost, Random Forest, and Linear Regression), the research
integrates locally sourced datasets from hospital records, aiming to elevate intervention precision.
The study seeks to transform public health by introducing a data-driven approach to unravel
intricate relationships and variables influencing HIV prevalence among distinct target groups.
Despite progress in global health efforts, traditional methods grapple with precision and efficiency
limitations. The adoption of machine learning offers a promising solution, contributing to a
nuanced understanding of dynamics within key populations. The study also addresses gaps in the
existing literature, particularly the scarcity of studies at the intersection of machine learning and
the identification of specific HIV target groups using locally collected datasets.
The study rigorously evaluates the performance of four algorithms on an HIV service delivery
dataset. Results indicate consistently high accuracy across all models, with the ensemble approaches
(XGBoost and Random Forest) slightly outperforming the others. XGBoost reached 96.51% accuracy
and Random Forest 96.49%, while Support Vector Machine achieved 96.33% and
Linear Regression 96.28%. This research significantly
contributes to advancing machine learning applications in healthcare and addresses a crucial gap
in the current body of knowledge. |