dc.contributor.author: KOLLONGEI, Naomi
dc.date.accessioned: 2025-02-11T17:57:09Z
dc.date.available: 2025-02-11T17:57:09Z
dc.date.issued: 2024
dc.identifier.uri: https://repository.maseno.ac.ke/handle/123456789/6305
dc.description: Master's Thesis (en_US)
dc.description.abstract: The emergence of big data has revolutionized the way insurance companies handle the data they receive in the course of their business. Big data involves huge volumes of data of many varieties, so the methods currently used for analysis in the insurance sector, such as statistical methods and actuarial formulas, are becoming inadequate for the problems and opportunities that arise from advances in technology. Moreover, the data may be prone to missing values. The Extreme Gradient Boosting algorithm (XGBoost) is an ensemble learning method with the capacity to address these two characteristics of the data effectively. This research used XGBoost to process insurance claim data in order to model claim frequency and claim severity for claim prediction. XGBoost creates tree-based models by iteratively fitting decision trees to the residuals of the previous predictions, reducing the error in each iteration. With this algorithm we aim to improve prediction accuracy and so obtain better estimates for risk assessment and pricing of insurance products. The XGBoost models were evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and R-squared (RSQ). The claim frequency model had an RMSE of 0.949, an MAE of 0.7741 and an RSQ of 0.781; the claim severity model had an RMSE of 899.12, an MAE of 736.77 and an RSQ of 0.9625. We also compared the XGBoost models with a zero-inflated Poisson model, multiple linear regression and a generalized Pareto model. The XGBoost models had the best metrics (RMSE, MAE and RSQ), and we therefore concluded that Extreme Gradient Boosting was the best of the models compared. Key words: Big data, Frequency, Severity, machine learning, gradient boost, XGBoost (en_US)
dc.publisher: Maseno University (en_US)
dc.title: Insurance Claim Analysis Using Extreme Gradient Boosting Trees-A Machine Learning Approach (en_US)
dc.type: Thesis (en_US)
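
The abstract describes boosting as iteratively fitting decision trees to the residuals of the previous predictions, and it evaluates the models with RMSE, MAE and R-squared. The following is a minimal sketch of that idea only. It is not the thesis's code and it does not use the XGBoost library, which adds regularization, second-order gradient information and native missing-value handling. The one-split trees ("stumps"), the synthetic data, and the names `fit_stump`, `boost`, `n_rounds` and `learning_rate` are all assumptions made for illustration.

```python
import math

def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

def rmse(ys, preds):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

def mae(ys, preds):
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)

def r_squared(ys, preds):
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic toy data (not insurance data): a linear trend with a jump.
xs = [float(i) for i in range(20)]
ys = [0.5 * x + (3.0 if x > 10 else 0.0) for x in xs]

model = boost(xs, ys)
preds = [model(x) for x in xs]
baseline = [sum(ys) / len(ys)] * len(ys)  # predict the mean everywhere

print("RMSE:", rmse(ys, preds), "MAE:", mae(ys, preds), "RSQ:", r_squared(ys, preds))
```

The metrics here are computed on the training data for brevity. An evaluation like the one the abstract reports would normally use held-out data, and the abstract does not say how the data were split.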

