Paul Benie
+44-755******* ********@*****.***
Personal Statement
I hold an MSc in Data Science from the University of Leicester, with a strong command of Python, R, and SQL for data analysis, statistical modeling, and dashboard development. My experience covers healthcare, environmental, and socioeconomic datasets, where I have used inferential statistics, machine learning, and data visualization to produce actionable insights. I excel at curating and cleaning intricate datasets, automating processes, and communicating findings to both technical and non-technical audiences. I am driven by the application of AI and data science to real-world challenges across sectors, bringing analytical rigor, meticulous attention to detail, and a collaborative approach. My aim is to use data-driven solutions to support evidence-based decision-making and generate a positive impact in healthcare, business, and education.
Key Skills
• Languages: Python, R, SQL
• Libraries: pandas, NumPy, Matplotlib, scikit-learn, dplyr, tidyr, ggplot2
• Databases: MySQL, PostgreSQL
• Visualisation: Tableau, Power BI, Quarto, QuickSight (basic)
• Tools: Git/GitHub, Jupyter, RStudio
• Statistics: Hypothesis testing, regression, AUC/ROC, sensitivity/specificity
Professional Background
Graduate Data Scientist, University of Leicester, UK (Sep 2024 – Sep 2025)
• Credit Scoring & Loan Default Prediction (Kaggle Lending Club Dataset)
Built machine learning models (Logistic Regression, Random Forest, XGBoost) to forecast loan defaults and used SHAP (explainable AI) to pinpoint major risk factors.
• Functional Data Analysis of Canadian Weather Stations
Utilized FDA techniques (B-splines, FPCA) to analyze temperature trends and seasonal fluctuations across Canadian weather stations.
• Functional Data Analysis of Fertility & Cancer Rates (Australia)
Applied B-splines and Functional Principal Component Analysis (FPCA) to analyze and illustrate fertility and cancer rate trends in Australia. Source: Australian Institute of Health and Welfare (AIHW) & Australian Bureau of Statistics (ABS) datasets.
• Time-Series Forecasting of River Soar Water Levels (UK)
Developed Box-Jenkins (ARIMA) and fractal-based models to predict water levels of the River Soar for flood-risk management. Source: UK Environment Agency River Gauge data.
• Predictive Modelling of Diabetes (USA)
Developed supervised ML models (Logistic Regression, XGBoost) to classify diabetes outcomes, attaining high predictive accuracy and interpretability using SHAP. Source: Pima Indians Diabetes Dataset (UCI Machine Learning Repository).
• Labour Economics Analysis (USA, 1994 Census Data)
Analyzed how working hours and demographic factors influence income levels through logistic regression. Source: UCI Adult (Census Income) Dataset, 1994 US Census Bureau.
Surveyor, Asante Goldmines & COCOBOD, Ghana (Jan 2021 – Aug 2024)
• Collected, validated, and managed extensive mining and agricultural datasets, ensuring the accuracy and integrity of data for both operational and strategic reporting.
• Implemented quality control processes and created GIS dashboards to visualize production data and assist management in decision-making.
• Calculated volumes of water and mined materials, and land areas, to support compensation claims, production planning, and environmental oversight.
• Developed detailed layouts using ArcGIS, integrating spatial data with analytical models to facilitate planning, compliance, and resource allocation.
Undergraduate Research Assistant, UMaT, Ghana (2019 – 2020)
• Assisted academic research projects by cleaning, analyzing, and visualizing datasets, ensuring accuracy and consistency of results.
• Contributed to the preparation of technical reports and research outputs and presented key findings to academic supervisors.
• Designed and implemented a project to predict Lake Volta water levels using Box-Jenkins (ARIMA) and fractal techniques for time-series forecasting.
• Applied time-series analysis and advanced statistical modelling to environmental datasets, demonstrating the ability to generate insights for real-world applications.
Education
• University of Leicester, UK, MSc in Data Science (2024 – 2025)
Modules: Functional Data Analysis, Machine Learning, Database Systems, Statistical Inference
• University of Mines and Technology (UMaT), Ghana, BSc in Geomatic Engineering (2015 – 2019)
Certifications, Training and Conferences
• Tableau Data Visualization Workshop – University of Leicester (2025)
• Machine Learning with Python – Coursera (2024)
• SQL for Data Science – DataCamp (2024)
• Functional Data Analysis Research Workshop – University of Leicester (2024)
• Git & Version Control Fundamentals – GitHub Learning Lab (2023)
Languages
English – Excellent (Speaking, Reading, Writing)
Twi – Good (Speaking), Excellent (Reading, Writing)
References
Dr. Ateeq Muhammad – Project Supervisor, University of Leicester
Email: ******@*********.**.**
Mr. Prince Amponsah – Senior Surveyor, Kinross Goldmines, Ghana
Email: *********@******.***
Dr. Yao Yenvenyo Ziggah – Project Supervisor, University of Mines and Technology (UMaT)
Email: ********@****.***.**