Sara Pournourbakhsh
Data Scientist and Machine Learning Engineer
Boyds, MD 20841 443-***-**** *********@*****.*** LinkedIn https://www.kaggle.com/sarapournourbakhsh
https://github.com/sarapnour
PROFESSIONAL SUMMARY
I am a passionate and results-driven Data Scientist and Machine Learning Engineer with over 3 years of experience in developing and deploying machine learning models to tackle complex real-world problems. My expertise lies in transforming raw data into actionable insights and building predictive models that drive strategic decision-making. With a strong foundation in data cleaning, feature engineering, model development, and data visualization, I thrive in collaborative and dynamic environments.
Currently pursuing a Master's degree in Data Science and Machine Learning Engineering at the University of Maryland, Baltimore County, I am committed to continuous learning and staying at the forefront of technological advancements. I am a strong communicator and team player, dedicated to applying my skills to innovative projects and making a meaningful impact.
EDUCATION & CERTIFICATIONS
Master of Data Science and Machine Learning Engineering Current GPA: 3.85
University of Maryland, Baltimore County (UMBC), MD, US January 2024 – Present
Professional Certificate in Data Science and Business Analytics 2023
University of Maryland, College Park, MD, US
Master of Science in Agricultural Engineering 2016
Azad University, Science and Research Branch, Tehran, Iran
Bachelor of Science in Agricultural Engineering 2014
Azad University, Science and Research Branch, Tehran, Iran
SKILLS
Technical Skills: Python (various Python libraries), ML, AI, Visualization, SQL, NoSQL, AWS, MS SQL Server, MySQL, MongoDB, Hive
Frameworks: Apache Spark, Hadoop (HDFS, MapReduce), Tez, PyTorch
Tools: Anaconda, Jupyter, MongoDB Compass, MySQL Workbench, Azure Data Studio, Impala, Tableau, Git, Docker, VMware, VirtualBox
Microsoft Tools: Word, Excel, PowerPoint, Teams, Outlook
RELEVANT PROJECTS
Heart Disease Prediction Model
https://www.kaggle.com/code/sarapournourbakhsh/heart-disease-prediction-model
Developed a predictive model for assessing the risk of heart attacks. Worked with real healthcare data and applied machine learning techniques to design and implement a robust solution that identifies individuals at high risk of cardiovascular events.
• Used 253,680 survey responses from the cleaned BRFSS 2015 dataset, which contains demographic information, medical history, lifestyle factors, and clinical biomarkers related to heart health. Conducted thorough data cleaning and preprocessing to address missing values, outliers, and inconsistencies.
• Engineered meaningful features from raw data sources, including age, gender, blood pressure, cholesterol levels, and family history of heart disease. Employed feature selection techniques such as correlation analysis and recursive feature elimination to identify the most predictive variables.
• Trained machine learning models including logistic regression, random forest, and gradient boosting. Tuned hyperparameters using grid search and Bayesian optimization to improve model performance.
• Evaluated model performance using accuracy, precision, recall, and area under the ROC curve (AUC), through both cross-validation and holdout validation.
• Employed model interpretation techniques such as SHAP (SHapley Additive exPlanations) values and feature importance analysis to understand the factors contributing to predictions.
Reducing Customer Churn in Telecom Industry
https://www.kaggle.com/code/sarapournourbakhsh/customer-churn?scriptVersionId=188889732
Predicted customer churn for telecommunications companies so that they can retain customers effectively. Large telecommunications corporations want models that predict which customers are most likely to leave, so that they can act accordingly. The project involved the following steps:
• Preprocessed the data (converted columns into appropriate formats, handled missing values, etc.).
• Conducted exploratory analysis to extract useful insights, both for the business directly and for later modeling and feature engineering.
• Derived new features.
• Visualized the data and checked for outliers.
• Handled class imbalance using appropriate techniques.
• Trained a variety of machine learning and deep learning models and tuned their hyperparameters.
• Evaluated the models with metrics chosen to reflect the business goal: correctly identifying churners matters more than correctly identifying non-churners.
• Selected Logistic Regression as the best model based on its performance and interpretability, supported by the exploratory data analysis.
• Summarized the findings and drew conclusions.
RELEVANT EXPERIENCE
Data Analyst - Pasargad Electronic
February 2014 - February 2018
Responsible for analyzing patients' EMR and EHR data to help healthcare providers make clinical decisions.
• Compiled and analyzed business-related data to identify insights and areas for improvement in decision-making. Used Tableau to create interactive visualizations.
• Worked with primary care doctors and the development team to implement supervised machine learning models.
• Used SQL for data extraction and analysis, and imported data into Excel and Python notebooks for further analysis and insight generation.