SAMIA CHAUDHRY
Email: ***********@*****.*** Phone: 603-***-****
8S360 Whittington Ct, Naperville, IL 60540
PROFESSIONAL PROFILE
Data Scientist/Data Analyst with 8+ years of experience analyzing large data sets and delivering data-driven insights for a range of companies, with work spanning Statistical Modeling, Predictive Analytics, Data Science, Machine Learning, and Time Series Analysis.
Established history of achieving goals through insightful listening, problem solving, active teamwork, and recognized team leadership. Executed complex SQL queries to extract, analyze, and manipulate data from large databases.
EDUCATION
Master of Science Grade: A 2019-2021
Elmhurst University
Majors: Data Science and Analytics
Master of Science GPA: 3.5/4.0 2002-2004
Kinnaird College University
Majors: Business Statistics
Thesis: Statistical analysis and study of drug addicts in public hospitals.
Course Work: Mathematics, Statistics, Economics and Business Management.
Bachelor of Science GPA: 3.6/4.0 2000-2002
Kinnaird College University
Course Work: Mathematics, Statistics and Economics
AREAS OF EXPERTISE
Rich experience across all steps of data analysis: the collection, transformation, and organization of data to draw conclusions, make predictions, and make informed, data-driven decisions
Used SQL queries to extract and analyze data in my projects
Used Tableau for visualization in my projects
Comprehensive knowledge of the Machine Learning process, with a thorough understanding of phases such as feature engineering, training data, ML algorithms, and model validation
Skilled at transforming data into insights that support better decisions.
Extensive experience working across multiple geographies and with vendors.
Ability to work in challenging environments and deliver high quality output to the stakeholders.
Comprehensive knowledge of Waterfall and Agile methodology.
Proficient in Python (Pandas, NumPy, Matplotlib, Scikit-Learn), SQL, Microsoft Office, Tableau, RapidMiner, Salford Predictive Modeler (SPM), SAS
Worked on multiple Python projects, using Pandas for data analysis, NumPy for scientific computation, Matplotlib for plotting and visualization, and Scikit-Learn for machine learning
Worked in the R programming language on a time series forecasting project
Used SAS for analysis in a project
An effective leader with capabilities in motivating teams and maintaining deliverables as per the defined guidelines along with elevation of service standards for operational excellence.
CORPORATE EXPERIENCE
Infosys, United Airlines – July 2022 to April 2023
Infosys, Hyundai Motors – Jan 2022 to Jun 2022
Blue Cross Blue Shield – Oct 2021 to Dec 2021
Caterpillar – Jun 2007 to Jun 2011
Skills
Management Tools: Word, Excel, PowerPoint, Draw.io
Technical Skills: Python, R, SQL, Julia, Power BI, Tableau, RapidMiner, Minitab, Salford Predictive Modeler (SPM), QDA Miner
Functional Expertise: Statistical analysis, Machine Learning, Predictive & Descriptive Analytics, Time Series modeling, NLP, Model development & Validation, Reporting & Dashboard creation, CRISP-DM methodology
PROJECT DETAILS:
Independent Consultant May 2023-July 2024
Role: Data Scientist
Analyzed budget and expenditure data to assess the efficiency of resource allocation. Identified opportunities to streamline expenses, reduce costs, and maximize the impact of financial resources.
Conducted needs assessments by analyzing demographic data and community surveys. Identified areas of greatest need and prioritized program interventions accordingly
Analyzed financial data to assess the organization's revenue and expenses
Produced reports presenting key insights derived from data analysis
Used Excel, Tableau, and SQL for data analysis and statistical analysis
United Airlines July 2022-April 2023
Role: Data Scientist
Developed an analytics application to visualize flight schedules.
Built decision models to make recommendations on flight scheduling issues.
Developed features to build models for flight schedule optimization and led a POC to operationalize the optimization
Integrated the metrics into the UI. Built an application that analyzes schedules, provides multi-dimensional metrics, predicts schedule performance based on a schedule's characteristics, and supports collaboration across resources (crew, fleets, aircraft, gates)
Worked with the client to understand the requirements
Coordinated with the offshore team
Ran SQL queries to retrieve data from a Redshift database
Hyundai Motors Jan 2022-Jun 2022
Role: Data Scientist/Analyst
Safety Office Data Analytics System
Designed a system using machine learning and analytical tools that sends alerts for vehicle issues that could lead to safety concerns
Worked with multiple data sources, preprocessing the data in Python
Used the Pandas library for data analysis, NumPy for scientific computation, and Matplotlib for plotting
Used data from different data sources for analysis and model development
Implemented statistical tools for anomaly detection and for investigating potential defects
Worked remotely using agile methodology
Ran Hive queries to retrieve the data
Generated statistical alerts and severity alerts; a statistical alert was raised when the modified z-score, based on median and IQR values, crossed the threshold
Used Tableau for the visualizations.
Defined and documented clear, complete, and detailed business requirements and functional specifications from the Business team using the agile methodology
Blue Cross Blue Shield Sep 2021-Dec 2021
Role: Data Analyst
Benefits Realization
Partnered with EPMO portfolio to evaluate inputs, outputs, and the overall health of the processes and data, with a focus on telling the story
Provided recommendations to improve the process, data, and reporting, and identified the need for a single tool for storing data for analysis
Data Analytics Consultant 2014-2019
Worked as a Data Analytics consultant on operational efficiencies and community outreach. Developed models to predict the number of people attending various large events, which helped reduce costs and improve efficiency
Created a model to predict when to open the center for the community after the COVID lockdown
Generated reports and data sheets for various small business clients
Used the Python programming language to analyze and visualize data
Wrote and ran complex SQL queries
Caterpillar Inc. (Caterpillar Logistics) 2007-2011
Role: Inventory Data Analyst
Worked in inbound support for the Caterpillar Distribution Center
Performed statistical and exploratory data analysis
Interacted with various business stakeholders for requirements gathering
Worked with contract processors to reconcile part inventories in the systems; responsible for back-offs when required
Performed inventory reconciliation for Cat external client Land Rover North America, including adjustments, counts, and investigation
Managed processes to ensure that inventory accuracy levels for Cat Logistics client “Land Rover” were maintained and improved.
Resolved stock record discrepancies by analyzing and interpreting various types of transactions to ensure correct record adjustments.
Processed and managed dealer locator claims using SAP.
Ran reports for faulty material.
Created daily count sheets from the pending count list and sent the count sheets to the facilities.
Handled cycle count adjustments and worked with the India team.
Handled bin denial adjustments.
Produced inventory adjustment reports
Served as subject matter expert in Six Sigma projects. Played a key role in the “LRNA Transition of RAM Reconciliation” project, which involved replicating existing processes and transitioning them from Morton Record Accuracy Management (RAM) to Bangalore. Specifically, the daily processing of low-dollar reconciliations for four LRNA warehouses, and the corresponding inventory adjustment reporting, were transitioned to Bangalore staff.
RECOGNITION:
Received recognition for undertaking key responsibilities and effectively executing the transition of inventory adjustments to the India team, meeting cycle count deadlines, and following up to sustain process control, leading to phenomenal progress in regular inventory adjustments.
COURSE PROJECTS:
Credit Risk analysis:
Worked with a robust dataset of 74 attributes. After cleaning the dataset and performing feature engineering, created a model in Python to predict credit defaulters of a financial institution using the Pandas, NumPy, Matplotlib, and Scikit-Learn libraries.
Used Pandas for data analysis, NumPy for scientific computation, Matplotlib for plotting and visualization, and Scikit-Learn for machine learning.
The completed model provided notable insights into the features most helpful in predicting active loan default.
The model and findings pointed to recommendations on when it may be appropriate to write off a high-risk loan and how loans should be evaluated in the future.
Predicting patients with cardiovascular disease:
Developed a model that predicted if a patient has cardiovascular disease.
After exploratory analysis of the dataset in Python, performed feature engineering, created models, and finalized the model for deployment.
Used Pandas for data analysis, NumPy for scientific computation, Matplotlib for plotting and visualization, and Scikit-Learn for machine learning.
Used a logistic regression model for the prediction
Time Series forecasting:
Created a model in R to forecast IL air travel.
Studied air travel to and from IL; observed no trend but found seasonality.
Selected an ARIMA model for prediction because it had the smallest residual standard deviation, and used it to forecast air travel.