ANKITA YADAV
507-***-**** Chicago, IL
******.**.*****@*****.*** LinkedIn Tableau GitHub
• Scraped data from 2,000+ Airbnb listings, analyzing the factors influencing the rental pricing, employing machine learning models such as XGBoost and Decision Trees, achieving a predictive accuracy of 89%.
SUMMARY
Results-driven Data Analyst with strong expertise in SQL, Python, Machine Learning, and Data Visualization, and experience in R, Tableau, PowerBI, and Microsoft Azure, applied to drive strategic decision-making and optimize business processes.
EDUCATION
Master of Science in Business Analytics, Roosevelt University [GPA: 4.0] August 2023 – December 2024
Bachelor of Science in Data Analytics, Mumbai University [GPA: 3.8] August 2017 – July 2021
TECHNICAL SKILLS
Python, SQL, Tableau, PowerBI, R, Scikit-Learn, Pandas, NumPy, NLTK, Matplotlib, Seaborn, Git, GCP, AWS
Certifications: Tableau Desktop Certified Associate, Python Data Associate (Datacamp), GCP Big Data and ML
PROFESSIONAL EXPERIENCE
Data Science Intern, TAKEDA ONCOLOGY August 2024 – December 2024
• Extracted and analyzed 10,000+ tweets for adverse drug reactions (ADRs) related to antihistamines (Allegra, Zyrtec, Claritin) using the Tweepy API and Python, identifying sentiment trends and ADR mentions.
• Applied topic modelling to uncover over 7 previously unreported ADRs, visualizing results through word clouds to inform actionable drug safety insights.
Data Analyst Graduate Assistant, ROOSEVELT UNIVERSITY October 2023 – December 2024
• Analyzed visitor data from over 100 campus tours using Excel, Python, and SQL; identified critical trends that directly influenced recruitment strategies to achieve a 15% increase in overall attendance.
• Developed interactive Tableau dashboards to visualize trends from 1,000+ prospective student records to identify patterns in application demographics, leading to a 10% improvement in targeted student outreach efforts.
• Engineered a student tracking system to streamline the management of admissions data, reducing discrepancies and improving data accuracy by 20%.
• Created a predictive model in Python using survey data from 3,000+ prospective students, identifying factors that boosted application completion rates and increased the Net Promoter Score to 76.
Data Analyst, MAESTRO HEALTHCARE August 2021 – August 2023
• Collaborated with data engineers to design and deploy a scalable data warehouse solution, improving data accessibility and reducing query response time by 30%.
• Transformed 15M records from log files and EDW with Log Parser and SQL, streamlining data for the development strategy of a $22M ERP application.
• Built data pipelines in AWS using Python and Shell scripts to process 3+ data sources (SAP HANA, JDE, Trade-Edge), loading 10+ GB of data into EMR and HANA DBs for real-time PowerBI dashboards.
• Developed and optimized complex SQL queries for data extraction, transformation, and validation, ensuring accuracy in Oracle database testing environments.
• Automated data cleaning processes using Python scripts, decreasing data preparation time by 50% and enabling faster project turnaround.
• Implemented a real-time data visualization dashboard using PowerBI, reducing reporting time by 40% and enhancing decision-making for senior management.
ACADEMIC PROJECTS
Sales Optimization: Interactive Dashboards and Data-Driven Strategies Data Visualization [Tableau, MS SQL]
• Designed 5 interactive sales dashboards in Tableau, analyzing 10,000+ records across 4 regions, improving insights into KPIs like profit ratio and customer segmentation, and implementing recommendations that boosted quarterly sales by 5%.
Interactive ML-Powered Salary Prediction Tool Machine Learning, Interactive Analysis [Streamlit, Python]
• Deployed a Streamlit app predicting Data Professional salaries using a Random Forest Regressor trained on 10,000+ data points, achieving an R² score of 0.85, with interactive visualizations exploring salary trends across different features.
Rental Price Analysis Machine Learning, Exploratory Data Analysis, Web Scraping [R, Python, BeautifulSoup]