Geethika Veluri
Contact: ********.******@*****.***, 469-***-****, LinkedIn, GitHub
Experienced data scientist with a strong track record of building scalable analytical solutions for a Fortune 500 airline client through dashboard design, statistical analysis, data modeling, and data mining
EDUCATION
M.S., Business Analytics, The University of Texas at Dallas (Dean’s Excellence Scholarship), GPA: 3.73 Exp. May 2018
B.S., Electronics and Communication Engineering, Amrita University, India, GPA: 3.4 May 2014
TECHNICAL SKILLS
Data Visualization, Data Mining, Data Analysis, Predictive Modeling, Hypothesis Testing, ANOVA, Time Series Forecasting, Spreadsheet Modeling, Supervised Learning, Principal Component Analysis, Regression (Linear, Logistic, Lasso, Ridge, Splines), Decision Trees, CHAID, Random Forests, Boosting, Bagging, SVM, Neural Networks, Ensembling, Cluster Analysis
• Languages: R, Python (Jupyter Notebooks), SQL, SAS, Excel VBA, Scala, Hive, Stata
• Tools: SAS Enterprise Miner, RStudio, Microsoft Excel (Advanced), Tableau, Google Analytics, Stata, Microsoft Visio
• Databases: SQL, Teradata SQL, Microsoft Access
• Certifications: Google Analytics IQ, Google AdWords Fundamentals
BUSINESS EXPERIENCE
Data Analyst, Mu Sigma Inc., (R, SAS, SQL, Excel, Tableau) Jul 2014 – May 2016
• Built an estimation engine with 98% accuracy using Monte Carlo simulation and distribution fitting to predict on-time performance for future flight schedules; the predictions were used as the sole reference for business decisions
• Forecasted department-level headcount demand using time series techniques (ARIMA) in R to help the client anticipate demand and take steps to reduce employee overtime hours
• Automated an end-to-end prediction model in RStudio, covering Teradata data extraction, data cleaning, data processing, and model output prediction, which cut manual effort from 10 hours to 1 hour
• Identified a $1 million savings opportunity for the client by defining the key metrics of airport operational performance and applying PAM and hierarchical clustering to segment airports for targeted decisions
• Built an analytical model using spline regression to determine the optimal turn time for aircraft under different network scenarios and to estimate turn time savings
• Led a team of 10 to integrate and automate 25 statistical models in RStudio, which reduced man-hours by 1 week
Student Intern, Robert Bosch Engineering and Business Solutions (MATLAB) Feb 2014 – May 2014
• Reduced manual effort by 15 hours by developing a new pathfinding algorithm in MATLAB to connect the harness cable and Electronic Circuit Unit in radio frequency simulation software
ACADEMIC PROJECTS
Web Analytics and Digital Marketing (Google Analytics, Google AdWords) Aug 2017 – Dec 2017
• Analyzed web traffic for the Jindal School of Management using Google Analytics and presented recommendations to each department for improving the user experience on the website
• Ran a digital marketing campaign for a test prep institute in Mumbai (India) on the Google Search Network, the Display Network, and Facebook, which increased the client’s revenue by 20%
Data Science Programming (Python – pandas, matplotlib, seaborn, scikit-learn, SciPy, NumPy) Aug 2017 – Dec 2017
• Analyzed American congressional campaign data (Kaggle dataset, 2009–2016) to identify where campaign funds are being spent and presented visualizations
Advanced Business Analytics with R Jan 2017 – May 2017
• Predicted the churn rate for Telecom data using decision trees and performed break-even analysis for the churn
• Classified text data into distinct categories based on product descriptions using supervised machine learning algorithms such as Naïve Bayes and Support Vector Machines
Big Data Analytics (Scala, Hive) Jan 2017 – Dec 2017
• Analyzed Twitter JSON data in Hive and performed sentiment analysis on the tweets
• Performed data analysis and regression analysis on the brewery data using MLlib in Spark
Predictive Analytics using SAS (Base SAS, EG miner) Jan 2017 – Dec 2017
• Identified factors driving consumer brand choice preference by analyzing drugs and grocery panel data
• Predicted IMDb scores by applying gradient boosting to a Kaggle movie dataset and identified the factors that drive a movie’s score