Data Analyst Python

Location:
Hartford, CT
Posted:
March 27, 2020

Resume:

Vishnuventhira B R

** ******** ******, ********, ** 06103 adcgyc@r.postjobfree.com 860-***-**** LinkedIn GitHub

Proactive and passionate storyteller who turns statistical insights into business solutions and data-driven decisions.

EDUCATION

University of Connecticut School of Business, Hartford Jan 2019 - May 2020

Master of Science in Business Analytics GPA: 3.8/4.00

Anna University, Chennai, India Aug 2011 - Apr 2015

Bachelor of Engineering in Electrical and Electronics Engineering GPA: 7.9/10

TECHNICAL SKILLS

Languages and Tools: R, SQL, Python, Tableau, MS Excel, SAS JMP, AWS, Ansible, Docker, Linux, Scala, HDFS, Google Analytics, MapReduce, Hive, Spark, Google Cloud Platform, Search Engine Optimization

Statistical Techniques: Regression, Clustering, PCA, Naïve Bayes, Boosting, Bagging, KNN, Regularization, Time series, Deep Learning, A/B testing, Hypothesis testing, Customer Analytics, Segmentation

Certifications: AWS Certified Solutions Architect, Google Analytics, Red Hat Certified System Engineer Linux 7

WORK EXPERIENCE

Data Scientist Intern – ODN Feb 2020 – Present

• Statistical Modeling: Developed road-risk regression and classification models on geospatial data to predict traffic counts and crash probability for all roads in the state of California

• Gathered and processed road inventory, crash and weather data covering 170 variables for 2M roads in the state; the model results help insurance carriers and DDOT stay aware of potential road risks

• Collaborated with cross-functional teams and fine-tuned machine learning models by implementing LASSO-based feature selection

Graduate Research Assistant – University of Connecticut Nov 2019 - Jan 2020

• Forecasting: Predicted energy demand for the New England region by developing time series models on the total load using tsfresh; achieved a variance score of 0.6

Graduate Data Analytics Consultant – LIMRA Aug 2019 – Dec 2019

• Data Analysis, Data Visualization: Analyzed 35 features for 9M insurance policies and identified key factors behind policy lapse; created interactive Tableau dashboards to examine policy lapses in detail by gender, risk class and distribution channel

• Statistical Modeling: Predicted which customers were likely to terminate their policies using a Random Forest model with 70 percent accuracy; presented business insights and recommendations, backed by secondary research, to management; increased agent profitability by ~10 percent

Data Analyst - Tata Consultancy Services May 2015 – Nov 2018

• Statistical Modeling: Collected 20M customer records and built statistical models such as logistic regression and decision trees in Python to identify churn behavior and predict which customers were likely to leave the network

• Conducted behavioral segmentation through cluster analysis of customers' internet, call and text usage, thereby decreasing marketing cost by ~30 percent

• Monitored the transformation of real-time customer voice data by creating business rules with SQL queries and loading the data into Teradata via Informatica PowerCenter to generate billing reports

• Automated configuration tasks on network devices such as switches and routers by creating code libraries in Ansible, saving 8 man-hours per week

ACADEMIC PROJECTS

Connecticut Education Network (Graduate Student Assistant)

• Visualized the number of DDoS attacks per vertical, duration and volume by creating Tableau dashboards

Fraud Analytics Travelers Hackathon Data Analysis, Python

• Predicted the probability of fraud among 30,000 accident claims for Travelers insurance

• Mitigated class imbalance using SMOTE and achieved a recall of 0.65 using light gradient boosting

VIIRS Nighttime Light Population Prediction Python

• Utilized nighttime light brightness data from the VIIRS satellite to estimate county populations in the USA

• Implemented PCA, correlation and outlier analysis; improved model efficiency by 14% over the baseline model

US Airline Twitter Sentiment Analysis – NLP Python

• Built NLP models to classify Twitter text and assign a positive, negative or neutral label

• Studied the popularity of airlines and performed text preprocessing using NLTK modules in Python


