Data Scientist

Thousand Oaks, California, United States
March 31, 2019

Developed Business Intelligence Reports for Amgen using SAP Business Objects with Oracle and Teradata as the backend.

Transformed Excel workbooks and raw SQL data into Tableau reports for analytical purposes.

Developed a robust dashboard for project managers to track project progress against targets and issues pending with various teams, helping managers arrive at solutions that impacted the business financially.

Jan’18 - May’18 RESEARCH INTERN – ADVANCED ANALYTICS, IMPAQ HEALTH

Developed SAS macros for complaints tracking that import data from text files, aggregate it on a monthly and yearly basis, validate the complaints, and inform the payers and CMS.

Analyzed data from various reports and detected outliers and noise in the existing datasets.

Developed an Address Fuzzy Match Algorithm to match addresses from Salesforce and CMS in SAS.

Developed a Data Pre-Processing pipeline to convert pdf, pptx, doc, docx to txt files and handled basic text cleaning to prepare the data for Sentiment Analysis.

Aug’18 - Present DATA ENGINEER, AMGEN

Part of the team responsible for data ingestion, raw data processing, data transformation, data cleaning, and publishing datasets for 4 internal business units to support better data science work on those datasets.

Developed Spark SQL code in the existing infrastructure to meet new business requirements.

Handling anomaly detection in sales KPIs to flag existing anomalies and predict future ones; currently designed standard-deviation-based anomaly checks.

Created Python notebooks for data extraction, cleaning, and wrangling for the Onco Foresights project, whose goal is to predict the top physicians in an account who could potentially prescribe an Amgen drug for chemotherapy.

EDUCATION:

MS in Business Analytics – Data Science, The University of Texas at Dallas. CGPA 3.94/4 (May’18)

Courses: Advanced Business Analytics, Python, Big Data Analytics, Predictive Analytics, Statistics, Econometrics

TECHNICAL SKILLS:

Statistics Tools/Packages: SAS, RShiny, Python (NumPy, SciPy, Matplotlib, Pandas), Stata

Database/Data Warehouse: Oracle 10g/11g, MS SQL Server 2008 R2, Teradata, SAP BO, Hive, Redshift

ML Algorithms: Linear & Logistic Regression, K-means, SVM, Decision Trees, Random Forest, KNN, Naïve Bayes

Visualization Tools: Excel, Tableau, SAP Dashboard

ACADEMIC PROJECTS:

• Sports Preference Analysis among Dallas Residents: The aim of this research was to find Dallas residents’ preferences among the Dallas Cowboys, Dallas Mavericks, and Texas Rangers, and to analyze the factors that determine the fan base of each team. Methods: Hypothesis Testing, ANOVA, Linear Regression

• King County’s Housing Sales: Built machine learning models: Linear Regression to predict the selling price of houses in King County, and K-Means Clustering to recommend houses to customers based on their attributes. Handled a classification problem using Decision Trees, Logistic Regression, and Random Forests.

• Predictive Analytics: Developed statistical models using SAS for a non-profit organization to suggest how to use appeals effectively when targeting customers for donations. Used RFM (recency, frequency, monetary value) features of donors to predict future donation amounts via linear regression. Also performed demographic analysis of the dataset.

• R Sentiment Analysis of Amazon Reviews: Analyzed 250K reviews of various products and created unigram and bigram word clouds for normalized and non-normalized text. Calculated sentiment scores for each review and correlated them with ratings via a scatter plot.

• R Package: Developed a package with functions that predict whether a person is male or female from Age, Height, and Weight with an accuracy of 70%. Clustered the dataset using K-Means and GMM, then applied KNN based on the clusters and evaluated the model. Also visualized the 3D clustering using Plotly with RShiny.

• Graph Visualization: Analyzed a telephonic screening dataset and represented it visually using social network analysis.

DEEPAK SIVARAMAN

SUMMARY: Emerging Data Scientist with 3 years of professional experience in Data Warehousing, Business Intelligence, and Data Science, seeking full-time opportunities in analytics starting immediately.

• Requirement Analysis

• Business Intelligence

• Machine Learning

• Data Cleaning

• Data Analytics

• Data Visualization
