
Data Analyst Customer

Pleasanton, CA
October 15, 2019



Professional Summary and Skills

Over *+ years of experience as a Data Analyst in customer and market analytics, driving data-driven decisions

Experience building and automating dashboards to monitor and track KPIs using tools such as Tableau, Power BI, and Python

Skilled at interpreting and communicating large, complex datasets through exploratory data analysis, descriptive statistics, data modeling, visualization, and presentations on customer behavior, web analytics, and market analytics

Proficient in data analytics and modeling using Python packages: NumPy, pandas, statsmodels, SciPy, scikit-learn

Experience in statistical modeling methods: linear regression, discriminant analysis, feature selection, PCA

Experience building machine learning models: linear models, SVM, clustering, KNN, K-means, neural networks

Good knowledge of database design using PL/SQL (stored procedures, functions, triggers) with MySQL and PostgreSQL

Experience designing, developing, modeling, and managing relational databases using ERwin and Snowflake

Proficient in automating data collection, transformation, and storage using ETL pipeline tools such as Talend and Informatica

Experience in coding with Python, R, SPSS, SAS, VBA, PL/SQL, HTML


National Science Foundation, Fresno, CA

Data Analyst, 01/2018 – Present

Integrated data from 25 projects and established a MySQL data warehouse of over 1 million rows.

Built a logistic regression model to predict operational cost overrun, using t-tests, ANOVA, and decision-tree techniques for feature selection. Identified 6 of 50 variables impacting operational cost overrun.

Automated data collection and preprocessing with SQL, trained the logistic regression model to predict cost overrun, and built Tableau dashboards to track cost-overrun KPIs. Reduced operational cost overrun by 11%.

Toodledo Inc, San Ramon, CA

Data Analyst, 05/2017 – 02/2018

Responsible for data mining, analysis, and modeling using Python, SQL, Tableau, Talend, Excel, and Teradata

Cut processing time for data collection, manipulation, and reporting from hours to minutes using Python scripts, Talend for the ETL pipeline, and Tableau for visualization.

Combined social data and ad-platform data to draw insights on user groups, improving targeted marketing by 34%

Conducted A/B testing: defined the scope of tests, chose success metrics, ensured tracking, and analyzed results

Web Ninjaz Technology, Delhi, India

Data Analyst, 05/2015 – 08/2016

Responsible for data mining, analysis, and report generation using SQL, Talend, Tableau, Hadoop, and HBase

Automated data delivery to various marketing channels using Talend, SQL, and Python to aggregate data, saving 14 hours weekly.

Conducted hypothesis testing and A/B testing with web analytics tools such as Google Analytics and Facebook Analytics

Performed ad-hoc tasks to generate reports and forecasts on web traffic, revenue, and sales for the marketing and sales teams.

Collaborated with the data engineering team to import and export data between RDBMS and HDFS using Hadoop, HBase, and Hive


California State University-Fresno, Fresno, CA

Master of Science, Major: Computer Science, 08/2016 – 12/2018

Jawaharlal Nehru Technological University, Hyderabad, India

Bachelor of Technology, Major: Computer Science, 08/2012 – 05/2016

Personal Projects

Customer Segmentation

Segmented e-commerce customers using MySQL to extract and integrate the data and Python to preprocess, visualize, and train unsupervised models such as K-means and Gaussian mixture models.

The clusters simplified complex patterns in customer behavior, purchasing, and web activity, informing strategies for customer retention, acquisition, spend, and loyalty.

Customer Lifetime Value

Developed a predictive model using XGBoost to estimate the lifetime value of each customer of an e-commerce business.

Clustered customers by lifetime value to investigate the variables driving customer LTV, and used these insights to derive strategies to increase lifetime value and retention and to reduce churn.

Fraudulent Transaction Detection

Developed a logistic regression classifier on credit card transaction (time-series) data to detect fraudulent transactions

Applied oversampling, undersampling, and Gaussian noise injection to reduce class imbalance in the data. Tuned the model with regularization, reducing misclassifications on the test data from 4% to 0.44%

Customer Churn Analysis

Analyzed customer attributes such as login patterns, time spent, likes, dislikes, purchases, and complaints. Built an XGBoost classifier to predict customer churn

Improved the model's F-score from 0.62 to 0.82 by applying oversampling and undersampling to reduce class imbalance, and by using correlation analysis and t-tests for feature selection

Price Forecasting

Implemented an ARIMA model to forecast oil prices. Ran the Augmented Dickey-Fuller test to evaluate stationarity, reduced non-stationarity with data transformations, and used decomposition to analyze the trend and residual components of the time series.

Tuned model parameters (p, d, q for ARIMA) using walk-forward validation and PACF and ACF analysis

Marketing Campaign Analysis

Devised a predictive model to improve targeted marketing strategy by analyzing past campaign data for a Portuguese bank's customers.

Used correlation analysis and t-tests for feature selection. Trained an AdaBoost model and tuned its parameters, reaching a mean AUC score of 0.81. The model helped interpret customer attributes, shape the strategy, and direct the campaign to grow the customer base.
