
Data Scientist

Cary, North Carolina, United States
January 21, 2018



Mazhar Kodithodika

Raleigh, North Carolina • 27513 • mob: 860-***-****

PROFESSIONAL SUMMARY

Strong skills and 4 years of experience in machine learning, statistical modelling, forecast modelling and finding insights.

4 years of data wrangling experience using SQL, R and Python, and data visualization using Tableau.

Designed and built models: market segmentation using K-means clustering; sales propensity models using tree ensemble methods, Support Vector Machines and logistic regression; expected machine breakdown time prediction using survival regression; competitor price prediction using linear regression; supply chain and inventory optimization.

Strong knowledge of the Hadoop Big Data ecosystem, PySpark, Spark MLlib, Hive and Pig.

EDUCATION

University of Connecticut, Hartford, CT Aug 2017

Master of Science in Business Analytics and Project Management-STEM (3.964/4.0)

Big Data Analytics with Hadoop, Data Analytics with R, Predictive Modeling, Data Mining and Business Intelligence, Process Modelling and Data Management, Decision Modelling, Social Media Analytics, Project Cost and Risk Management.

National Institute of Technology Calicut, India

Bachelor of Technology in Electrical and Electronics Engineering (8.03/10) 2006-2010

Sriram’s IAS: Coaching for Civil Service Examination with focus in Mathematics 2013-2014

TECHNICAL SKILLS:

Data Science: Machine Learning, RNN, CNN, K-NN, Linear Regression, Ridge Regression, LASSO regression, Principal Component Analysis, Clustering, Bayesian Method, SVM, Random Forests, Ensemble methods, Network Analysis, Natural Language Processing, Time Series Forecasting, Anomaly detection, fraud detection, Graph Analytics, Linear Discriminant Analysis.

Tools & Computer Programming: R, Python, GIT, PySpark, Keras, TensorFlow, Pandas, Numpy, Scipy, Scikit, SQL, MATLAB, SAS, SPSS, Databricks, Linux, MapReduce, Gephi, Google Analytics, SAS enterprise guide, SAS JMP PRO, AWS Redshift, AWS S3.

Databases and Big Data: Hadoop, Spark, Kafka, Sqoop, Pig, Hive, Oracle SQL, PostgreSQL, MS SQL server, ETL, SSIS.

Visualization: Tableau, R shiny, R ggplot, R Plotly, MatPlotLib, Qlikview.

WORK EXPERIENCE: (4 YEARS)

Bait-Al-Aseel GTC Group, Kuwait • Analytics Engineer/Data Scientist Jan 2015-Aug 2016

Provided business recommendations and analytic insights by data exploration, statistical modelling and visualization.

Built Sales forecasting models and estimated potential market size for planning and strategic business development.

Developed Dashboard in Tableau to track sales, marketing and profitability KPIs and reported to senior management.

Increased sales conversion by developing sales leads classification model to identify favorable sales opportunities using Random Forest, Gradient Boosting, Support Vector Machines and Logistic Regression Models in Python.

Increased sales profit margin through price optimization, estimating competitor prices using Linear Regression in Python.

Identified required secondary data and mined data from disparate secondary data sources using web scraping.

Built a customer segmentation model using K-Means Clustering in Python that increased ROI on the marketing budget by 18 percent.

Found relevant records in massive datasets, queried and joined multiple datasets, and performed advanced data manipulation using advanced SQL.

Increased conversion rate by running A/B tests and optimizing an ecommerce website and its user experience.

Prepared raw data sets by applying a series of data preprocessing techniques such as automated feature engineering using random forest, KNN imputation of missing values, outlier detection and data transformation.

Larsen and Toubro Ltd, India • Senior Engineer Analytics Jul 2010-Sept 2012

Predicted expected breakdown time of valves for inventory management using Survival Analysis, which reduced cost by 16%.

Formulated unsupervised learning model to identify clusters and analyze the geographic spread of trustworthy vendors.

Developed a time series forecasting model in R to predict cable prices, which reduced inventory cost by up to 200,000 USD.

Extracted and manipulated past vendor data using complex SQL queries for reporting and visualization.

Performed cost regression analysis of product price in R, which resulted in a 7% reduction in procurement cost.

Collected raw data sets and modified raw data in Excel using features like pivot tables, VLOOKUP, VBA, macros, etc.

RESEARCH PAPER: Predictive model to diagnose Depression patients by analysis of EEG Signals using MATLAB

Using Surrogate Data Analysis, KS entropy, Lyapunov and Hurst Exponent, Fractal and Correlation Dimension and SVM.

CAPSTONE PROJECT: Marketing Analytics for Basement Systems Inc and Treehouse Internet Group (Client)

Developed Geospatial model for market segment clusters using Neural Network, KNN, SVM, NLM and boosted model in R.

Performed aspect-based sentiment analysis of reviews, found cross-selling opportunities and visualized the clusters in a Tableau dashboard.

ACADEMIC PROJECTS: Aug 2016-Dec 2017

Credit Risk prediction of loan applicant using Decision Tree Ensemble methods and Logistic Regression in SAS.

Risk Financial modelling and parameter optimization for Solar power plants in USA using Monte Carlo Simulation.

Crude Oil Price Forecasting and Reaction to daily news sentiment analysis using ARIMA model and NLP packages in R.

Business Process Re-engineering and Data Management Architecture design for a trading company using Oracle SQL.
