KARTIK BAPNA
OPEN TO RELOCATE
***************@*****.*** LinkedIn: https://www.linkedin.com/in/kartikbapna/ GitHub: https://github.com/kartik1611 +1-716-***-****
Software Skills: Python, R, C, SQL, Spark, Apache Hive, Git
ML Libraries: NumPy, Pandas, Scikit-learn, Keras, TensorFlow
Machine Learning: Generalized Linear Models (Linear, Lasso, Ridge, Logistic), Classification Algorithms (Support Vector Machines, Decision Trees, etc.), K-Means Clustering, Time Series, Ensemble Methods (Random Forests, XGBoost, etc.)
Deep Learning: Feed Forward Networks, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), LSTMs
Statistical Methods: Maximum-likelihood estimation, Maximum-a-posteriori estimation, Hypothesis Testing, Probability Distributions, Inferential Statistics
Data Visualization: Tableau, Matplotlib, Seaborn, ggplot2
EDUCATION:
Master's in Data Science, University at Buffalo (SUNY) - Buffalo, New York, USA Graduation: 09/01/2019
Bachelor of Engineering in Computer Science, R.G.P.V University - Bhopal, India Graduation: 11/07/2011
EXPERIENCE:
Aleron Data Science Intern 06/03/2019–08/14/2019 USA
Developed a systematic, data-driven approach to predict which high-value savings account customers were likely to move their money out of their bank accounts, and provided insights into that behavior using Logistic Regression and Decision Trees.
Net Link Software Group Associate Data Scientist 11/27/2017–05/23/2018 India
As a core member of the data science team, helped a telecom client identify key focus areas and product preference trends based on existing customer retention markets, using ML models such as Logistic Regression, Support Vector Machines, and Ensemble Methods.
Performed extensive EDA of attributes with respect to the target, and applied the Cox hazard method with respect to account length (time).
Used filter and wrapper methods for feature selection on continuous and categorical variables.
Built models and analyzed user churn migration using attributes such as ARPU, Minutes of usage, Dropvce, Custcare, and product purchase history, which helped improve customer retention by 5%.
Conducted business analysis of Average Revenue Per User (ARPU) buckets and traffic trends to gather critical inputs on consumer behavior, customer revenue, and usage for effective strategy formulation.
Developed a query-driven model based on customer lifetime value, as required for credit scoring, to identify defaulters.
Technologies: Python, MySQL, MS Excel, NumPy, Pandas, Scikit-learn (libraries)
Sun Umbrella Data Analyst 06/09/2015–10/27/2017 India
Implemented a statistical model using linear regression to predict revenue by demographic location.
Performed A/B testing on user migration behavior to test the beta version of the website.
Generated reports in Tableau by building complex stored procedures in MySQL.
Extracted data extensively using SQL queries and used it for data mining tasks with R packages.
Performed EDA and created visualization dashboards using Tableau.
Migrated data from the old database to the new database and pipelined data using Visual Studio.
Technologies: MySQL, R Programming, SQL Workbench, Tableau
Cognizant Technology Solutions Programmer Analyst 12/22/2011–07/01/2015 India
Worked for a US-based banking client on a niche wealth management product (Eagle Investments) built on relational databases.
Created Uploaders for uploading data and Exporters for exporting data, and generated reports.
Technologies: MySQL, Eagle Investments (ETL Tool), SQL Workbench, PuTTY
Projects:
Drug Prior Authorization:
A prior authorization is an additional requirement that some insurance companies impose before deciding whether to pay for a medicine. The goal was to tell the doctor whether a prescribed drug requires prior authorization, so the doctor can prescribe medicine that does not need it and save time for the pharmacy. Used K-means clustering for feature engineering and applied advanced machine learning techniques and ensemble methods to predict whether a particular drug will require prior authorization.
Cervical Cancer Risk Analysis:
Worked on identifying the risk category of an individual based on multiple binary response columns of medical tests. Handled imbalanced data and performed feature scaling as part of EDA. Implemented Random Forest and XGBoost and carried out model selection and optimization. Performed Chi-Square tests for feature engineering.
Anime Recommendation Engine:
Built an anime recommendation system using collaborative filtering and content-based techniques. The data was gathered from myanimelist.net, and the end goal of the project was to suggest similar anime to users.
Chicago Food Inspection:
Analyzed Chicago Food Inspection data from the Chicago Data Portal. The inspections promote public health in the areas of food safety and sanitation and prevent the occurrence of food-borne illness. Performed EDA using the Seaborn and Matplotlib libraries.
Certification:
Carnegie Mellon University LTI - Big Data Optimization Certification - INSOFE Hyderabad 06/09/2016–12/16/2016 India