KETAKI MULAJKAR
Culver City, California ***** ● ****************@*****.*** ● +1-201-***-****
linkedin.com/in/ketakimulajkar ● github.com/ketakimulajkar
PROFILE
A computer scientist with a strong math background, passionate about Data Science, Big Data, Machine Learning, and Statistics.
TECHNICAL SKILLS
Software and Programming Languages: SQL, R, BASE SAS, Python, Hadoop, Spark, MATLAB, Linux, Oracle, Microsoft Excel, Tableau, C++, Java
Machine Learning: Classification, Regression, Clustering, Natural Language Processing, Dimensionality Reduction, Data Cleaning, Exploratory Analysis, Statistical Inference
Statistical Methods: Time Series, Regression Models, Bayesian Methods
Math: Linear Algebra, Probability and Statistics
Web Development: HTML5, CSS, JavaScript
EDUCATION
Pace University- New York Jan 2015- Dec 2016
Master of Science in Computer Science GPA: 3.8
University of Mumbai- India Jun 2009- Jun 2013
Bachelor of Engineering in Electronics GPA: 3.5
CERTIFICATION
SAS Certified Base Programmer for SAS 9 (Serial Number: BP069195v9) March 2017
Johns Hopkins University - Coursera Data Science Specialization June 2016- Present
EXPERIENCE
Data Science/Software Engineer Intern: Heal Inc, Los Angeles Aug 2017- Present
Build complex SQL and PostgreSQL queries to create reports in Periscope Data and present them to the team to improve campaign strategies and operations
Analyze campaign data on Google Analytics, Google AdWords and Segment to ensure the campaign programming was correctly implemented
Produce databases, tools, queries, and reports for analyzing, summarizing, and root-causing board failure data. Versed in finding patterns and trends in complex, multivariable data sets
Interpret production and development databases to inform decisions for managerial action and planning
Use statistical techniques for hypothesis testing to validate data and interpretations
Propose solutions to improve system efficiencies and reduce total expenses
Database Programmer: Pace University, New York May 2017-Aug 2017
Pulled the "MovieLens" dataset into the programming environment and divided it into three sub-datasets: Ratings, User, and Movie
Extracted user features and class labels to build a binary classification model that achieved 60 percent precision and 75 percent recall
Used the Random Forest algorithm to build the classifier and generate top-N recommendations for users (based on a content-based recommendation system)
Research Assistant: Swami Ramanand Teerth Marathwada University, India Jun 2013-Jun 2014
Facilitated gathering of materials and data for a project on “Web Data Mining using an Intelligent Information System Design”; wrote summaries and prepared tables, graphs, and reports of project findings as requested
Performed qualitative and quantitative analyses of data using software such as SAS and Excel
PROJECTS
Breast Cancer Detection Feb 2017-Mar 2017
Built and analyzed different models such as K-Nearest Neighbors, Support Vector Classifier, and Logistic Regression using Python (scikit-learn, pandas, numpy) to classify the cancer as benign or malignant
Employed PCA for dimensionality reduction, compared the sensitivity for different thresholds (<0.25, 0.25 and 0.5), and concluded that SVC performed better than the other two
Music Plagiarism Detection Aug 2016-Dec 2016
Analyzed different kinds of techniques used for audio processing and audio fingerprinting using MATLAB and WEKA
Proposed a similarity measurement method using a Gaussian Mixture Model that measures overall similarity between two different songs to detect cases of music plagiarism
Superhero Social Network Jun 2016- Jul 2016
Analyzed the social network graph of superheroes (Marvel Universe) using Apache Spark and PySpark
Sorted the results based on the number of co-occurrences and found the degree of separation between superheroes using the BFS algorithm