Resume of Piyush Supe
Piyush Supe
Dallas, TX, 75252 469-***-**** ******.****@********.*** www.kaggle.com/pgsupe https://github.com/impiyushs LinkedIn
EDUCATION
Master of Science in Computer Science, CGPA: 3.61/4.0 (Expected May 2018)
The University of Texas at Dallas, Richardson, Texas.
Coursework: Big Data, Machine Learning, Statistical Methods for Data Science, Computer Vision, Design and Analysis of Algorithms, Database Design, Web Programming Languages.
Bachelor of Technology in Computer Science and Engineering, CGPA: 7.98/10.0 (2012 – 2016)
Dr. Babasaheb Ambedkar Technological University, India.
SKILLS
Languages: R, Python, Java, Scala, C++
Databases: SQL, NoSQL (MongoDB, Cassandra, HBase)
ML Packages: R libraries, scikit-learn, Spark MLlib, TensorFlow
Big Data: HDFS, MapReduce, Spark, Scala, Pig, Hive
Data Visualization: ggplot2 (R), Matplotlib, MS Excel
Applications: RStudio, Orange, IDLE, PyCharm, Eclipse, Oracle SQL Developer
NON-ACADEMIC PROJECTS
Digit Recognizer (Python, TensorFlow) (April 2017)
Implemented a softmax regression model in Python to recognize handwritten digits from the MNIST dataset, reaching ~85% accuracy.
New York City Taxi Trip Duration (Kaggle competition, top 30% on the leaderboard) (Oct 2016)
Preprocessed the data and implemented AdaBoost in Python.
Neural Net Model in R (March 2017)
Preprocessed the Adult dataset from the UCI Machine Learning Repository.
Created a neural network using the neuralnet and nnet packages in R.
Movie Website (Python and HTML) (Oct 2016)
Developed a responsive static website that plays movie trailers from YouTube.
ACADEMIC PROJECTS
Instacart Market Basket Analysis (Kaggle competition, Big Data project) (May-August 2017)
Implemented Random Forest, Decision Tree, and Gradient Boosting models using Spark MLlib to recommend products.
Developed on Databricks using PySpark and Spark MLlib. All three models achieved an F score of ~0.89.
KDD Cup 2011 (Yahoo Music dataset, Machine Learning project) (March-May 2017)
Implemented user-based and item-based collaborative filtering for music recommendation in R.
Implemented AdaBoost and neural network models in R and compared them.
K-Means Clustering (April 2017)
Implemented K-means clustering on images in Python using OpenCV.
Implemented K-means clustering of tweets from a JSON file in Python, using Jaccard distance.
Implemented 12 different supervised classifiers for comparison in R (March 2017)
Used a multivariate classification dataset with various R libraries in RStudio.
The goal was to study how different classifiers and R libraries differ in operation and accuracy.
Implemented Multinomial Naive Bayes Classifier from scratch in Python (March 2017)
Preprocessed the 20-news-bydate dataset using the NLTK library in Python.
Achieved more than 80% accuracy on the test data.
USA Election Results Map (Statistics project) (Feb 2017)
Preprocessed an election results dataset and created a map of the USA in R using the ggplot2 library to show the results.
Implemented ID3 (Machine Learning) algorithm from scratch in Java (Feb 2017)
Implemented a tree data structure to build the decision tree.
Preprocessed and used a sample dataset; accuracy was more than 85%.
Wink and Shush (finger silence) Detection in OpenCV (Computer Vision project) (March 2017)
Used Haar features in C++ with OpenCV to detect the face, eyes, and lips for wink/shush detection.
Works on live video as well as still pictures.
Work Authorization: F-1 visa.