Data Computer Science

Location:

Dallas, TX

Posted:

January 18, 2018

Resume:

Resume of Piyush Supe

Piyush Supe

Dallas, TX, 75252 469-***-**** ******.****@********.*** www.kaggle.com/pgsupe https://github.com/impiyushs LinkedIn

EDUCATION

Master of Science in Computer Science, CGPA: 3.61/4.0 (Expected May 2018)

The University of Texas at Dallas, Richardson, Texas.

Coursework: Big Data, Machine Learning, Statistical Methods for Data Science, Computer Vision, Design and Analysis of Algorithms, Database Design, Web Programming Languages.

Bachelor of Technology in Computer Science and Engineering, CGPA: 7.98/10.0 (2012 – 2016)

Dr. Babasaheb Ambedkar Technological University, India.

SKILLS

Languages: R, Python, Java, Scala, C++

Databases: SQL, NoSQL (MongoDB, Cassandra, HBase)

ML Packages: R libraries, scikit-learn, Spark MLlib, TensorFlow

Big Data: HDFS, MapReduce, Spark, Scala, Pig, Hive

Data visualization: R-ggplot2, Matplotlib, MS Excel

Applications: RStudio, Orange, IDLE, PyCharm, Eclipse, Oracle SQL Developer

NON-ACADEMIC PROJECTS

Digit Recognizer (Python, TensorFlow) (April 2017)

Implemented a softmax regression model in Python to recognize handwritten digits on MNIST data. Accuracy was ~85%.

New York City Taxi Trip Duration (Kaggle competition, top 30% on leaderboard) (Oct 2016)

Preprocessed data and implemented AdaBoost in Python.

Neural net model in R (March 2017)

Preprocessed the Adult dataset from the UCI Machine Learning Repository.

Created a neural network using the neuralnet and nnet packages in R.

Movie website (Python and HTML) (Oct 2016)

Developed a responsive static website that plays movie trailers from YouTube.

ACADEMIC PROJECTS

Instacart Market Basket Analysis (Kaggle competition, Big Data project) (May-August 2017)

Implemented Random Forest, Decision Tree and Gradient Boosting using Spark MLlib to recommend products.

Developed on Databricks using PySpark and Spark MLlib. The F score was ~0.89 for all three models.

KDD Cup 2011 (Yahoo music dataset, Machine Learning project) (March-May 2017)

Implemented collaborative filtering (user-based and item-based) for music recommendation in R.

Implemented AdaBoost and NeuralNet in R and compared the models.

K-means clustering (April 2017)

Implemented K-means clustering on images in Python using OpenCV.

Implemented the K-means algorithm in Python to cluster tweets from a JSON file using Jaccard distance.

Implemented 12 different supervised classifiers in R for comparison and study (March 2017)

Used a multivariate classification dataset with various R libraries in RStudio.

The main goal was to study how the different classifiers and R libraries differ in operation and accuracy.

Implemented Multinomial Naive Bayes Classifier from scratch in Python (March 2017)

Preprocessed the 20-news-bydate dataset using the NLTK library in Python.

Accuracy was more than 80% on test data.

USA Election results map (Statistics project) (Feb 2017)

Preprocessed the election results dataset and created a map of the USA in R using the ggplot2 library to show the election results.

Implemented ID3 (Machine Learning) algorithm from scratch in Java (Feb 2017)

Implemented a tree data structure to build the decision tree.

Preprocessed and used a sample dataset. Accuracy was more than 85%.

Wink, Shush (finger silence) detection in OpenCV (Computer Vision project) (March 2017)

Used Haar features to detect the face, eyes, and lips for wink/shush detection in C++ using OpenCV.

Implemented detection on live video as well as still pictures.

Work Authorization: F1 visa.
