Data Assistant

Location:
Buffalo, NY
Posted:
February 08, 2019

Resume:

SUHIT DATTA

716-***-**** ac8fms@r.postjobfree.com https://www.linkedin.com/in/suhitdatta GitHub: suhitd1729

EDUCATION

University at Buffalo, The State University of New York
Master of Science, Engineering Sciences (Data Science), GPA: 3.83/4.0 (Aug ’17 – Dec ’18)

Indian Institute of Technology (Indian School of Mines), Dhanbad
Bachelor of Technology, Electrical Engineering, GPA: 8.21/10 (Jul ’11 – May ’15)

EXPERIENCE

Delaware North, Buffalo, New York
Title: Data Science Intern (Aug ’18 – Dec ’18)

Designed a multi-layered neural network to predict the volume of concession sales at sporting venues, achieving an R² value of 0.947.

Performed fan interest mining for supporters of NFL teams using association rules. Collected data through the Tweepy (Twitter) and PRAW (Reddit) APIs and identified key interests. Performed sentiment analysis of their Yelp/OpenTable reviews using NLP.

Developed a Random Forest model that predicts attendance at NHL events to within 10% of the actual turnstile count.

Re-engineered the transactional data extraction process from POS business units via API and loaded the data into Amazon WorkSpaces servers using Python scripts. Automation reduced overall processing time by 30%.

Oracle Financial Services Software Ltd., India
Title: Associate Consultant (Software) (Aug ’15 – Jul ’17)

Worked on the Oracle Revenue Management and Billing (ORMB) platform, developing modules to analyze and process gigabytes of billing data efficiently.

Responsible for data migration: designed an interface that reduced processing time by about 50 percent compared with the legacy system, and received a token of appreciation for a timely go-live.

Oversaw multiple project phases, from requirements gathering through production implementation, using PL/SQL, Java, JavaScript and HTML.

Developed multiple financial and banking algorithms for clients such as Deutsche Bank and Australian Postal Corporation, working directly with them to gather requirements. Served as a point of contact between the client and the off-site team.

PROJECTS

Action cameras - Evaluating image quality and suitability for Machine Learning
Project Assistant, Summer 2018
Extracted scenes from video recordings, retaining the auxiliary information about the video capture event, and fed the key frames into a Generative Adversarial Network (GAN) model built with PyTorch. Analyzed the impact of the sampling strategy on the quality of the generated images using Kernel Maximum Mean Discrepancy, Structural Similarity and 1-Nearest Neighbor methods. Work done in collaboration with Prof. Chandola and submitted as part of a research paper to the ACM FAT* 2019 and AIES 2019 conferences.

Spotify Music Recommendation using Clustering Algorithms
Spring 2018
Analyzed songs included in Spotify playlists and developed an experimental recommendation system using audio features extracted from the Spotify and Gracenote Web APIs. Transformed the data using t-Distributed Stochastic Neighbor Embedding (t-SNE) for dimensionality reduction and fed it into Hierarchical Clustering, K-Means and K-Medoids based models.

NY Times News Classification and Twitter Feed Trend Analysis using Big Data
Spring 2018
Used BeautifulSoup with the NYTimes and Twitter APIs to scrape data, and the NLTK library in Python for preprocessing. Used Hadoop MapReduce to extract key content and displayed the results as a word cloud using D3.js. For the news data, used Spark MLlib for news classification with Logistic Regression, Random Forests and Naïve Bayes, achieving over 80% accuracy.

Human Resource Analytics (Kaggle)
Fall 2017
Developed models in R to predict an employee's satisfaction level and retention using regression and classification techniques such as linear modelling, bagging, trees and logistic regression. Also identified the key parameters that affect satisfaction level.

Exploratory Data Analysis of Chicago Taxi Trips
Fall 2017
Obtained data from the Chicago Data Portal, then analyzed and drew inferences about trip durations, trip costs, taxi companies and the busiest community areas. Used MySQL and Python (scikit-learn, NumPy, pandas).

RELEVANT COURSES AND SKILLS

Courses: Linear Algebra, Machine Learning, Statistical Data Mining, Data Intensive Computing, Data Structures, Databases
Languages: Python, R, MATLAB, Java, C, Oracle, MySQL, JavaScript, MongoDB, Hadoop, Spark, SAS, AWS Redshift
Tools: Jupyter, RStudio, Eclipse, SQL Developer, Tableau, Docker, WinSCP, PuTTY, SAS Studio, Jira, WebLogic
ML: Regression, Decision Trees, SVM, PCA, Neural Network, Deep Learning, NLP, Recommendation Systems, Computer Vision
Packages: NumPy, Pandas, Scikit-Learn, Matplotlib, Seaborn, Folium, PyTorch, TensorFlow, Keras, Mlxtend, CV2, PIL


