Project Data

Location:
Milpitas, CA, 95035
Salary:
110000
Posted:
June 19, 2017

Resume:

Hemalatha Vakade

Milpitas, CA +1-408-***-**** ****.******@*****.***

https://github.com/hemavakade https://www.linkedin.com/in/hemalathavakade

Objective

I am a Data Scientist with an engineering background and 8 years of prior consulting experience, currently looking for full-time opportunities. I am interested in working on projects that use Machine Learning, Natural Language Processing and Deep Learning techniques.

Experience

DATA SCIENCE INTERN NFLATE INC. FEB 2017 – MAY 2017

SIMILAR IMAGE RETRIEVAL:

nFlate Inc. provides a prescriptive analytics engine that automates the process of revenue optimization for online retailers via its proprietary app 'See It Buy It' (SiBi).

I worked on an image retrieval project that involved retrieving images from the SiBi database that were similar to a query image.

The initial step was to determine the dominant colors of each image and normalize them to a standard palette, which made it possible to represent the images in the database as color histograms. These histograms were stored for later use.

The project used Convolutional Neural Networks to extract and cache image features, followed by an Approximate Nearest Neighbors model to rank the images by relevance. The relevant images were then re-sorted by how closely their stored histograms matched the histogram of the query image. I used the TensorFlow Python API for this project and ran the programs on a GPU.
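The final re-ranking step can be illustrated with a minimal sketch. The similarity measure (histogram intersection), the function names and the toy three-bin histograms below are assumptions chosen for illustration; they are not the code or the metric used at nFlate.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: the sum of bin-wise minima.

    Returns 1.0 for identical normalized histograms and smaller values
    as the color distributions diverge.
    """
    return sum(min(a, b) for a, b in zip(h1, h2))


def rerank_by_histogram(query_hist, candidates):
    """Sort (image_id, histogram) pairs by decreasing similarity to the query."""
    return sorted(
        candidates,
        key=lambda item: histogram_intersection(query_hist, item[1]),
        reverse=True,
    )


# Hypothetical data: candidates as returned by a nearest-neighbor search,
# each paired with its precomputed (stored) color histogram.
query = [0.5, 0.3, 0.2]
candidates = [
    ("img_a", [0.1, 0.1, 0.8]),
    ("img_b", [0.5, 0.2, 0.3]),
    ("img_c", [0.3, 0.5, 0.2]),
]
ranked = [image_id for image_id, _ in rerank_by_histogram(query, candidates)]
```

Because the histograms are computed once and stored, only the cheap bin-wise comparison runs at query time.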

PRINCIPAL CONSULTANT ORACLE INC. (OFSS LTD.) JUL 2006 – SEP 2014

CLIENT: SILICON VALLEY BANK, SANTA CLARA, CA:

Consolidated new interface and reporting requirements, and performed thorough analysis of the functional workflow and code to provide optimized solutions without affecting the existing workflow.

Worked on production fixes and deployed complicated interface requirements using performance-tuned SQL queries.

Interacted with the core business group at the bank to gather requirements, and to analyze and fix issues encountered during implementation cycles such as QA and User Acceptance Testing.

CLIENT: CITIBANK

Contributed to the design and development of time-sensitive projects in critical financial modules such as General Ledger, CASA (Current Accounts/Savings Accounts), Payment Workflow and Funds Transfer.

Supervised and coordinated with team members on time-sensitive projects. Performed peer reviews of project specifications.

Involved in requirement analysis and design optimization. Prepared and presented key documents such as Functional Specifications, Design Specifications, Program Specifications and Unit Test Plans to Development, QA and Testing teams.

Interacted with clients to clarify requirements and suggest necessary changes. Awarded and recognized four times for contributions to the product's periodic customization projects.

Projects

CLASSIFYING CAMERA TRAP IMAGES OF WILD ANIMAL SPECIES USING DEEP CONVOLUTIONAL NETWORKS:

The objective of this work was to build an image classification system for camera-trap images of wild animals. This project was carried out on data provided by Conservation International. Deep Convolutional Neural Networks (CNN) were used, and different regularization techniques such as Batch Normalization and Dropout were explored.

The neural network was built using the Keras API with the TensorFlow and Theano backends. The scikit-learn API was used to compute the precision, accuracy and recall of the predictions on the test data. Plotly's Python API was used to plot the results from the network.

The neural network was trained on a CUDA-enabled GPU on a g2 instance on Amazon Web Services (AWS). The neural network achieved an accuracy of about 96%.

SARCASM DETECTION IN COMMENTS:

The goal of the project was to detect the presence of sarcasm in comments from social media platforms such as Reddit and Twitter using machine learning algorithms and NLP. Machine learning algorithms from the scikit-learn library, such as Multinomial Naive Bayes, Random Forest and Support Vector Machines (SVM), were used.

Feature Engineering was used to provide a context for better detection of sarcasm.

FORECASTING VOLATILITY USING GARCH MODEL:

The goal of this project was to understand an advanced Time Series topic, Generalized Autoregressive Conditional Heteroscedasticity (GARCH), and apply it to financial data.

The dataset used was the closing prices for the S&P 500 stock index obtained from Yahoo Finance. I used GARCH(1,1) to model the volatility of the returns of the closing prices.

The final model had an AIC (Akaike Information Criterion) of 6.6856 and an RMSE (Root Mean Square Error) of 0.08.
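For reference, GARCH(1,1) models the conditional variance as sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The sketch below shows only this recursion; the parameter values, the zero-mean assumption, the seeding with the sample variance and the sample returns are hypothetical and are not the fitted values or data from the project.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances under a GARCH(1,1) recursion.

    Assumes zero-mean returns and seeds the recursion with the
    sample variance of the series (a common but arbitrary choice).
    """
    sigma2 = [sum(r * r for r in returns) / len(returns)]
    for r in returns[:-1]:
        # Next variance = constant + reaction to the last shock
        #                 + persistence of the last variance.
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


# Hypothetical daily returns and parameters, for illustration only.
returns = [0.01, -0.02, 0.015, -0.005]
omega, alpha, beta = 1e-6, 0.1, 0.85
variances = garch11_variance(returns, omega, alpha, beta)
```

In practice the parameters are estimated by maximum likelihood, and alpha + beta < 1 keeps the variance process stationary.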

Education

MASTER OF SCIENCE, DATA SCIENCE UNIVERSITY OF NEW HAVEN

May 2017

Related coursework:

Advanced Linear Algebra for Data Scientist, Data Exploration, Feature engineering and Statistics for Data Scientists, Machine Learning and Data Analysis, Advanced Machine Learning, Unstructured data and Natural Language Processing, Distributed and Scalable Data Engineering, Advanced Statistics, Visualization and Communication, Data Science Entrepreneurism.

BACHELOR OF ENGINEERING, TELECOMMUNICATION VISVESVARAYA TECHNOLOGICAL UNIVERSITY

May 2006

Project Work:

Image Enhancement of Images taken by an Airborne Video tracker, using DSP processor:

This project was carried out at Defense Research and Development Organization (DRDO), India's military R&D agency. Algorithms such as Centroid Tracking and Image processing algorithms such as Histogram Equalization written in ASM were executed on a DSP processor and results verified using MATLAB code.

Skills & Abilities

DATA SCIENCE AND MACHINE LEARNING

Feature Engineering, Data Cleaning, Hypothesis Testing, Supervised and Unsupervised Learning, Regression, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Classification, Clustering, Support Vector Machines (SVM), Decision Trees and Random Forest, Time Series Analysis, Deep Learning, Recommender Systems, Naive Bayes, Churn Analysis, Non-negative Matrix Factorization (NMF).

TOOLS

Python, R Programming, SQL, JS, CSS, HTML

TECHNOLOGIES

Pandas, scikit-learn, TensorFlow, Keras, MySQL, PostgreSQL, NoSQL, UNIX, AWS, Kafka, Cassandra, Spark, HDFS, Hadoop.


