
Engineer Design

Location:
Irvine, CA
Posted:
November 20, 2013


Resume:

SAYANTAN DASGUPTA

**** **** ***** ****

email id:

********@***.***

UC Irvine, Campus

********.********@*****.*** CA - 92617

Mobile no: 949-***-****

EDUCATION

University of California Irvine 2011-2013

Master in Computer Science

Coursework: 'Fundamental Algorithm', 'Computer Architecture', 'Information

Retrieval', 'Visual Computing', 'Machine Learning', 'Probabilistic

Learning', 'Data Management System', 'Bio-Informatics', 'Bayesian

Statistics'

Indian Institute of Technology, Kharagpur 2003-2008

Bachelor & Master in Electrical Engineering

PROGRAMMING SKILLS

C, C++, Java, HTML, PHP/JavaScript, JSON, PL/SQL, Lucene, Solr, MATLAB,

R, Python, mlpy, numpy, scipy, NLTK, Amazon EC2, OpenMP, Hadoop, Greenplum

MPP, MADlib, GraphLab, MySQL, PostgreSQL, Unix, Mac OS X

DATA SCIENCE INTERN at Greenplum Inc

Neural Network for Sparse Data in MPP Database June - September 2012

. Implemented back-propagation with incremental training for a multilayer

neural network in the Greenplum MADlib library (C++) on Greenplum MPP

clusters (96 nodes, 6-core Intel Xeon 3.3 GHz on each node, 50 GB shared

RAM)

. Used selective weight update for back-propagation to reduce the

complexity of sparse data training.
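
A minimal sketch of one way a selective weight update can work, not the MADlib code itself. It assumes a single sigmoid unit with squared-error loss standing in for the first layer, and a sparse input stored as a dict of non-zero features; the function and variable names are hypothetical. Because the gradient for a weight is the error term times its input, weights attached to zero inputs never change and can be skipped.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sparse_sgd_step(weights, bias, x_sparse, target, lr=0.1):
    """One SGD step for a single sigmoid unit on a sparse input.

    weights  : list of floats, one per input feature (updated in place)
    x_sparse : dict {feature_index: value} holding only the non-zero inputs
    Returns the updated bias.
    """
    # Forward pass touches only the non-zero features.
    z = bias + sum(weights[i] * v for i, v in x_sparse.items())
    out = sigmoid(z)
    # Error term for squared error passed back through the sigmoid.
    delta = (out - target) * out * (1.0 - out)
    # dE/dw_i = delta * x_i, which is zero wherever x_i == 0,
    # so only the weights of non-zero features are updated.
    for i, v in x_sparse.items():
        weights[i] -= lr * delta * v
    return bias - lr * delta


# Hypothetical example: 6 features, only features 1 and 4 are non-zero.
weights = [0.5] * 6
bias = sparse_sgd_step(weights, 0.0, {1: 1.0, 4: 2.0}, target=1.0)
```

The cost of each step is proportional to the number of non-zero inputs rather than to the full input dimension.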

PROFESSIONAL EXPERIENCE

Quantitative Analyst, Credit Suisse Business Analytics India 2010-2010

. Part of the Quantitative Risk Management Team

. Implemented VaR and statistical delta risk hedging models for commodity

futures and options, based on the Black-Scholes-Merton pricing model

. Used futures prices, interest rates, option prices, etc. for each trading

day over a 5-year interval to validate the models

Design Engineer (Texas Instruments India) 2008-2010

. Part of the imaging software development team for OMAP4™, a high

performance multimedia application device for smartphones.

. Designed and implemented the software framework, wrote manuals,

achieved challenging performance targets for the advanced imaging

algorithms, and validated them on various platforms and hardware boards.

. Implemented multimedia frameworks such as OpenMAX for video decoders

such as H.264 and VP6.

MS THESIS WORK

Graphical Model for Recommendation System

. Created a graphical model for collaborative filtering with explicit

feedback, e.g., where feedback is available on a scale of 1-5 or 1-10

. Defined a pairwise Markov network with each user represented as a node

and each pairwise CPD representing the counts of the different ratings

common to the two users, and implemented a parallel version of loopy BP

for inference

. Implemented the code in C++ with parallel inference using OpenMP, and

executed it on an Amazon EC2 machine (32 virtual CPUs with 60 GB RAM)

ACADEMIC PROJECT WORK

Design of a Credit Scoring System based on Random Forest Tree (Kaggle

Leaderboard)

. Designed a credit scoring system based on a random forest to identify

the borrowers who are more likely to default, based on their income,

debt ratio and credit history

. Implemented a random forest based classifier for the classification,

which gave around 94% accuracy under cross-validation and

an 86% ROC rate

. Our team, UCI_combination, placed 6th out of 925 teams in the Kaggle

"Give Me Some Credit" competition.

Design of a Distributed Database Management System (CiteSeer Link)

. Designed and implemented a homogeneous distributed database, using MySQL

servers as individual localized database servers and Java-based

software acting as a middle layer.

. The data stored in MySQL servers at different locations could be

fetched with a single query through the DDBMS

. The middle layer took queries input by users, parsed each query, sent the

fragmented queries to individual servers, retrieved the results from them,

joined them and returned the final result to the user. The query

processing remained transparent to the user, as if all of the data were

located on a single server.
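
A rough sketch of the scatter-gather step of such a middle layer, written in Python rather than the project's Java and using in-memory SQLite databases as stand-ins for the MySQL servers. It assumes horizontal fragmentation (each site holds some rows of the same table), so partial results are simply merged; the table, column and function names are hypothetical, and query parsing is left out.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory SQLite database standing in for one MySQL server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (id INTEGER, title TEXT, year INTEGER)")
    conn.executemany("INSERT INTO papers VALUES (?, ?, ?)", rows)
    return conn


def run_everywhere(sites, sql, params=()):
    """Send the same sub-query to every site and merge the partial results."""
    merged = []
    for conn in sites:
        merged.extend(conn.execute(sql, params).fetchall())
    # Sort so the caller sees one ordered result set, as if from one server.
    return sorted(merged)


# Two hypothetical sites, each holding a fragment of the same table.
sites = [
    make_site([(1, "Paper A", 2009), (2, "Paper B", 2011)]),
    make_site([(3, "Paper C", 2012), (4, "Paper D", 2008)]),
]
result = run_everywhere(
    sites, "SELECT id, title FROM papers WHERE year >= ?", (2010,)
)
```

The caller issues one query and receives one result list without knowing which site held which rows.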

Parsing, Alignment & Modeling of Genome Sequence (codebase)

. Parsed genome expressions in FASTA file format to detect the genes

present in a genome, using Python's regular expression libraries

. Globally aligned two genes using the dynamic programming

(Needleman-Wunsch) algorithm.

. Implemented the Viterbi algorithm and posterior decoding in Python,

using a technique similar to entity resolution, for detecting the type

and origin of the genes.
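
For the global alignment step above, a compact sketch of Needleman-Wunsch. The scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, not the values used in the project.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


alignment = needleman_wunsch("ACGT", "AGT")
```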

Automated Rating Prediction of Yelp Reviews (slides)

. Built an ordinal regression model for predicting the rating of a Yelp

review, on a scale of 1 to 5, from the review text

. The ordinal regression was done with SVMs. We fit one SVM hyperplane

between each pair of successive ratings, for a total of 4 SVMs to

separate all 5 ratings

. The entire Yelp dataset contained around 230,000 ratings. We performed

10-fold cross-validation, using 90% of the data for training and 10% for

validation. The worst-case prediction accuracy was 86%, with an RMSE

of 0.42.
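
A small sketch of how 4 binary boundaries can cover 5 ordered ratings, assuming the common "is the rating above k?" encoding; the exact encoding used in the project is not stated here, and the function names are hypothetical. The SVM training itself is omitted: each binary decision below would come from one trained classifier.

```python
RATINGS = [1, 2, 3, 4, 5]
# One boundary between each pair of successive ratings: 1|2, 2|3, 3|4, 4|5.
THRESHOLDS = RATINGS[:-1]


def binary_targets(rating):
    """Training labels for the 4 binary problems: is the rating above k?"""
    return [1 if rating > k else 0 for k in THRESHOLDS]


def combine(decisions):
    """Recover a rating from 4 binary decisions by counting 'above' votes."""
    return 1 + sum(decisions)


targets_for_3 = binary_targets(3)       # [1, 1, 0, 0]
rating_back = combine(targets_for_3)    # 3
```

Counting votes keeps the prediction inside the 1 to 5 range even if the 4 classifiers disagree with each other.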

Design of Search Engine using Lucene

. Designed a Lucene-based space-optimized inverted index for our department

website (ics.uci.edu)

. Implemented positional indexing to enable phrase queries

consisting of any number of terms

. Used Anchor Text Mining to improve the NDCG from 0.4 to 0.67

. Developed a JSP-based GUI for search query input
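
To illustrate the positional indexing idea above, a toy in-memory sketch in Python. It is not Lucene's index format or API; the whitespace tokenization and the function names are simplifying assumptions.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of documents containing the terms at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    hits = []
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # The k-th following term must occur exactly k positions later.
            if all(
                start + k in index[t].get(doc_id, ())
                for k, t in enumerate(terms[1:], 1)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical documents.
docs = {
    1: "machine learning for search",
    2: "learning machine",
    3: "statistical machine learning",
}
index = build_index(docs)
matches = phrase_search(index, "machine learning")
```

Storing positions is what lets a phrase of any length be checked: a plain term-to-document index would also match document 2, where the words appear in the wrong order.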

EXTRACURRICULAR ACTIVITIES

. Participated in raising funds for admitting children from slum areas to

schools, as a member of Texas Instruments India Foundation (TIIF)

. Participated in teaching children in slum areas on the outskirts of

Bangalore, as a volunteer in the Teach India program by the Times of India


