Project Engineer

Location:
Buffalo, NY
Posted:
November 23, 2012

Resume:

Pradipto Das

Research Engineer / PhD Student, SUNY Buffalo, USA

Address: 16 Flickinger Court, Apt. B, Amherst, NY 14228

Email: *****@*******.***, Web: www.buffalo.edu/~pdas3, Phone: 716-***-****

Core Proficiencies: Well-rounded technical knowledge of applying machine learning and natural language processing techniques to artificial intelligence problems involving multimodal data

Professional Summary

Data Mining / Text Mining Expertise:

Over 5 years of hands-on experience in exploratory data analysis, with a focus on unsupervised graphical models for problems involving topical analysis of data consisting of different modalities such as text and video

Video-to-text and text-to-video translation without expensive manual frame-by-frame annotation; video event detection using supervised classification techniques such as support vector machines and logistic regression

Developing probabilistic browsing models from scratch for unstructured and semi-structured documents

Developing state-of-the-art multi-document summarization systems

Exposure to cluster computing through MPI and Hadoop's MapReduce

Career Chronology and Accomplishments

I. Research Engineer, CSE Department, SUNY Buffalo, Buffalo, NY, USA (Spring 2011 to present)

A. Project: Natural Language Based Multimedia Event Detection/Recounting (MED/MER)

Successfully completed a large project on translating videos to keywords and back without using expensive video annotation efforts

Accomplishments:

Eliminated the need for expensive frame-by-frame manual text annotations to describe the major contents of a video

Enhanced video clustering and search through natural language descriptions

System ranked first in the TRECVID 2012 Multimedia Event Recounting track for matching videos on a given abstract event to specific event descriptions based purely on predicted text

Joint research in collaboration with Honeywell ACS Labs, MN, Kitware Inc., NY, Stanford University, Simon Fraser University and Georgia Tech University [Project funded by IARPA's ALADDIN program]

B. Project: Exploratory Data Analysis and Multi-document Summarization using Topic Models

Successfully formulated and implemented from scratch bi-perspective topic models that allow modeling of ubiquitous document representations: documents that incorporate both word-level annotation classes and document-level tags

Implemented a summarization system using bi-perspective topic models and document-centric linguistic features that can summarize multiple documents into one short bulleted-list summary

Accomplishments:

Raised system performance to be on par with state-of-the-art newswire summarization systems, as per evaluations based on Guided Summarization datasets from the Text Analysis Conference

C. Project: Mining On-line E-learning Discussion Forums for Non-topical and Topical Analysis

Devised an algorithmic solution to identify textbook concepts in e-learning discussion forum posts, thereby mapping the forum posts to the table of contents in textbooks

Accomplishments:

Successfully applied topic models to discriminate between on-textbook and off-textbook content in the discussion forums using domain knowledge from e-textbooks

Gained experience with Hadoop and Hive by writing simple data pre-processing methods

[Project funded by Apollo Group Inc., University of Phoenix Distance Learning School]

II. Research Intern, Janya Inc., Amherst, NY, USA (Summer 2010)

A. Project: Gibbs sampling based Topic Modeling Framework for the Semantex Text Analytics Processor Pipeline

Improved product capabilities by including corpus based solutions in addition to document/sentence centric models

IV. Visiting Research Fellow, Center for Soft Computing Research, ISI, Kolkata, India (Aug 2005 to Jul 2006)

A. Project: DIET: Directional Entropy based Corner Detection in Gray-scale Images [Spring 2006]

Implemented different entropy measures on gradients surrounding edges in an image to detect corners

Accomplishments:

The proposed method matched the best geometric corner detection methods in detection performance, but with lower compute time

V. Assistant Systems Engineer, Tata Consultancy Services Ltd. (TCS), Kolkata, India (Aug 2004 to Jul 2005)

A. Project: TCS Kolkata Intranet Portal: Connect Kolkata

Developed from scratch using J2EE architecture; the project was implemented using the Jakarta Struts 1.1 framework, JSP, and Oracle 9i as the relational database

VI. Project Intern, Machine Intelligence Unit, Indian Statistical Institute (ISI), Kolkata, India (Spring 2004)

A. Project: Statistical Outlier Detection in Large Multivariate Datasets

Utilized Tukey's biweight estimator, robust Mahalanobis distances, and non-parametric Parzen-window-based unsupervised density estimation to find outliers lying in the tail of the distance-from-median distribution of the data

Education

University at Buffalo, State University of New York, (SUNY Buffalo) USA

PhD, Computer Science, Aug 2006 to present (Expected: Spring 2013) [GPA 3.678/4.0]

West Bengal University of Technology, Kolkata, India

MCA (Master of Computer Applications), July 2004 [GPA 8.84/10.0]

Jadavpur University, Kolkata, India

BS (Honors), Mathematics, July 2001 [First Division]

Publications

[6] P. Das, R. K. Srihari and J. J. Corso, Translating Related Words to Videos and Back through Latent Topics, Proceedings of the Sixth International Conference on Web Search and Data Mining, WSDM, Rome, Italy, February 4-8, 2013 [oral presentation]

[5] P. Das and R. K. Srihari, Using Tag-Topic Models and Rhetorical Structure Trees to Generate Bulleted List Summaries [submitted] [shorter version selected for oral presentation and appears in Proceedings of NIST Text Analysis Conference, Nov 2011, Gaithersburg, MD, www.nist.gov/tac/publications/2011/participant.papers/UBSummarizer.proceedings.pdf]

[4] P. Das, R. K. Srihari and Y. Fu, Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives, in Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM), Nov 2011, Glasgow, Scotland [oral presentation]

[3] P. Das and R. K. Srihari, Learning To Summarize using Coherence, in Proceedings of NIPS Workshop on Applications for Topic Models: Text and Beyond, Dec 2009, Whistler, Canada [poster presentation]

[2] P. Das and R. K. Srihari, Utterance Topic Models for Generating Coherent Summaries, in Proceedings of NIST Text Analysis Conference, Nov 2009, Gaithersburg, MD [oral presentation]

[1] P. Das, R. K. Srihari and S. Mukund, Discovering Voter Preferences in Blogs using Mixtures of Topic Models, in Proceedings of the Third Workshop on Analytics for Noisy Unstructured Text Data, Jul 2009, Barcelona, Spain [oral presentation]

Computer Skills

Programming Languages: Java, C++, MATLAB

Distributed and Large Data Processing Experience: Hadoop, Hive, MPI, Lucene

Scripting Experience: Shell scripts, Perl

Open source software: Stanford CoreNLP Suite, personal code from www.buffalo.edu/~pdas3/software/software.html

Teaching Experience and Relevant Coursework

Teaching Assistant, CSE Dept., SUNY Buffalo, Buffalo, NY, USA (Fall 2006 to Fall 2010)

Relevant Courses as a Teaching Assistant

Information Retrieval (CSE535) [Fall 2009/10]

Machine Learning (CSE576) [Fall 2008]

Relevant Course Projects completed during Coursework

Parallel Latent Dirichlet Allocation (in C using MPI), Instructors: Vipin Chaudhary and Matthew Jones [Fall 2008]

Search engine architecture from scratch (in C++), Instructor: Rohini K. Srihari [Spring 2008]

Peer-to-peer file sharing application (in Java), Instructor: Murat Demirbas [Fall 2007]

Neural networks for spam classification (in MATLAB), Instructor: Matthew Beal [Fall 2006]

Awards and Honors

Research/Teaching Assistantship for PhD studies at SUNY Buffalo, Sep 2006 to present

Fellowship for the post of Visiting Research Fellow at Indian Statistical Institute, Kolkata, India, Aug 2005 to Jul 2006

Certificate of merit and memento for standing First Class 2nd in MCA, Kolkata, India, Jan 2005

Top Performer Award for batch (T-47) in the Initial Learning Program at TCS, Trivandrum, India, Nov 2004

Govt. of India National Scholarship based on BS results at Jadavpur University, Kolkata, India, Aug 2003

References: Available on request
