Engineer SQL Server

Location:

Gainesville, FL

Posted:

November 09, 2012

Resume:

Niketan Pansare

Phone: 352-***-****

Website: http://www.cs.rice.edu/~np6/

Email: ***@****.***

Blog: http://niketanblog.blogspot.com/

Objective

To obtain a research internship in the field of approximate query processing and applied machine learning (especially for large-scale systems) for Summer 2013.

Education

Ph.D. in Computer Science, Rice University, USA.

2009 - Present

Master of Science in Computer Engineering, University of Florida, USA.

2007 - 2009

Bachelor of Engineering in Information Technology, VJTI, Mumbai.

2002 - 2006

Post Graduate Diploma in Embedded Systems, Electronics Corporation of India Limited, Mumbai.

2003 - 2004

Experience

Research intern, IBM Research, India.

Summer 2011

- Developed a novel topic model for spoken language (STM) that explicitly takes into account uncertainties arising in speech-to-text translation.

- Link: http://www.cs.rice.edu/~np6/Papers/SpokenTopicModel.pdf (ICDM '12 paper).

Software Development Engineer Intern, Microsoft, Seattle.

Summer 2008

- Developed Table Analysis Tool for Cloud, a set of canned data mining tasks for non-expert users, with Microsoft Excel as the front-end and SQL Server in the cloud.

- Link: http://tinyurl.com/9ph9br3

Software Engineer, MAQSoftware, Mumbai.

2006 - 2007

- Developed enterprise web applications using C#, ASP.NET Ajax and XML

- Developed data warehouse (Usage Reporting) for Microsoft using C# and SQL Server BI

Publications

Pansare N, Jermaine CM, Haas P, Rajput N. Topic Models over Spoken Language. IEEE International Conference on Data Mining (ICDM '12), December 2012.

2012

Pansare N, Borkar V, Jermaine CM, Condie T. Online Aggregation for Large MapReduce Jobs. Proc. VLDB Endow., August 2011.

2011

Sahay S, Rajput N, Pansare N. Social Ranking for Spoken Web Search. CIKM 2011.

2011

Arumugam S, Dobra A, Jermaine CM, Pansare N, Perez L. The DataPath system: a data-centric analytic processing engine for large data warehouses. ACM SIGMOD '10, June 2010.

2010

Pansare N. Multi-query optimization in the Datapath system. Master's thesis, 2009. University of Florida, Gainesville, USA.

2009

Tools

Programming languages: C, C++, Java, R, C#, Scheme, Common Lisp

Technologies: Hadoop, SQL Server BI, Servlet, JSP, Ajax, ASP.NET

Projects

Spoken Topic Model (STM)

- Implemented STM (see ICDM '12 paper) in C++ using the GNU Scientific Library (GSL).

- Modified the CMU Sphinx4 speech-to-text engine and generated data by providing it with real-world audio files (TedTalks/Yale).

- Tested the effectiveness of STM by comparing it to Latent Dirichlet Allocation using off-the-shelf classifiers (SVMlight, SVMmulticlass and Weka).

Online Aggregation (OLA)

- Modified Hyracks (a Hadoop-like system) to provide the necessary machinery for OLA.

- Dealt with the inspection paradox in a principled way to provide unbiased estimates, using a Bayesian model implemented in C++.

- Tested the overall system using a Wikipedia traffic dataset.

Datapath system

- Data-centric database implemented from the ground up and tested on a 10TB-scale TPC-H data set.

- Developed a multi-query optimizer in C++ to provide data-centric query plans.

Non-research/Personal

- Yadmt: Tool to find the best classifier for your dataset using statistical tests suggested in the machine learning literature. Link: http://code.google.com/p/yadmt/.

- Voca: A desktop app (written in Java) that is designed to run in the background, with minimal user interaction/interference, and that allows users to issue voice commands. Link: https://www.facebook.com/voca.desktop.

For detailed listing of my projects/courses, see http://www.linkedin.com/in/niketan.

References

Peter Haas

IBM Research

Chris Jermaine

Rice University
