Smruthi Mukund
E-mail: abo9p2@r.postjobfree.com
Phone: 508-***-****
Website: http://www.acsu.buffalo.edu/~smukund/
Research Interests
Areas: Information Retrieval, Natural Language Processing (NLP), Sentiment Analysis,
Emotion Detection, Recommendation Systems, Text Segmentation, Sequence Labeling,
Machine Learning
Dissertation
Title: NLP framework for non-topical text analysis in resource poor languages
Solved the problem of emotion detection in a resource-poor language, Urdu. The
state-of-the-art Natural Language Processing tools required to perform this task on newswire
data were implemented using statistical machine learning techniques based on resource sharing,
bootstrap learning and transfer learning, with little or no dependency on annotated
resources. As an application of this work, also analyzed the output of the Urdu emotion
detection system and identified interesting linguistic cues that aid understanding of the
linguistic relativity hypothesis.
Education
PhD, Computer Science, University at Buffalo, State University of New York, Feb. 2012
MS, Computer Science, University at Buffalo, State University of New York, Feb. 2008
BS, EE, Visvesvaraya Technological University, Bangalore, India, Jun. 2003
Key Accomplishments
Instrumental in procuring research funding from Apollo Group (University of
Phoenix School of Distance Learning)
Led a team of four to win the Panasci Technology Entrepreneurship Competition
2011
Research Experience (at SUNY Buffalo)
Project 1 - Detecting student roles in eLearning discussion forums (Fall 2010 - Spring
2011): Developed an active learning framework to detect student roles in eLearning
discussion forums. A wrapper induction technique combined with an information gain criterion
was used for feature selection to bootstrap the learning process.
Project 2 - Generating a generic Part-Of-Speech (POS) tagger for Urdish data in Latin
script (Fall 2009): Developed a Support Vector Machine (SVM) based POS tagger that
automatically tags English words with English POS tags and Urdu words with Urdu POS
tags. This tagger uses a voting scheme to suitably combine the confidence measures from two
different SVM taggers.
Project 3 - Detecting promise statements in political campaign speeches (Fall 2009 -
Spring 2010): Developed a Vector Space Model (VSM) that detects promise statements
made by political leaders in their campaign speeches. Features that were explored include
linguistic hedges, VerbNet features and rhetoric features that help identify, with high
accuracy, statements that include a promise, pledge or conviction.
Project 4 - Sentiment Analysis on Movie Reviews (Fall 2008): Developed a VSM that
automatically classifies a collection of movie reviews as positive or negative based on
the opinions expressed.
Master's Project - Multilingual Information Retrieval (Spring 2007): Developed a
Part-Of-Speech tagger for Hindi using Bikel's Ergodic Hidden Markov Model (HMM) that explores
linguistic and morphological details of the language.
Publications
Journals
An Information Extraction System for Urdu - A Resource Poor Language. Special
Issue on Information Retrieval for Indian Languages, 2010, Smruthi
Mukund, Rohini K. Srihari and Erik Peterson
Non-Topical Text Analysis in Urdu Newswire Data. Submitted to the Journal of
Computational Linguistics, Smruthi Mukund, Rohini K. Srihari
Conferences/Workshops
Segmenting eBay Item Descriptions into Coherent Segments. Workshop on
Multilingual OCR and Analytics for Noisy Unstructured Data, 2011, Smruthi
Mukund, Nitin Indurkhya, Neel Sundaresan
Using Sequence Kernels to Identify Opinion Entities in Urdu. Proceedings of the
Fifteenth Conference of Natural Language Learning, 2011, Smruthi Mukund,
Debanjan Ghosh and Rohini K. Srihari
A Vector Space Model for Subjectivity Classification in Urdu aided by Co-Training.
Proceedings of the 23rd International Conference on Computational Linguistics, 2010,
Smruthi Mukund and Rohini K. Srihari
Using Cross-Lingual Projections to Generate Semantic Role Labelled Annotated
Corpus for Urdu - A Resource Poor Language. Proceedings of the 23rd International
Conference on Computational Linguistics, 2010, Smruthi Mukund, Debanjan Ghosh
and Rohini K. Srihari
Context Aware Transliterations of Urdu Names. Proceedings of the Seventh
International Conference on Natural Language Processing, 2009, Smruthi Mukund,
Erik Peterson and Rohini K. Srihari
Discovering Voter Preferences in Blogs using Mixtures of Topic Models.
Proceedings of the Third Workshop on Analytics for Noisy Unstructured Text Data,
2009, Pradipto Das, Rohini K. Srihari and Smruthi Mukund
NE Tagging for Urdu based on Bootstrap POS Learning. Proceedings of the Third
International Workshop on Cross Lingual Information Access: Addressing the
Information Need of Multilingual Societies (CLIAWS3), 2009, Smruthi
Mukund and Rohini K. Srihari
Non-Writer Specific Handwriting Generation for the CAPTCHA Application.
Proceedings of IEEE Signal Processing Society Western New York Image Processing
Workshop, 2007, Achint Oommen Thomas, Amalia Rusu, Smruthi Mukund,
and Venu Govindaraju
Press Coverage
Digitizing Urdu: Analysis of Documents, Social Networks in Pakistan's National
Language - by Audrey Watters, ReadWriteWeb.com, March 7, 2011, published by
The New York Times
Digitizing Urdu - by Ellen Goldbaum, University at Buffalo's News Center, March 3,
2011
News-oriented social network wins $10,000 tech competition - by Vincent Sherry,
April 27, 2011, published by Buffalo News, Business section
Entrepreneurship Competition Awards more than $10,000 to Info Tech Company -
by Jacqueline Ghosen, University at Buffalo's News Center, May 3, 2011
UB entrepreneurship competition announces winner - published in Buffalo Rising
and High Beam Research, May 12, 2011
New Trends in Data Mining and Analysis - by Rohini Srihari, United States Institute
of Peace (USIP), US State Department in Washington DC, Sept 16, 2011
Internships
Research Intern, eBay Research Labs, CA (May 2011 - Aug 2011)
o Developed a novel clustering method to extract structured information from
eBay descriptions
o Involved in developing a modified POS tagger to handle noise in eBay
descriptions
Research Intern, Janya Inc, Amherst, NY (May 2010 - Aug 2010)
o Developed a generic topic extractor for Urdu blog data that is based on an
incremental K-Means approach
o Developed an unsupervised language independent method for suffix removal
Research Intern, Janya Inc, Amherst, NY (May 2009 - Aug 2009)
o Integrated Urdu NLP modules into Semantex(TM), a text extraction platform
o Developed a transliteration system for Urdu Named Entities using phoneme
probabilities
Research Intern, EMC Corporation, Cambridge, MA (Jun 2007 - Dec 2007)
o Involved in testing the backup, recovery and failover management of
PowerPath over different arrays such as CLARiiON, ALUA, Symmetrix, HP and
Invista devices connected to Solaris hosts
o Involved in development and maintenance of a QE shared inventory tool called
ODB that is based on Perl/PHP/MySQL (LAMPP)
Full Time Professional Experience
Senior Developer, Peertone Technologies, Bangalore, India (Jan 2008 - Aug
2008)
o Involved in developing a web service API using the Java Spring Framework
o Involved in optimizing query performance using PostgreSQL
Software Architect, Peertone Technologies, Bangalore, India (Jun 2008 - Mar
2008)
o Developed the front end of the OFFICESHARE client-side application using VC++
and MFC
o Integrated the email client with Outlook using MAPI
o Spearheaded entire client-side development and design activities
o Led a team of three and worked towards generating the web-client using
technologies like AJAX
o Managed the server-side design and developed a client-server protocol using a
combination of HTTP, multipart form data and XML
Honors
Award and recognition for outstanding performance and lasting contribution given by
Mr. Jnan Dash, board member at Pandesa Corporation, 2005
Dr. Vijay Vittal award for academic excellence, BMS College of Engineering, India,
2003
Qualified for membership of Mensa, a high-IQ society, at the 98th percentile;
member, March 2002
All-rounder's cup for academic excellence and participation in cultural activities,
Carmel Convent School, Jan 1997
Other Activities
Reviewer: ACM Transactions on Asian Language Information Processing,
International Conference on Data Mining
Member of Toastmasters International, LA, USA
3rd degree in Usui system of Reiki Natural Healing
References
Dr. Rohini K Srihari, Associate Professor, Dept of Computer Science and
Engineering, University at Buffalo, State University of New York
o Email: abo9p2@r.postjobfree.com (preferred)
o Phone: 716-***-****
o PhD dissertation Advisor and CEO of Janya Inc.
Dr. Neel Sundaresan, Senior Director and Head, eBay Research Labs, San Jose, CA
o Email: abo9p2@r.postjobfree.com (preferred)
o Manager at eBay Research Labs.
Dr. Venu Govindaraju, SUNY Distinguished Professor, Dept of Computer Science
and Engineering, University at Buffalo, State University of New York
o Email: abo9p2@r.postjobfree.com (preferred)
o Phone: 716-***-****
o Advisor for Lectio Labs and course instructor