System Project

Location:

Ann Arbor, MI

Posted:

December 10, 2012

Resume:

Eric Glover Ph.D.

E-mail: *********@**********.***,

Web:

http://www.ericglover.com/

Overview:

A strong academic background and a proven track record of delivering

effective (user-facing and back-end) large-scale commercial systems in

the areas of Machine Learned Relevance (MLR), web-scale categorization, system architecture, and

algorithm development.

About two years of entrepreneurial experience as CEO of an angel funded startup.

More than a dozen years of commercial web search experience as summarized

below in the highlights.

Highlights:

Current Job: Fellow at Quixey.com

Quixey:

Advisor: Oct 2009 - August 2011

Fellow: August 2011 - Present

Starting as an early adviser, joined full time in August 2011 as a Fellow -

responsible for many things including driving the large-scale machine learning effort.

Currently responsible for search relevance, testing infrastructure, and assisting with the architecture.

My team designs, develops, evaluates and deploys new relevance features and functions, and implements

various algorithms related to other search features, including autosuggest and query log mining.

General - Entrepreneurial:

CEO and Co-founder of a small media search company that received angel funding

- Responsible for defining, designing and leading the implementation of

the entire back-end system, as well as managing the operations of the company.

CTO and Co-founder of Airtime High Tech Stuff, LLC., a commercial website development company

specializing in high-performance integrated website and system architectures.

Searchme:

Web-scale categorization system (CHOCO): Technical lead/primary

coder - large-scale categorization system (more than 1000 categories) -

run on Billions of documents in days. System included full web-based UI

for training, active learning, evaluation, ontology management and

automated error checking to aid Search Analysts/contractors.

Vertical suggest/query intention mining: Leveraging the full document

to category matrix, produced a ranked list of "vertical suggestions" in

real-time per search.

Multimedia blending: Technical lead - project ranked and indexed

multimedia (YouTube, Hulu, Imeem, Flickr and others) content. Solved

complex AI problems related to features that differ from those of text web pages.

Implemented custom automated feed processing and defined techniques for

efficiently discovering appropriate pages to index.

Competitive analysis/judgment collection system (TORGO): Technical

lead - Internal web-based system for collecting judgments for

competitive analysis and MLR training. System included scrapers to pull

data from our competitors, a caching system, and an easy-to-use UI. Flexible

design of code and DB schema to rapidly change user-judgment options.

MLR feature design and development: Various roles - coding,

system design, and formally defining features. Key accomplishments

include the definition and design specification of an XPath/Perl-based system

for rapidly (hours) adding totally new features including use of

structured data.

Near-real time feed processing system ("El Rapido"): Technical

lead/primary coder - near-real-time data inclusion from RSS feeds.

System included full process from fetching and parsing feeds to

indexing and generation of MLR features. Search Quality team managed

feeds and options (via Excel). Significant perceived relevance boost -

time from initial proposal to live under 3 weeks.

Led projects on automatic page quality, spam, and many relevance

(DCG) improvements.

Presented in board meetings, and was involved in other

business meetings.

Ask.com:

Invited presenter at the NATO MMDSS conference in Gazzada, Italy,

September 2007.

At Ask.com, defined and created multiple core search,

entity-extraction and

classification-related technologies used by millions of users every day

Led a team of engineers to develop and implement internal

infrastructure products for improved management of structured data,

classification, and data mining

NEC Labs:

Managed a team of several programmers and students to develop a

modular enterprise search technology architecture and demonstration

system capable of learning new categories in minutes, and performing

category-specific search over a variety of (unstructured) data sources.

Developed a prototype enterprise search system (Inquirus). This

system incorporated many new technologies including: rapid category

learning (active learning based), advanced feature selection,

SVM-based classification, automated query

modifications, intelligent resource routing, multiple-source

capabilities, automated query expansion, and search strategies

General - academic:

PhD in CSE/AI, focusing on preference-based web metasearch

Published several highly cited conference and journal

papers, as well as filed more than ten patent applications related to

data

mining/topic extraction, search engine architectures, and search

technologies.

Broad knowledge of computer science/AI, including algorithms,

software, and hardware (a master's degree in VLSI).

Education

4/1994 BSE Electrical Engineering, Magna Cum Laude from University of

Michigan, Ann Arbor, MI

5/1997 MSE Electrical Engineering, University of Michigan, Ann Arbor, MI

8/2001 Ph.D. Computer Science Engineering, University of Michigan,

Ann Arbor, MI

Dissertation:

Eric J. Glover, Using Extra-Topical User Preferences to Improve

Web-Based Metasearch, Ph.D. Dissertation, University of Michigan,

2001.

Employment

8/2011 - Present Fellow at Quixey.com

Quixey is a new startup created to enable people to discover the applications

they need. Quixey is all about the next generation of search, called Functional Search: instead of searching for apps by name or category,

you search by what you want to do.

1/2011 - 8/2011 CTO and co-founder of Airtime High Tech Stuff, LLC

9/2009 - 8/2011 CEO and co-founder of Intelligent Search Solutions Inc.

9/2009 - 8/2011 Adviser to Quixey.com

8/2009 - 10/2009 - Consultant for Lighthouse Capital Partners

Hired as a consultant to aid in the sale

of Searchme Intellectual Property (acquired by Lighthouse after

Searchme shut down). Responsible for attending business meetings,

answering technical questions, and leading the effort to produce a

functional demonstration system.

3/2007 - 7/2009 - Principal Scientist/Classification Architect/Sr.

Staff Scientist at SearchMe.com

Initially hired to design and develop

the core categorization infrastructure. Work required design and

presentation in board meetings, as well as substantial coding. System

(see highlights above) was used to enable categorization of Billions of

web pages (in days) assigning more than 1000 possible labels. The full

document-category matrix was used at run-time for Vertical Suggestions

and relevance ranking, a key Searchme differentiating feature. System

required both offline and online components. Offline required ease of

use for Search Analysts to rapidly train (hours), as well as

active-learning, feature selection, and evaluation components. Online

system required new algorithms to maximize performance - for over 1000

categories, it delivered the accuracy of non-linear classifiers with performance

approaching linear classifiers (one midrange server could do 10-20

pages per second for all categories). System also included integrated

ontology and deployment management - the Ontologist could manage which

categories should be user-facing, how categories relate, as well as

specific classifier options (external UI was text-lists like 'higher

recall' or 'higher precision', etc). Training system included a

distributed task manager which ran on six dedicated servers (with other

servers temporarily added as needed when there were too many jobs) and

required minimal maintenance, and virtually no configuration for

clients.

After completing most of the CHOCO system, I designed and led a small

team to build and implement the TORGO user-judgment system. This system

was designed for contractors and Search Analysts to very rapidly make

judgments about query/url pairs. Unlike systems in use at other search

engines, this system collected significantly more data - users could

provide query or url specific judgments, could see cached pages, manage

the scraping. Scrapers could be easily added - with the final system

enabling non-engineers to add custom scrapers to control production

system options. The initial system design assumed that the specific

menus, scrapers and judgments would be determined after the project was

completed and launched, and had to be extremely flexible and reliable.

New types of judgments could be added in minutes - with no code changes.

Near real-time inclusion system (El Rapido) - Designed and coded a

system (managed by Search Quality) that would take an input RSS list

and, for each feed, fetch, process, and insert it into the internal document storage

system for rapid indexing. System was designed to ensure fresh and

fully-processed (classification, other metadata and page contents)

results were available for ranking in minutes. Work included managing

MLR features to ensure reasonable ranking - total project time under 3

weeks from initial suggestion of idea until live. Provided a

significant perceived relevance boost.

Multimedia blending - Tech-lead on key differentiating project that

would mix-in (organic ordering) multimedia content with regular web

(and news - see above). Project required defining features and XML feed

specifications (to both UI and Indexing engineers), writing custom feed

processing, and developing new MLR approaches. Resultant system was

able to include new sources, with features of different meanings, and,

using very few (low thousands) judgments, develop a distributionally

consistent MLR function (consistent with the separately trained regular

web-MLR). Key problems solved: 1: Which features, 2: How to train

given less data and different feature 'meaning' (i.e. Page Views on

YouTube is different from Page Views on Imeem, and Hulu doesn't have

Page Views), 3: How to spider/locate content with resource constraints

(can't have all of YouTube, need to have videos for relevant queries),

4: Manage editorial preferences and content expiration (Hulu videos are

higher quality than un-official YouTube videos), 5: System must be

generic - to enable rapid addition of future sources.

Many other large-scale projects - please ask, highlights above

summarize several others.

2/2007 - 3/2007 - Manager (3) at Ask.com

Managed four engineers, playing an active role in design and

development of new technologies in the areas of

classification, entity extraction, relevance, and structured data

management.

5/2004 - 2/2007 - Research Engineer (Software Engineer 5) at Ask.com

(IAC Search and Media formerly AskJeeves)

Recent work included leading a small

team of engineers to develop multiple internal products in the areas of

extraction, machine learning, and structured data management. Focus

included large-scale processing and analysis, highly-accurate

classification, and efficient algorithms. Previous work within Ask

included developing highly visible, very high impact core technologies

- currently on the live site. Core technologies are in the areas of

entity extraction/classification, information extraction from

semi-structured data (including Wikipedia), relevance and

disambiguation.

11/2002 - 4/2004 - Research Staff Member at NEC Laboratories America -

Project leader of the Inquirus project

I managed a team of three

full-time programmers/developers and several students, and participated

in several outside collaborations. I was responsible for creating new

research ideas and communicating them to the development team for incorporation into our

Inquirus search system - implemented primarily in C++. The (new)

Inquirus search system is a modular architecture (several patents filed

on various aspects of the system) that included dynamic routing of

search resources (query processing, result processing and data

resources). Several demonstration systems were built, including a

MEDLINE-based demo system demonstrating high precision and high recall

for test medical queries. A second

demonstration system included built-in active learning for very rapid

category generation. An outside user (using the entirely web-based

interface) could train a custom search category (such as "Movie

Reviews", "Computer Science Papers", "Clinical Trials", "Executive

Bios", and others) in minutes. At project termination, the search

architecture included the ability to search in Japanese (including

proper word splitting), and process inbound data sources in multiple

encodings.

Research and technology highlights of the Inquirus project: System

utilized a new technology we invented called search strategies. Very

fast active learning for improved category creation. Efficient

SVM-based classification. Real-time feature ranking and feature selection.

System included modules for various data interfaces (including Web, Oracle,

MySQL, Z39.50). Multi-language/character encoding technology (Japanese

term extraction using Chasen).

Relevant research: Automated methods for local hierarchy generation from

small document clusters. New methods for predicting

the generality or specificity of a document (improves relevance).

Technology for automatic discovery of related medical concepts. Use of web structure to

improve classification accuracy and concept naming. Demonstrated

effective use of uncertainty sampling with SVMs and use of

web structure (extended anchortext/anchortext windows) for extremely

accurate Yahoo document classification.

New method for web-graph modeling, incorporating local web communities.

New technology for improved phrasal/concept extraction and concept grouping.

7/2001 - 10/2002: Scientist at the NEC Laboratories America (formerly named

NEC Research Institute), Princeton, NJ

Worked with Steve Lawrence, Gary Flake* and C. Lee Giles* on improving

metasearch and data mining. Continued dissertation work and developed

new methods for feature extraction/selection, and improved document

classification. Continued work on the Inquirus 2 prototype, and

participated in various research activities related to data mining.

*Gary and Lee left the laboratory prior to October 2002.

1/1999 - 6/2001: Intern at NEC Research Institute, Princeton, NJ

Collaborated with C. Lee Giles, Gary Flake and Steve Lawrence on

improving and modeling web metasearch. Involved in implementing a

content-based metasearch engine that considered more than just keywords.

For more detailed information please refer to the publications below.

9/1998 - 12/1998: CAD GSI, University of Michigan, Ann Arbor, MI

Duties: Responsible for assisting students with CAD-related questions or

problems. Supported: Mentor Graphics suite (Design Architect, Quicksim,

Accusim, IC Station, Design Viewpoint Editor), EPOCH, Synopsys, Verilog XL,

SignalScan.

Significant accomplishments include re-writing of the digital transistor

models for the VLSI class. Helped to debug and prevent software problems.

1/1995 - 8/1998: Graduate Student Research Assistant for the University of

Michigan Digital Library (UMDL) project, University of Michigan, Ann Arbor, MI

Designed and prototyped multiple software agents including the

Remora, WebAgent, and the Preference Agent. UMDL agents were written primarily in

C++. Agents were developed in the CORBA framework under Solaris, and required

extensive use of the Web. Wrote numerous CGI scripts in Perl, as well as

other tools including web robots which automatically downloaded and analyzed

web pages.

UMDL research focused on a distributed AI (agent) architecture as a basis

for a multi-purpose digital library. Library functions included

searching (both across and inside of) collections, document retrieval,

electronic commerce and pricing, user interface and preferences. The UMDL project was used

to provide content to local middle school and high school children as part of

their science curriculum.

9/1994 - 12/1994: CAD Graduate Student Instructor (GSI), University of

Michigan, Ann Arbor, MI

Duties: Responsible for assisting students from many Electrical

Engineering classes in using the Mentor Graphics tool set. Aided

students in using Design Architect, IC Station, Accusim, Quicksim and HSPICE.

Responsibilities included problem solving and basic circuit debugging.

Publications:

Eric Glover, The "Real World" Web Search Problem, MMDSS NATO Conference, Gazzada,

Italy, September 2007.

Please e-mail for an electronic copy of the actual paper.

Eric J. Glover, David M. Pennock, Steve Lawrence, and Robert Krovetz.

Inferring hierarchical descriptions, Proceedings of the Eleventh

International Conference on Information and Knowledge Management (CIKM'02), November 2002.

David M. Pennock, Sandip Debnath, Eric J. Glover, and C. Lee Giles.

Modeling information incorporation in markets with application to detecting and

explaining events, Proceedings of the 18th Conference on Uncertainty in

Artificial Intelligence (UAI-2002), pp. 405-413, August 2002.

Eric J. Glover, Kostas Tsioutsiouliklis, Steve Lawrence, David M.

Pennock, and Gary W. Flake. Using web structure for classifying and

describing web pages, Proceedings of the Eleventh International World Wide Web

Conference, pp. 562-569, May 2002.

Gary Flake, Eric Glover, Steve Lawrence, C. Lee Giles. Extracting

Query Modifications from Nonlinear SVMs, Proceedings of the Eleventh

International World Wide Web Conference, May 2002.

David M. Pennock, Gary W. Flake, Steve Lawrence, Eric J. Glover, and C. Lee Giles.

Winners don't take all: Characterizing the competition for links on the web,

Proceedings of the National Academy of Sciences, Volume 99, Issue 8,

pp. 5207-5211, April 2002.

Steve Lawrence, Frans Coetzee, Eric Glover, David Pennock, Gary Flake,

Finn Nielsen, Robert Krovetz, Andries Kruger, and C. Lee Giles.

Persistence of Web References in Scientific Research, IEEE Computer,

vol 34, no 2, pp 26--31, 2001

Eric J. Glover, Gary W. Flake, Steve Lawrence, William P.

Birmingham, Andries Kruger, C. Lee Giles, David M. Pennock. Improving

Category Specific Web Search by Learning Query Modifications, Symposium on

Applications and the Internet, SAINT 2001, San Diego, California,

January 8--12, 2001.

Frans Coetzee, Eric Glover, Steve Lawrence, and C. Lee Giles. Feature

selection in web applications using ROC inflections. In Symposium

on Applications and the Internet, SAINT, San Diego, CA, January 8--12 2001.

Andries Kruger, C. Lee Giles, Frans Coetzee, Eric Glover, Gary

Flake, Steve Lawrence, and Cristian Omlin. DEADLINER: Building a new niche

search engine. In Ninth International Conference on Information and Knowledge

Management, CIKM 2000, Washington, DC, November 6--11 2000.

Eric J. Glover, Steve Lawrence, Michael D. Gordon, William P.

Birmingham, C. Lee Giles, "Web Search

-- Your Way," Accepted to Communications of the ACM

Eric J. Glover, Steve Lawrence, William P. Birmingham, C. Lee Giles,

"Architecture of a Metasearch Engine that Supports User Information Needs,"

Eighth International Conference on Information and Knowledge Management (CIKM 99), Kansas City,

MO, November, 1999

Eric J. Glover, Steve Lawrence, Michael D. Gordon, William P.

Birmingham, C. Lee Giles, "Recommending Web

Documents Based on User Preferences," in ACM SIGIR 99 Workshop

on Recommender Systems, Berkeley, CA, August, 1999

E. J. Glover, S.R. Lawrence, K.D. Bollacker, C.L. Giles, W.P.

Birmingham, G.W. Flake, "A Metasearch Engine Architecture That Supports

Individual Information

Needs," NEC Research Institute Technical Report, TR# 99-063, May 13,

1999

E. J. Glover, W. P. Birmingham, and M. D. Gordon, "Improving Web

Search Using Utility Theory," in Proceedings of the First International

Workshop on Web Information and Data Management, WIDM 98. Bethesda,

Maryland, 1998

Eric J. Glover, Sunju Park, Anil Arora, Daniel Kiskis and Edmund

Durfee, "A case study on the evolution of software tools selection and

development in a large-scale multi-agent system," in Workshop on

Software Tools for Developing

Agents, AAAI 1998. Madison, WI: AAAI

E. J. Glover and W. P. Birmingham, "Using Decision Theory To Order

Documents," in Digital Libraries 98, Pittsburgh, PA, 1998: ACM

D. E. Atkins, W. P. Birmingham, E. H. Durfee, E. J. Glover, T.

Mullen, E. A. Rundensteiner, E. Soloway, J. M. Vidal, R. Wallace, and

M. P. Wellman, "Toward Inquiry-Based Education Through Interacting

Software Agents," IEEE Computer, vol. 29, pp. 69-76, 1996

Patents:

Filed more than ten patents including those related to entity

detection/extraction, search architectures,

efficient data mining, medical concept extraction/relationship

discovery, improved metasearch performance, automatic hierarchy

generation and document cluster naming, improved document

classification techniques using web structure.

Hobbies

Digital photography, traveling, cooking, hacking (the good kind), computer

security, and online gaming.

last updated: April 18, 2012
