Guichong Li
Software developer, Java, Ruby, Researcher in Artificial Intelligence, Data Mining, Machine Learning,
Big Data Analyst
******@****.*******.**
Summary
• Current postdoctoral research on uniform and unbiased sampling/crawling of online social networks using
advanced Markov Chain Monte Carlo techniques; developed a new sampling algorithm called conditional
independent coupling using Ruby, Rails, the Twitter API, and DataMapper on Unix/Linux and Amazon EC2.
• Previous postdoctoral research at DRDC, CORA, Canada, on Complex Dynamic Network Analysis and
simulation of Autonomous Underwater Vehicles, using MATLAB and VB.
• Two-year research contract with Health Canada on nuclear explosion and pollution monitoring and
environmental anomaly detection, using J# and the Weka (Java) software package with Eclipse.
• Main research interests are Machine Learning and Data Mining algorithms and technology; in
particular, one-class learning using kernel methods for anomaly detection, and advanced Markov Chain Monte
Carlo techniques for fast and unbiased sampling/crawling of online social networks such as Twitter and Facebook.
• Educational background in both Mathematics and Computer Science; 10 years of professional experience in
software development, artificial intelligence algorithm design, and leadership of transaction and
database application projects using SQL Server, Oracle, C/C++, Java/J#, VB, JDBC, .NET, PowerBuilder, TCP/IP,
OpenGL;
• Research work on information security using anomaly detection techniques on web servers such as
WebSphere/WebLogic with Spring, Swing, EJB, AJAX, SOAP; HTML5, NoSQL;
Experience
Independent Study at IT workshop
July 2013 - November 2013 (5 months)
Summer vacation;
IT workshop from Aug. 20 to Nov. 9:
Cross Culture; Social Media; Business Analysis;
Postdoctoral Researcher at University of Ottawa
June 2012 - June 2013 (1 year 1 month)
UNIVERSITY OF OTTAWA
Postdoctoral Researcher, 2012/6-2013/06
• As the sole researcher, my task was to develop a new algorithm for uniform and unbiased sampling of
online social networks, as an alternative to traditional methods from applied statistics such as Random Walk (RW)
and Metropolis-Hastings Random Walk (MHRW); the new algorithm overcomes drawbacks of the traditional
methods such as slow mixing time and biased results.
• The research was supported by an NSERC Engage Grant and an SME4SME Grant; I was responsible for carrying
the development through to a significant research result: a novel algorithm that applies advanced Markov
Chain Monte Carlo methods such as coupling techniques to sampling large graph networks;
• I achieved the goal by developing a new coupling technique called conditional independent coupling for
perfect sampling and by extending traditional coupling algorithms; conducted experiments by sampling
online social networks such as Twitter and small social networks; results show that the algorithm is extremely
efficient at producing unbiased samples.
• The algorithm was implemented using Ruby, the Twitter API, DataMapper, SQLite, MySQL, and PostgreSQL;
running environment: Unix/Linux, Amazon EC2; designed a web application using Rails with the MVC pattern
for demonstration;
• Performed social network analysis such as degree distribution, centrality, clustering coefficient, and community
detection. Research results were published in the IEEE ICDM International Workshop on Data Mining in
Networks, 2012.
• Obtained a US patent for the initial research result as the original inventor, and the product was transferred
to a local company.
TECHNICAL AND THEORETICAL ASPECTS
• Advanced Markov Chain Monte Carlo (MCMC) methods and coupling techniques in applied statistics;
Random Walk and Metropolis-Hastings algorithms; various convergence diagnostics such as the Geweke
diagnostic; uniform and unbiased sampling; ...
Postdoctoral Researcher at DRDC, CORA, Canada
December 2010 - December 2011 (1 year 1 month)
DRDC, CORA, CANADA
Postdoctoral Researcher, 2010/12- 2011/12
• Engaged in the development of a software tool for simulation of Autonomous Underwater Vehicles using
MATLAB and VB; involved in research on Complex Dynamic Network Analysis; simulation using MATLAB;
• Utilized various artificial intelligence algorithms such as the Genetic Algorithm (GA) in MATLAB for simulating
and computing the shortest path with the lowest cost in complex networks.
• Implemented the Box-Muller algorithm to simulate mine distributions on the seabed as normal distributions;
implemented algorithms to dynamically demonstrate the manipulation of an Autonomous Underwater Vehicle
(AUV).
• Performed complex dynamic network analysis using various tools such as SNAP and Pajek; knowledge of
the small world effect, degree distribution, degree correlation, centrality, clustering coefficient, community
detection.
• Developed a new anomaly detection algorithm using Java (J#/Eclipse), based on a new kernel method
called the ensemble kernel; the algorithm is a typical one-class classifier for machine learning and
improves on the traditional One-Class SVM algorithm implemented in LibSVM in Weka (a Java package for
various data mining and machine learning algorithms). This research work was published in Canadian
Artificial Intelligence (AI) in 2012.
• Used anomaly detection techniques for information security by analyzing web log files and data transfer on
WebSphere/WebLogic; EJB, AJAX, SOAP, HTML5, NoSQL.
TECHNICAL AND THEORETICAL ASPECTS
• Anomaly detection techniques, one-class learning algorithm, kernel methods, support vector machine
algorithm;
• Social networks; social network analysis; small world effect; degree distribution, community detection;
• MATLAB, VB, social network tools and packages: SNAP and Pajek, Java (J#/Eclipse), SVM, Weka; EJB,
WebSphere/WebLogic, AJAX, SOAP, HTML5, NoSQL.
Research Assistant at University of Ottawa
January 2006 - December 2010 (5 years)
UNIVERSITY OF OTTAWA
Research Assistant, 1/2006-12/2010
• Researched and developed anomaly detection techniques as a research assistant at the University of Ottawa,
in cooperation with Health Canada, for monitoring nuclear pollution, using Weka (an open-source Java package
for data mining and machine learning);
• Built the architecture consisting of servers and databases using J2SE, J2EE, J2ME for information retrieval
and management of samples and data synthesized in the laboratory or collected from the natural environment for
nuclear exploration and pollution; designed and developed the messaging framework for communication
between researchers over XML.
• Developed a new instance selection algorithm for supervised learning from large training datasets by
applying Markov Chain Monte Carlo methods; designed a border identification algorithm to identify borders
in training datasets; research results were published at the International Conference on Data Mining (ICDM)
in 2008;
• Developed various anomaly detection algorithms by re-designing one-class learning algorithms based on
traditional supervised learning algorithms such as Naive Bayes, Bayesian Networks, k-Nearest Neighbor,
k-means, and Parzen density estimation; algorithms were implemented using Java/J# and Weka (an open-source
Java package for data mining and machine learning); this work was published at the Conference on Intelligent
Data Understanding (CIDU) in 2010.
TECHNIQUES:
• Data Mining and Machine Learning algorithms, Naive Bayes, Bayesian Networks, k-Nearest Neighbors,
k-Means, Generalized Linear Model, SVM, OCSVM; using Java/Eclipse, Weka.
• J2SE, J2EE, J2ME, WebSphere/WebLogic, AJAX, SOAP, Jini, XML/XSLT, JAXB, SAX/DOM,
PersonalJava, JavaSpaces, HTML5, NoSQL.
Research assistant at University of Regina
February 2001 - December 2005 (4 years 11 months)
UNIVERSITY OF REGINA
Research Assistant, 2/2001-12/2005
• Researched association analysis of large transaction datasets; developed new knowledge discovery
algorithms; analyzed web log files for user behaviors, using Java and Python;
• Researched animation simulation and crowd behavior manipulation in C++, aiming at autonomy and
intelligence through artificial intelligence technology such as neural networks;
• Published two research papers, on basic association rules and on searching for pattern rules, with work in Java;
• Taught a Java programming course.
TECHNICAL AND THEORETICAL ASPECTS
• Various similarity and interestingness measures: Gini Index, Support, Confidence, Conviction, Cosine,
Laplace, Interest, Jaccard, Shannon entropy, Piatetsky-Shapiro, Kullback-Leibler divergence, Goodman and
Kruskal, Pearson correlation coefficient, J-measure, Euclidean distance, etc;
• Knowledge discovery and data mining algorithms; scalable and heuristic search algorithms such as mining
association and interesting rules on large transaction datasets;
• Java, C++, Weka: data mining and machine learning package.
Software Engineer at Institute of Computer Research and Application
January 1993 - January 2001 (8 years 1 month)
# A part-time position; I was responsible for software development for practical applications;
# As a project leader, I was involved in developing a Remote Exchange System for futures trading, mainly using
TCP/IP, C/C++, SQL Server, 1999 - 2001;
# As the sole developer, I developed an Accounting System for futures trading, mainly using PowerBuilder and SQL
Server, 1994 - 1998; I designed, implemented, and tested all accounting computational procedures, with a focus
on easy, user-friendly operation and maintenance;
# Gained experience in all stages of software development, testing, documentation, maintenance, and sales;
# Both software products were very successful in the market and gained the highest market share.
Technical aspects
• TCP/IP, C/C++, SQL Server, PowerBuilder
Instructor at Zhengzhou University
September 1987 - January 2001 (13 years 5 months)
# Employed in the Department of Computer Science, Zhengzhou/HuangHe University;
# Taught courses including RDBMS and C/C++;
Education
University of Ottawa
Doctor of Philosophy (PhD), Computer Science, 2006 - 2010
Activities and Societies: Tamale Seminar
University of Regina
Master's degree, Computer Science, 2001 - 2004
Activities and Societies: Tamale Seminar
Southwest China Normal University
Bachelor's degree, Mathematics, 1981 - 1985