Dr. Guichong Li
SECURITY CLEARANCE
SECRET
EXPERIENCE SUMMARY
Main research focus: Bayesian learning, including multi-layer Bayesian learning, Bayesian statistics, and high-level Bayesian variable learning;
Latest research results applied to anomalous traffic detection in CAN networks; high-level feature extraction using Bayesian variables; object detection and action classification;
Designed and implemented machine learning algorithms for churn prediction and Add a Line analysis, and machine learning pipelines for telecom business analysis.
Developed advanced machine learning algorithms for text, POS outlet items, category hierarchy classification; multilabel and multitask classification algorithms/text classification/NLP;
Developed advanced machine learning regression/classification algorithms for food component analysis using chemometrics and spectroscopy;
Recent postdoctoral research on uniform and unbiased sampling/crawling of online social networks using advanced Markov Chain Monte Carlo techniques; developed an innovative sampling algorithm based on a new coupling technique, implemented with Ruby on Rails, the Twitter API, and DataMapper on Unix/Linux and Amazon EC2; social media analysis using Python, NLTK, and scikit-learn.
Previous postdoctoral research at DRDC CORA, Canada, on Complex Dynamic Network Analysis and simulation of Autonomous Underwater Vehicles, using MatLab, VB, etc.
Two-year research contract at Health Canada on nuclear explosion and pollution monitoring and environmental anomaly detection, using J#, the Weka (Java) software package, and Eclipse.
Main research interests are Machine Learning and Data Mining algorithms and technology; in particular, one-class learning using kernel methods for anomaly detection and its application to big data using MapReduce/Hadoop with Pig/Hive/HBase, and advanced Markov Chain Monte Carlo techniques for fast and unbiased sampling/crawling of online social networks such as Twitter and Facebook.
Educational background in both Mathematics and Computer Science; 10 years of professional experience in software development, artificial intelligence algorithm design, and leadership of transaction and database applications using SQL Server, Oracle, C/C++, Java/J#, VB, JDBC, .NET, PowerBuilder, TCP/IP, OpenGL;
Research work on information security using anomaly detection techniques on web servers such as WebSphere/WebLogic with Spring, Swing, EJB, AJAX, SOAP; HTML5, NoSQL;
5 years of experience with big data platforms: Spark/PySpark/Java/Scala, Hadoop, Hive, NoSQL, Ab Initio;
EDUCATION
PhD in Computer Science, University of Ottawa, 2010
Master of Computer Science, University of Regina, 2004
Bachelor of Mathematics, Southwest China Normal University, 1985
#
Dates
Project
15
Aug, 2022-
AI developer/Python programmer, IRCC, Altis
Project I: IBM SPSS to Python conversion. Implemented an end-to-end Python pipeline with Spark on the AWS cloud to replace the original SPSS stream models. Main tasks included data collection; conversion of SPSS stream nodes (type, filter, select, filler, merge, aggregate, flag, supernodes, cache, statistical outputs) to PySpark on the AWS cloud platform; unit tests; and a Spark storage plan for optimization. Also covered techniques for migrating to Google Cloud Platform (GCP) services, including BigQuery, Dataflow, Pub/Sub, Bigtable, Data Fusion, Dataproc, Cloud Composer, Cloud SQL, Compute Engine, Cloud Functions, and App Engine; BigQuery ML, AutoML, Vertex AI.
Technology:
pandas, PySpark (filter, count, columns, drop, withColumnRenamed), Spark, Spark SQL functions (to_date, col, lit, when, trim, substring, etc.); SPSS (stream, filter node, filler node, type node, merge node, aggregate node, etc.); GCP: BigQuery ML, AutoML, Vertex AI.
Project II: fraud detection for financial applications. Developed Python tools and a platform to automatically extract handwriting and signatures, verify signatures, and classify handwritten versus printed text in documents and check images, using deep learning models and image preprocessing methods: thresholding, connected component analysis, object detection and segmentation, background noise removal, template matching, and text extraction using keras-ocr, Tesseract, and EasyOCR.
14
Jan, 2022-July, 2022
Data Scientist/Machine Learning Engineer, CN Rail / SpruceInfotech
Developed and implemented Python scripts using sklearn and pandas for data parsing, data imputation, and data encoding; for training and building models and running tests to evaluate the performance of AI solutions; for forecast, regression, and classification modeling; and for log time series analysis;
Developed MLflow workflows for model tracking, training, logging, registration, and inference, with Hyperopt parameter sweeps; univariate/multivariate forecasting and regression;
Analyzed and validated business requirements and reviewed solutions with relevant stakeholders; wrote the technical report on MLflow project development and the production solution for the Azure cloud AI solution, anomaly detection, and Sentinel; wrote the research report on improving forecast models for rail transportation with Azure DevOps and Databricks.
Technology:
PySpark, pandas, sklearn, MLflow, Azure Databricks, Azure DevOps
13
Sept 2019- Dec, 2021
Data Scientist, Solana Networks
Project: Data Management, Labelling and Automation for Machine Learning; Activity Recognition and Hierarchical Labelling; OpenPose and HumanActionClassification platform
Context: image/video datasets were downloaded from publicly available sources or created with annotation tools, and the data items (both image and video content) were parsed and imported into a MongoDB database. A tool supports the creation of training data collections in the database, subject to different criteria such as ImageNet, COCO, and StanfordDrone. The work also covered training set creation, active learning and experimentation, and reporting the results of a human pose estimation and activity recognition prototype built on existing open-source packages; validation with TIDE and PyBrisque; deep learning and ML research on the 3-body problem in cosmology.
Technology and tools: Python, PyTorch, TensorFlow (CPUs and GPUs), ImageNet, COCO, StanfordDrone, VoTT, LabelMe annotation, bounding boxes, segmentation, active learning, OpenPose, YOLOv3, ResNet, HumanActionClassification, TIDE, PyBrisque
Project: Anomaly Detection for In-Vehicle Networks
Context: anomalous traffic (attacks) may appear on the automotive bus system (CAN), which transmits signals between electronic control units (ECUs), as well as on wireless interfaces such as GSM and Bluetooth. Attackers try to access the automotive network in order to inject messages, manipulate data, or access confidential information. The task was to develop advanced ML algorithms to detect four attack types (DoS, Fuzzy, Replay, and Impersonation) in real time with a low false positive rate and a high recall rate; implementation and testing of Python scripts/API.
Tools: Python, pandas, scikit-learn, statistical models, Spark/Hadoop, Scala/PySpark, Hive, Microsoft Sentinel
Project: ML design for threat & intent detection system
Context: intrusion detection systems (IDSs) support network security by reporting network attacks and triggering alerts. The issue is that IDSs have been observed to trigger thousands of alerts per day, with a high false positive rate, which makes it extremely difficult for the analyst to identify the true positives. Supervised ML methods have been successfully applied to network security to reduce the false positives raised by IDSs. The task also aims to detect attack intent. For example, periodic network traffic might indicate potential adversarial behavior from botnets that later launch network attacks. Intent detection can therefore be achieved by developing state-of-the-art periodicity detection techniques. Research on anomaly detection on cloud platforms such as AWS with AWS SageMaker.
Technology: Kubernetes, Elasticsearch, Python, pandas, scikit-learn, Naïve Bayes, PCA, random forest, decision tree, clustering; cosine distance, segment distance, FFT, fuzzy naming, multihop SSH, Spark/Hadoop, Scala/PySpark, Hive; Python design patterns: global object, prebound, sentinel, creational, structural, behavioral patterns; AWS SageMaker, AWS Glue, AWS CloudWatch, Random Cut Forest; Word and Excel.
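Illustrative sketch of the FFT-based periodicity idea in this project, not the project's implementation: it assumes traffic has already been binned into per-interval event counts and simply reports the strongest periodic component. The names dominant_period and bin_seconds are hypothetical and used only for this example.

```python
import numpy as np


def dominant_period(counts, bin_seconds=1.0):
    """Return (period_in_seconds, strength) of the strongest periodic component.

    counts      -- events per time bin (hypothetical input format)
    bin_seconds -- width of each bin in seconds (assumed, not from the project)
    strength    -- share of non-DC spectral power held by the peak, in [0, 1]
    """
    x = np.asarray(counts, dtype=float)
    x = x - x.mean()                          # drop the constant (DC) component
    power = np.abs(np.fft.rfft(x)) ** 2       # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(x), d=bin_seconds)
    power[0] = 0.0                            # ignore any residual DC term
    total = power.sum()
    if total == 0.0:                          # flat signal: nothing periodic
        return None, 0.0
    k = int(np.argmax(power))                 # index of the strongest frequency
    return 1.0 / freqs[k], power[k] / total


if __name__ == "__main__":
    # Synthetic traffic rate that rises and falls every 30 seconds.
    t = np.arange(600)
    rate = 10.0 + 5.0 * np.sin(2.0 * np.pi * t / 30.0)
    print(dominant_period(rate, bin_seconds=1.0))
```

A high strength value suggests beacon-like, regularly repeating traffic; the threshold for raising an alert would have to be tuned on real data.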
Project: ML techniques for improvement of ChatBot
Context: investigated cutting-edge ML techniques for sentiment analysis and intent classification to improve the ChatBot. Technique details: entity recognition, stemming, lemmatization, and vectorization for feature normalization and extraction. NLP packages/tools: spaCy, NLTK, scikit-learn, Gensim. Pretrained NLP models: BERT, OpenAI GPT, ELMo, ALBERT.
12
Jan, 2021- Mar, 2021
AI developer, CRA Canada
Project: AWS architecture for QnABot development and deployment
Context: created and deployed QnABot with AWS Lex, AWS Alexa, Elasticsearch, Kendra, and CloudWatch. Used CloudFormation to provision and manage the stacks and resources specified in template code. Built response bots with intents, slot types, and fulfillment Lambda functions. Used AWS CDK and CloudFormation for infrastructure as code, CodePipeline for continuous delivery, and CodeBuild to create AWS projects. Tasks also included model training and deployment in AWS; deploying anomaly detection models to AWS for the CloudWatch and SageMaker services; AWS EKS service; Azure VM; ML modeling implementation and testing of Python scripts/API; GCP, etc.
Tools: AWS, Boto3, Python, Node.js, Java/JavaScript, AngularJS, PyAthena, S3, SQL-pandas, AWS data lake, CloudWatch, SageMaker, AWS Glue, Snowflake, Random Cut Forest (RCF), RecordIO Protobuf format, SageMaker.deploy; Word and Excel, csv_serializer, json_deserializer, eksctl, kubectl, Apache Flink; Sentinel, Azure VM, workplace, Machine Learning VM, GPU Cluster
11
June 2019- Nov 2019
Project2: Improvement of forecasting models by machine learning algorithm design with R; conducted research and analysis for pricing prediction using forecast models; visualization using Shiny platform design for business analysis (3 months)
Client: BigR.io LLC and JM Smucker
Subject: Improvement of forecasting models by machine learning algorithm design with R
Context: the data science team at JM Smucker has put considerable effort into building forecasting models to predict price and unit sales, and into building a Shiny analysis platform to support sales management. The challenge is how to improve the forecast models when the training sample contains many missing values and much noise. As a result, the forecasting models tend to overestimate price and unit sales, and thus perform poorly in deployment. Model implementation and testing of Python scripts/API.
Tools: Python, pandas, scikit-learn, R, Shiny, and R ML packages; Java, AngularJS, Kubernetes, Elasticsearch.
Tasks and technical details:
Built forecasting models using linear regression, PCR, and nonlinear regression models such as random forest; combined them with PCA in a machine learning pipeline;
Developed a new learning architecture based on Bayesian learning; combine Bayesian learning and PCA;
Developed a new algorithm for improvement of forecasting models by learning Bayesian variables using Bayesian methods;
Research on pricing prediction using forecasting models
Developed methods and experiments to create training samples for complex learning tasks, including pricing prediction
10
Oct 2018- March 2019
Principal Machine Learning scientist/Engineer, BigR.io LLC
Project1: Predictive maintenance machine learning system at Porsche (Volkswagen Group), Stuttgart, Germany (6 months)
Client: BigR.io and Porsche of Volkswagen Group
Subject: Predictive maintenance using machine learning techniques.
Context: in the automobile industry, each vehicle generally goes to the workshop for maintenance every six months. The issue is that some parts might fail before the next six-month maintenance visit. These unexpected failures can lead to large losses for customers as well as enterprises. In particular, the air spring in Porsche sports cars is vulnerable to adverse driving conditions such as cold weather. The automobile industry wants to know whether machine learning techniques can efficiently and effectively predict maintenance issues, so that mechanics can repair flawed parts that might otherwise fail before the next maintenance visit. The same technology can be used to serve military equipment, for example warship maintenance. Model implementation and testing of Python scripts/API.
Tools: python, pandas, scikit-learn
Tasks and technical details:
Develop advanced AI technology for predictive maintenance with Porsche of Volkswagen Group;
Machine learning pipeline: automate data imputation and data encoding; stability selection/recursive feature elimination; creating training data for predictive maintenance using XML parser
Recent research: mixture of Bayes with gaussian and kernel density
Developed a mixture of Bayes algorithm using Python and scikit-learn, based on Bayes' theorem
9
Mar 2018-Aug 2018
Senior data scientist, FreedomMobile, Toronto
Projects: Rebuild five SAS models for Churn Prediction; design and train Add a Line prediction models, (6 months)
Client: Procom and FreedomMobile
Subject: Rebuild five SAS models for Churn Prediction; design and train new models for Add a Line prediction; in addition, conduct prototype design for forecasting models
Context: the data science team at FreedomMobile had built several SAS models for churn prediction in telecom business analysis; the challenge was that the existing models performed poorly, with Gini scores around or below 0.6. The task was to rebuild the existing SAS models using Python, PySpark, and Zeppelin on Spark with a Hadoop cloud platform, using training samples generated from the SAS database. Additional tasks were to build new models for Add a Line prediction and sales forecasting analysis.
Tools: Python, PySpark, scikit-learn, MLlib, pandas, Zeppelin, Jupyter; other tools including TensorFlow, Keras, Theano, and scikit-flow for deep learning; Django for web analysis.
Tasks and technical details:
Main research focused on Bayes learning algorithms; published recent work at Data Science and Big Data Analytics (DSBDA) 2018;
Designed and implemented Customer Churn Analysis using machine learning techniques such as LogisticRegression, RidgeClassifier, RBF SVM, Neural Network, TensorDNN, and PCA, with a machine learning pipeline for data imputation, categorical encoding, and algorithmic learning; various open-source ML tools: scikit-learn, TensorFlow, Keras, Theano, Django, R, SAS, etc.;
Developed effective methods for data imputation and for handling unseen and missing values in categorical encoding;
Designed and implemented Add a Line prediction for telecom business analysis using Python, pandas, SAS, Zeppelin Notebook, Spark/PySpark/Scala, Kafka, MLlib, Hadoop, Django, MatLab/Simulink, microservices;
Prototype design for forecasting model analysis for item logistics
8
Apr 2015- Dec 2017
Principal Machine Learning Data Scientist, NPD Group, Port Washington, NY
Projects: Walmart Harmony: POS data category hierarchical classification; second project: Automated Classification
Client: NPD Group
Subject: Walmart Harmony: POS data category hierarchical classification
Context: NPD Group is a market research business that provides market analysis for retailers and product manufacturers by classifying retailer POS data from the US, European, Asian, and worldwide markets into 45 business categories. The company introduced ML techniques for automated classification. Because the categories are hierarchical, hierarchical category classification remains a technical challenge. Previous methods performed poorly and led to high costs from the demand for manual classification. After an initial investment in research and development, NPD Group decided to build a data science team and develop its own ML application for a retailer POS hierarchical category classification system. In Walmart Harmony, NPD Group first built an ML system for hierarchical category classification of Walmart's retailer POS data. The second project, Automated Classification, is the ML application for retailer POS data classification for NPD Group.
Tools: python, pandas, scikit-learn; and other open source such as tensorflow, keras, scikit-flow; NLTK
Tasks and technical details:
Work on machine learning algorithm research on Naive Bayes and randomization propagation;
Build two machine learning classification systems for large-scale POS data automatic classification;
Develop machine learning algorithms for hierarchical category classification, multi-label/multi-task classification;
Deep learning algorithms for text classification on retailer POS data, such as Convolutional Neural Networks (CNN) and word embeddings;
Published a novel statistical coupling technique: conditional independence coupling of Markov Chains, for sampling graphical networks, instead of traditional MCMC techniques.
Technology:
Lasso multitask; logistic regression; Multinomial/Bernoulli Naive Bayes for text classification/NLP; CNN, word embeddings; NLTK and pandas for NLP; Naive Bayes incremental learning; PCA and clustering techniques for feature extraction; scikit-learn package; cx_Oracle and database applications using Python and pandas; Hadoop, Spark with PySpark/Scala, Hive; Caffe, TensorFlow, Keras, Django, R, Tableau, Ab Initio;
k-nearest neighbor, decision tree, Bayesian classification, support vector machine, neural networks, genetic algorithms, self-organizing feature maps, etc.
7
Dec 2014 - Mar 2015
Visiting Researcher, Sprott School of Business
Project: big data for supply chain analysis and management (4 months)
Client: Sprott School of Business
Subject: big data for supply chain analysis and management
Objective: big data platform for supply chain network analysis
Scope: analyze web information on Canadian products, manufacturers, and supply chains; build a big data platform for supply chain management that supports university research projects
Tools: Java/javascript, python, web crawling library
Project period: Dec 2014 to March 2015
Exact dates: Dec 1, 2014, to March 10, 2015
Role: visiting researcher
Tasks and technical details:
Big data analytics and supply chain management; web crawling for large supply chain networks; graph and network analysis; statistical Markov Chain Monte Carlo; parallel MCMC for web crawling; developed advanced techniques for community detection/security;
Semantic web using RDF, Schema, microdata, ANY23; big data analytics using AWS EC2, S3, SQS;
Technology:
Web crawling methods; statistical Markov Chain Monte Carlo; Python; Weka, R, java/javascript, C++, ruby; AWS EC2, S3, SQS, ANY23, RDF, n-quads;
Informatica BDM/IDL/EIC/IDQ, Cassandra, Hadoop, Neo4j.
6
Sept 2014 –Nov 2014
Senior Machine Learning Specialist, TellSpec, Toronto
Project: food ingredient detection using machine learning techniques (3 months)
Client: TellSpec
Subject: food ingredient detection using machine learning techniques
Context: TellSpec is a biotechnology company that has developed a handheld device to scan the surface of food and obtain spectral data about its ingredients. The data science team at TellSpec applied ML techniques to build models that identify 25 food ingredients such as eggs, sugar, and protein. This technique may help people eat healthily and safely by choosing the right food and avoiding harmful and allergenic ingredients. The ML challenge is how to efficiently and effectively build models for multiple targets with an accuracy that meets the strict requirements of practical application; the performance obtained so far was below expectations because training data sampled from practical environments is subject to variance and noise.
Objectives: design and build robust ML algorithms for food ingredient detection for 25 targets
Scope: prototype design and implementation of ML algorithms, given training samples collected from different brands of bread in a real environment.
Tools: Python, pandas, scikit-learn
Project period: Sept 12, 2014 to Nov 30, 2014
Role: senior machine learning specialist
Task and technical details:
Food component analysis: macronutrients (calories, carbohydrates, proteins) and sugars (fructose, maltose, sucrose) using machine learning for chemometrics;
Developed advanced regression/classification machine learning algorithms for precisely predicting/detecting food ingredients;
Linear regression/generalized linear regression; PCA/PLS; Stacked Regression/Stacked PCA/PLS; Lasso L1/L2; feature extraction, using Weka and Python with scikit-learn;
Technology:
Generalized Linear regression, PCA/PLS2/PLS1, Stacked Regression, Lasso L1/L2; Semi-supervised/supervised learning;
Weka, Python, scikit-learn, MatLab: statistics and machine learning toolboxes, neural networks, PCA; algorithms: k-nearest neighbor, decision tree, Bayesian classification, support vector machine, genetic algorithms, self-organizing feature maps
Spark/pyspark/Java/scala; Hive, Hadoop, NoSQL, MongoDB
5
Jul 2013– Aug 2014
Freelance data analytics consultant with GIRIH, Ottawa
Project: sampling graphical networks by conditional independence coupling of Markov chains (13 months)
Work on Markov Chain Monte Carlo for scalable parallel computation for big data analysis; and MapReduce/Hadoop;
Work on anomaly detection techniques using kernel decomposition and Maximum Likelihood Estimation with the statistical packages R, Weka, and RapidMiner;
Sentiment analysis using Weka and MatLab
Technology:
Pajek, networkx, Neo4j for social network analysis;
Kernel learning; Maximum Likelihood Estimation; One-class Support Vector Machine;
Markov Chain Monte Carlo, MapReduce/Hadoop;
Java/JavaScript, Weka, R, C++, SQL/Oracle/Hive, MatLab: events and listeners, control system, fuzzy logic, image processing, neural networks, statistics and machine learning toolboxes, etc.
4
Jun 2012 – Jun 2013
Postdoctoral Researcher, Computer Science, University of Ottawa
Project: sampling social networks using new Markov Chain Monte Carlo techniques (12 months)
Client: Girih company
Contact: Nathalie Japkowicz <********@********.***>
Context: the local social media company Girih launched a research project to develop a new technique for sampling social networks such as Facebook and Twitter; the resulting samples would be used for social network analysis. Previous sampling algorithms, such as random walk and Metropolis-Hastings MCMC, produce biased samples with many redundant and duplicate entries.
Objectives: research and develop new network sampling techniques for uniform and unbiased sampling of social networks
Scope: research and develop new network sampling algorithms using MCMC methods, especially coupling techniques, for uniform and unbiased sampling of social networks such as Facebook and other types of social networks.
Role: postdoc researcher
Project period: (12 months) from June, 2012 to June 2013
Task and technical details:
Developed a new algorithm for uniform and unbiased sampling of online social networks as an alternative to traditional methods in applied statistics; the method overcomes drawbacks of the traditional methods such as slow mixing time and biased results.
Supported by NSERC Engage Grant and SME4SME Grant as a sole researcher; developed an innovative and unique algorithm which applies advanced Markov Chain Monte Carlo methods such as coupling techniques for sampling large graph networks;
Extended traditional coupling algorithms such as perfect sampling; conducted experiments by sampling online social networks such as Twitter and small social networks; results show that the algorithm is extremely efficient at producing unbiased samples.
Implemented the algorithm using Ruby, the Twitter API, DataMapper, SQLite, MySQL, and PostgreSQL; running environment: Unix/Linux, Amazon EC2; designed a web application using Rails with the MVC pattern for demonstration;
Performed social network analysis such as degree distribution, Centrality, Clustering coefficient, community detection. Initial research results have been published in IEEE ICDM International Workshop on Data Mining in Network, 2012.
Obtained a US patent for the initial research result as the original inventor.
Applied to Community detection and social media analysis using Python for NLP such as NLTK, SKLearn; RapidMiner, C++, R; tasks for text classification, sentiment analysis, term extraction.
Technology:
Advanced Markov Chain Monte Carlo (MCMC) methods and coupling techniques in applied statistics; Random Walk and Metropolis-Hastings algorithms; various convergence diagnosis methods such as the Geweke diagnostic; uniform and unbiased sampling; online social networks;
Pajek, networkx, Neo4j, Ruby, Ruby on Rails, MVC pattern, Twitter API, DataMapper, SQLite, MySQL, PostgreSQL;
Unix/Linux, Amazon Elastic Compute Cloud (Amazon EC2), EMR;
Python, NLTK, SKLearn, RapidMiner, C++, R; text classification, sentiment analysis, feature selection, term extraction.
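Illustrative sketch of the traditional baselines named in this project (simple random walk and Metropolis-Hastings random walk), not the coupling algorithm developed in the research. It assumes the network is given as an in-memory adjacency dictionary; the function name random_walk_sample and the toy graph are hypothetical. A simple random walk visits nodes in proportion to their degree, while the Metropolis-Hastings correction targets a uniform distribution over nodes.

```python
import random
from collections import Counter


def random_walk_sample(adj, start, steps, metropolis=True, seed=0):
    """Walk an undirected graph and return the list of visited nodes.

    adj        -- dict mapping each node to a list of its neighbours
    metropolis -- if True, accept a move from u to v with probability
                  min(1, deg(u) / deg(v)), which makes the stationary
                  distribution uniform over nodes; if False, always move
                  (simple random walk, biased toward high-degree nodes)
    """
    rng = random.Random(seed)
    node = start
    visited = []
    for _ in range(steps):
        candidate = rng.choice(adj[node])
        accept = len(adj[node]) / len(adj[candidate])
        if not metropolis or rng.random() < accept:
            node = candidate
        visited.append(node)      # a rejected move repeats the current node
    return visited


if __name__ == "__main__":
    # Toy graph: node 0 is a hub, nodes 3 and 4 are leaves.
    toy = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}
    for flag in (False, True):
        counts = Counter(random_walk_sample(toy, 0, 100000, metropolis=flag))
        shares = {n: round(c / 100000, 3) for n, c in sorted(counts.items())}
        print("metropolis" if flag else "simple", shares)
```

The repeated nodes produced by rejected moves are one source of the redundant and duplicate entries mentioned in the project context.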
3
Dec 2010 – Dec 2011
Postdoctoral Researcher, DRDC
Project: simulation of autonomous underwater vehicle, (12 months)
Client: DRDC
Contact: Nguyen, Bao <***.******@****-****.**.**>
Subject: simulation of autonomous underwater vehicle
Context: improve and upgrade tools for simulation of autonomous underwater vehicles. The project added new functionality to a previously developed tool written in VB; the results of the research project support optimal AUV operation for mine detection on the seabed.
Objectives: add new functionality such as different operation patterns of autonomous underwater vehicle and investigate and analyze the performance and accuracy improvement with new operation patterns
Scope: improve and upgrade VB tools for simulation of autonomous underwater vehicle with new operation patterns
Role: visiting postdoctoral researcher
Tools: VB and MatLab
Project period: (12 months) from Dec 2010 to Dec 2011
Level of effort: independent researcher under supervisor Bao Nguyen. In the meantime, my other research focused on ML algorithms for anomaly detection using SVM algorithms, and on ML for web security and complex network analysis; published a research paper at Canadian AI 2012.
Task and technical details:
Engaged in the development of the software tool for simulation of Autonomous Underwater Vehicles using MatLab and VB; Involved in research on Complex Dynamic Network Analysis; simulation using MatLab;
Utilized various artificial intelligence algorithms such as Genetic Algorithm (GA) in MatLab for simulating and computing the shortest path with the lowest cost in complex networks.
Implemented the Box-Muller algorithm to simulate mine distributions on the seabed as normal distributions; implemented algorithms to dynamically demonstrate the manipulation of Autonomous Underwater Vehicle (AUV).
Performed complex dynamic network analysis using various tools such as SNAP and Pajek; knowledge of the small world effect, degree distribution, degree correlation, centrality, clustering coefficient, community detection.
Developed a new fraud/anomaly detection algorithm using Java/javascript (J#/Eclipse), weka for machine learning, kernel methods, ensemble learning, one class learning; improved the traditional One-Class SVM algorithm implemented in LibSVM in Weka (a Java package for data mining and machine learning algorithms). This research work was published in Canadian Artificial Intelligence (AI) in 2012.
Designed one-class Naive Bayes algorithm for anomaly detection in big data using MapReduce/Hadoop, with Pig/Hive/HBase;
Also used anomaly detection techniques for information security by analyzing web log files and data transfer on WebSphere/WebLogic; EJB, AJAX, SOAP, XML, HTML5, NoSQL.
Technology:
Anomaly detection techniques, one-class learning algorithm, kernel methods, support vector machine algorithm;
Social networks; social network analysis; small world effect; degree distribution, community detection;
Big data; MapReduce/Hadoop, Pig/Hive/HBase, MatLab, VB, social network tools and package: SNAP and Pajek, Java(J#/Eclipse), SVM, Weka; EJB, WebSphere/Weblogic, AJAX, SOAP, XML, HTML5, NoSQL.
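Illustrative sketch of the Box-Muller step mentioned in the task list above, written in Python rather than the VB/MatLab of the original tool and not taken from it. It assumes mine positions are independent in x and y; the function names and the center and sigma parameters are hypothetical.

```python
import math
import random


def box_muller(rng):
    """Turn two uniform random numbers into two independent N(0, 1) values."""
    u1 = 1.0 - rng.random()            # in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def simulate_mines(n, center=(0.0, 0.0), sigma=(1.0, 1.0), seed=0):
    """Return n (x, y) mine positions, normally distributed around center."""
    rng = random.Random(seed)
    mines = []
    for _ in range(n):
        z0, z1 = box_muller(rng)       # one call gives both coordinates
        mines.append((center[0] + sigma[0] * z0, center[1] + sigma[1] * z1))
    return mines


if __name__ == "__main__":
    for x, y in simulate_mines(5, center=(100.0, 50.0), sigma=(10.0, 5.0)):
        print(f"{x:8.2f} {y:8.2f}")
```

Each call to box_muller yields a pair of values, so a two-dimensional mine field needs only one call per mine.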
2
Sept 1987-Feb 2001
Instructor, Department of Computer Science, Zhengzhou University
Employed in Department of Computer Science, Zhengzhou/HuangHe University;
Teaching courses including RDBMS; C/C++;
1
Jul 1993-Feb 2001
Software Engineer, INSTITUTE OF COMPUTER RESEARCH AND APPLICATION, ZHENG ZHOU, HENAN, CHINA
Project: 1) Future Transaction Remote platform; 2) Accounting System (84 months)
Context: develop commercial software for financial industry applications
Role: leader, software engineer
Tools: PowerBuilder, VB, C++, Oracle database, MySQL
Project period: from July 1993 to Feb 2001
Tasks and technical details:
A part-time position; I was responsible for software development for practical applications;
As a project leader, I was involved in developing the Remote Exchange System for futures trading, mainly using TCP/IP, C/C++, SQL Server, Unix/Linux, 1999 – 2001;
As a sole developer, I developed the Accounting System for futures trading, mainly using PowerBuilder and SQL Server, 1994 - 1998; I designed and implemented and tested