Power Plant Data

Location:
Hicksville, NY, 11801
Posted:
June 12, 2013

Resume:

Lili Cheng

Email: ab9ct7@r.postjobfree.com Phone: 914-***-****

Background: 5+ years of experience in data analysis, statistical techniques and software tools (SAS, R, Matlab, Minitab), time series analysis and forecasting models, and data mining and machine learning methodologies. 5+ years of extensive software development experience in Java, C++, SQL, and Visual C#. 5+ years of experience in UNIX/LINUX system administration and shell scripting. Data analysis and solutions development for several industries, including finance, energy, genomics, and image recognition. Reasonable experience with big data platforms and analysis (Hadoop, MapReduce).

Data Analysis Skills

. 5+ years of experience with statistical and mathematical methodologies such as forecasting (including demand forecasting), time series analysis, and regression, applied in domains such as computational biology and finance.

. Extensive data analysis using statistical tools including SAS, R, Matlab, Minitab, Treeview, and Cluster.

. Extensive knowledge and real-world experience in applying machine learning and data mining approaches to data analysis.

Software Development Skills

. 5+ years of experience in software architecture and application development in Java, C++, SQL, Visual C# .NET, Visual C++, and Visual Basic.

. 5+ years of experience in UNIX/LINUX system administration, shell scripting, and software development.

. Significant experience developing solutions for domains including image recognition, finance, and computational biology, using artificial intelligence, data mining, statistical, and machine learning methods.

. Reasonable experience with big data platforms and programming (Hadoop, MapReduce).

Experience

Data Scientist, Novisync Solutions, Jan. 2013-present

. Big Data Search Optimization

- Developed Hadoop-based web and text analytics optimization for a big data application. Designed and adapted the PageRank algorithm with optimizations. Configured and managed Hadoop and HDFS clusters and developed MapReduce-based algorithms in a UNIX/LINUX environment.

Data Researcher, Department of Statistics and Mathematics, University of Massachusetts Amherst, Sep. 2006-May 2008

. Gene Data Analysis using Hierarchical Clustering

- Experience in genomics with large-scale gene data set analysis. The goal was to find similar DNA micro-arrays efficiently in a very large gene data set. An agglomerative hierarchical clustering algorithm was chosen after experimenting with several clustering and pattern-matching algorithms. Also experimented with several distance metrics and linkage methods to find the most efficient implementation of the clustering algorithm. Implemented in R, Treeview, and Cluster in a Windows environment.

. Bankruptcy Prediction using Measurement Error Analysis

- Experience with risk assessment using financial data. The goal was to predict the probability that a company goes bankrupt based on its cash-flow and debt history. Learning algorithms were used to extract patterns from a large set of labeled historical financial data of companies. Measurement error (M.E.) correction and bootstrapping analysis were then used to identify patterns in the financial data of the target company. Designed and implemented in SAS in a UNIX environment.

. Human Face Recognition using Support Vector Machine Algorithm, Microsoft Research in China, Oct. 2004-Apr. 2005

- Significant experience with digital image processing for face recognition. The goal was to recognize specific human faces in live video streams from airport civilian cameras. Various digital image processing methods were used to prepare and enhance the raw images. An SVM-based algorithm was employed in the recognition process to handle the high feature dimensionality of human face images. Implemented in C# with several image processing libraries in a Windows environment.

Data Research Analyst, Hai Dian Hospital, Nov. 2002-Mar. 2003

. Sediment Detection in Medical Images using Edge Detection and Pattern Recognition Algorithms

- The goal was to automatically detect and count sediments of specific shapes in urine sample images. Various image enhancement, edge detection, and pattern recognition algorithms were employed. Implemented in Visual C++ in a Windows environment.

Data Analyst, Huaneng Power Plant, Oct. 2000-May 2001

. Prediction of Price on Electrical Power Market using Time Series Analysis

- The goal was to predict electricity prices and analyze real-time power generation cost to help Huaneng Power Plant, a major energy company in China, bid on the electrical power market. An ARIMA time series model was used for electricity price prediction. A detailed model of the entire power generation process was built, and live data from the enterprise database was streamed into the model to calculate generation cost. Finally, the results were aggregated in a cost-profit analysis and bidding assistance system. Implemented in Visual Basic, C, and SQL. Worked with a large-scale enterprise database in SQL Server 2000.

Education

University of Massachusetts Amherst, MA

Master of Science in Mathematics and Statistics (GPA: 3.9/4.0)

- Thesis: Hierarchical Cluster Analysis in Micro-array Experiments

Beihang University Beijing, China

Master of Engineering in Pattern Recognition and Artificial Intelligence

- Thesis: Human Face Recognition Based on Support Vector Machine

Harbin Institute of Technology Harbin, China

Bachelor of Engineering in Computer Science (Distinguished Graduate)


