XIAODONG LIAN, PH.D.
*** ***** ******* ****, *** #228, Lafayette, LA 70503
Mobile: 337-***-****
Email: ********@*****.***
SUMMARY OF QUALIFICATIONS
ADVANCED STATISTICAL SKILLS
• SAS Base certification.
• Solid theoretical foundation in probability, statistics, and applied mathematics.
• Proficiency in multivariate analysis, including PCA, factor analysis, and cluster
analysis.
• In-depth knowledge of ANOVA, MANOVA, and experimental design techniques.
• Proficiency in statistical computation, simulation, and Monte Carlo methods.
• Proficiency in logistic regression and linear regression.
PROJECTS CONDUCTED USING SAS (for details, see Appendix)
• Comparison of two mean vectors: A study of the Kite case
• Linear discriminant analysis of breakfast cereals
• Cluster analysis of breakfast cereals
• Canonical correlation analysis of the diabetes case
• A study of the relationship between muscle mass and age
• Model selection: A study of the electricity load
• Regression models with three measured explanatory variables: A study of Cheddar Cheese
COMPUTER & PROGRAMMING SKILLS
• Expertise in statistical applications and analytical tools such as SAS, SQL, Minitab, R, MATLAB,
and MS Excel.
• Proficiency in programming: Visual Fortran 90/95 and the IMSL libraries.
• Proficiency in text formatting packages: LaTeX.
• Strong PC skills with knowledge of Microsoft suite of products including MS Word, MS Excel, MS PowerPoint, and various Internet/Networking applications.
• Proficiency in operating systems: Windows and Unix.
WORKING STYLE
• Goal-oriented, highly motivated self-starter with a strong orientation to customer service.
• Exceptional analytical and problem-solving skills.
• Confident and analytical by nature; fast learner, hard-working, and a strong team player.
• Effective interpersonal and written/verbal communication skills.
PROFESSIONAL EXPERIENCE
University of Louisiana at Lafayette, Louisiana, USA
TA Instructor, Department of Mathematics (Spring 2006 - Present)
• Taught undergraduate courses in mathematics and statistics, from freshman to senior level.
Teaching Assistant, Department of Mathematics (August 2005 - December 2005)
• Graded assignments and tests for the Ordinary Differential Equations course.
University of Kentucky, Lexington, Kentucky, USA
Teaching Assistant, Center for Manufacturing (August 2003 - December 2004)
• Taught lab sessions and graded assignments, reports, and test papers.
T D & M Precision Tools Company, Singapore
Mechanical Engineer (July 1997 - January 2001)
• Designed machining processes for plastic mold components.
• Designed electrodes and wrote programs for computer-based machines.
Tianjin Locomotive and Rolling Stock Machinery Works, China
Mechanical Engineer (July 1992 - June 1997)
• Designed and optimized machining processes for turbochargers.
• Wrote programs for computer-based machines.
EDUCATION
• Ph.D. in Statistics, University of Louisiana at Lafayette, LA, August 2011
Dissertation: Tolerance Intervals for Some Linear Models;
Advisor: Dr. K. Krishnamoorthy
• M.S. in Statistics, University of Louisiana at Lafayette, LA, December 2008
• M.S. in Manufacturing Systems, University of Kentucky, KY, December 2004
• B.S. in Mechanical Engineering, Southwest Jiaotong University, China, July 1992
PUBLICATIONS
• “Tolerance Intervals for the Distribution of the Difference between Two Independent Normal Random Variables.” Communications in Statistics – Theory and Methods, 40, 117-129.
Co-authors: K. Krishnamoorthy and Sumona Mondal.
• “Closed-Form Approximate Tolerance Intervals for Some General Linear Models and Comparison Studies.” To appear in Journal of Statistical Computation and Simulation (2011).
Co-author: K. Krishnamoorthy.
REFERENCES
References are available upon request.
APPENDIX
Comparison of two mean vectors: A study of the Kite case
The tail length (in mm) and the wing length (in mm) were measured for samples of 45 female and 45 male hook-billed kites. We tested the hypothesis of equal population mean vectors using the two-sample Hotelling T-squared test and found, at the usual significance levels, sufficient evidence to conclude that the mean vectors of the female and male populations differ. Simultaneous 95% confidence intervals were constructed for the differences in mean tail length and mean wing length between the two populations.
Linear discriminant analysis of breakfast cereals
In this project, we conducted a linear discriminant analysis of 43 brands of breakfast cereals produced by three American manufacturers: General Mills (G), Kellogg (K), and Quaker (Q). The analysis was based on eight characteristics: calories, protein, fat, sodium, fiber, carbohydrates, sugar, and potassium. The allocation rule was developed to minimize the total probability of misclassification, and a new brand of breakfast cereal can be classified using this rule. The discriminant functions gave no indication that any manufacturer was associated with more “nutritional” cereals (high protein, low fat, high fiber, and low sugar).
Cluster analysis of breakfast cereals
Cluster analysis is a classification method used to arrange a set of cases into clusters. In this project, we carried out a cluster analysis of breakfast cereals using the single linkage (minimum distance), complete linkage (maximum distance), average linkage (average distance), and centroid clustering methods. The 43 breakfast cereals were grouped into a reasonably small number of natural clusters on the basis of the eight measured characteristics: calories, protein, fat, sodium, fiber, carbohydrates, sugar, and potassium. The data are taken from Johnson and Wichern, Applied Multivariate Statistical Analysis. Our conclusions were as follows. First, the single linkage, average linkage, complete linkage, and centroid methods produce almost the same sub-clusters in the first several stages, although the final two clusters differ between complete linkage and the other three methods. Second, the intermediate clusters vary among the four methods. Third, common values (ties) in the distances can produce multiple solutions to a hierarchical clustering problem.
Canonical correlation analysis of the diabetes case
Canonical correlation analysis extends multiple regression and correlation analysis: it focuses on the correlation between a linear combination of the variables in one set and a linear combination of the variables in another set. In this project, we applied canonical correlation analysis to the diabetes case. The two baseline variables are “relative weight of the person” and “person’s fasting glucose level”, and the three laboratory variables are “glucose area”, “insulin area”, and “steady glucose level.” There are three groups of people: “normal”, “chemical diabetics”, and “overt diabetics”. We treated these three groups separately and applied canonical correlation analysis to explore the correlations between the baseline variables and the laboratory variables.
A study of the relationship between muscle mass and age
The scatter plot suggested that the relationship between muscle mass and age could be modeled by a linear regression line. We built a linear regression model to predict a person’s muscle mass from age.
Model selection: A study of the electricity load
Kentucky Utilities Company wanted to formulate a model to predict daily electricity load from average daily temperature. We assumed that the rate of change in electricity load with respect to temperature is constant, while the overall load may differ between weekdays and weekends. We selected the reduced model with three parallel lines (weekdays, Saturdays, and Sundays) to predict daily electricity load from the average temperature.
Regression models with three measured explanatory variables: A study of Cheddar Cheese
The main goal of this study was to examine the dependence of the response variable ‘taste’ on the three explanatory variables ‘acetic’, ‘H2S’ (hydrogen sulfide), and ‘lactic’. Based on hypothesis tests and an analysis of confidence intervals and residuals, H2S and lactic are needed in the regression model, and acetic is not.