Data Analyst Medical

Location:
Dallas, TX
Posted:
March 07, 2020

Resume:

Guocun Huang

Statistician/Data Analyst

Contact

**** ***** **** #***

Dallas, TX, 75205

469-***-****

adb6v5@r.postjobfree.com

American Citizen

Professional Summary

Data science graduate with over 15 years of research experience applying statistical methods to data modeling and simulation, providing recommendations and justification to improve efficiency.

Skills

• R, SAS, SQL, Scala, Python, MATLAB, C++, Java

• Advanced statistical analysis, modeling, simulation

• Experimental design, spatial data analysis

• Regression models, multivariate and Bayesian data analysis, deep learning models

• Big data, Hadoop, Spark

• Prism 8 (similar to Tableau, Power BI)

• Photoshop CS6

• Microsoft Excel

• Microsoft PowerPoint

Work Experience

University of Texas Southwestern Medical Center, Dallas, TX

Data Research Scientist 06/2004 – 02/2020

• Designed and performed experiments; collected and analyzed data using statistical methods in collaboration with team members

• Developed hypotheses and models to explain data, and wrote manuscripts

• Communicated with journal editors to publish experimental reports

• Coached and mentored graduates in conducting experiments

Michigan State University, East Lansing, MI

Data Analyst 09/2001 – 05/2004

• Constructed plasmids and performed DNA transformation

• Collected data and wrote reports for the principal investigator

Education

University of Texas at Dallas

Richardson, TX

Master of Science in Statistics

(Concentration: Data Science)

Aug 2018 – May 2020

University of Chinese Academy of Sciences

Beijing, China

Ph.D. in Molecular Biology

Sep 1996 – Jun 2001

Projects

Statistical Method (SAS) 08/2019 – 12/2019

• Given a dataset of 90 independent river water samples with variables including season, river size, fluid velocity, chemical composition, and algae abundance, carried out appropriate hypothesis tests, reporting the hypotheses, test statistic, p-value, and conclusion for each.

• Implemented a Monte Carlo simulation to estimate the coverage probability of the standard 95% confidence interval for a proportion at n = 25, 50, and 100.

• Given a dependent variable Y and several independent X variables, fitted a linear regression model and carried out regression diagnostics. Selected the best model using the adjusted R² criterion and stepwise selection.

• Fitted a weighted least squares regression model and checked whether iterating the weight estimation improved the estimates.

• Applied robust regression using the bisquare weight function and MAD (for calculating scaled residuals) to dampen the influence of outlying cases.

Multivariate Analysis (R Language) 08/2019 – 12/2019

• Performed linear discriminant function analysis to classify future applicants based on GPA and GMAT scores.

• Calculated the plug-in (APER) and leave-one-out (AER) estimates of misclassification rate.

• Performed a factor analysis with varimax rotation from a given correlation matrix and computed the residual matrix for the fitted model.

• Performed a principal components analysis and compared it with the factor analysis.

• Used k-means and hierarchical clustering (with complete linkage) to analyze the life expectancy of people from different countries.

• Constructed a 95% confidence region (ellipsoid) for the difference of means between two groups. Constructed Bonferroni 95% simultaneous confidence intervals for the individual mean differences.

• Calculated Mahalanobis distances and reported any extreme distances that may represent multivariate outliers.

Machine Learning (Python) 05/2019 – 08/2019

• Selected the best parameter settings for deep learning with TensorFlow to increase learning efficiency.

• Executed the AdaBoost algorithm, computing the weak classifier, its error and weight, the normalization factor, the normalized probabilities, the boosted classifier, and the boosted classifier's error and its bound.

• Fitted a decision tree model.

Database Foundations for Business Analytics (SQL) 01/2019 – 05/2019

• Used nested queries (subqueries) to retrieve requested results.

Medical Research (Photoshop CS6 and MATLAB) 06/2004 – 02/2020

• Developed a method to measure changes in the oxidation-reduction (redox) level in mammalian cells by expressing a bacterial NADH-sensor protein.

• Showed that protein kinase A (PKA) is involved in regulation of the circadian clock.

• Increased the efficiency of generating knock-in and knock-out transgenic mice using CRISPR/Cas9.

• Demonstrated that a singularity mechanism exists in circadian regulation.

Conference Presentations

• A transcriptional metabolic sensor for studying the dynamics of NADH/NAD+ redox homeostasis in mammalian cells. The Society for Research on Biological Rhythms (SRBR), June 14-18, 2014, Big Sky, Montana.

• CNOT1 promotes phosphorylation of mammalian clock proteins via PKA. The Society for Research on Biological Rhythms (SRBR), May 21-25, 2016, Palm Harbor, Florida.

Publications

• More than 10 original papers involving model building, data analysis, and interpretation published in academic journals as corresponding author, first author, or co-author. Details can be provided upon request.

References

Available on request


