
Data Scientist Analysis

Location:
Cordova, TN
Posted:
June 12, 2024


Resume:

Xiaojun Sun

Phone: 901-***-**** Email: *******@*****.*** Linkedin: /in/xiao-jun-sun/

SUMMARY

Data Scientist with 4+ years of experience in data management and machine learning modeling in the healthcare industry. Excellent communicator and strong team player.

SKILLS

Python, NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, R, SQL, Linux, Git, GCP, Tableau, Neural Network, CNN, ANN, Random Forest, XGBoost, SVM, PCA, Time Series Forecasting, Clustering, Regression

EXPERIENCE

St. Jude Children's Research Hospital Memphis, TN

Data Scientist 01/2018 - 03/2023

Stem Cell Detector:

To help blood cell researchers in the Hematology department identify blood stem cells and accelerate new drug development, I developed a Blood Stem Cell Detector based on classification models. I collaborated with the research team at St. Jude Children's Research Hospital to collect data from animal models, then trained and optimized multiple models, including Logistic Regression, SVM, Random Forest, and XGBoost.

I found that XGBoost was the best-performing model, achieving a precision score of 75%, which was a 50% improvement over the traditional method. This resulted in cost savings of $2 million for the research facility.

Age Predictor:

To help cancer survivors estimate their age and monitor healthy versus unhealthy aging and disease risk, I developed a DNA Methylation Age Predictor based on a linear regression model. I collected DNA methylation (DNAm) sequencing data from St. Jude cancer survivors and built a multiple linear regression model on DNAm that accurately predicts chronological age. The model needs only 10% of the DNAm sequence required by the traditional model, saving 90% of sequencing costs and more than 90% of precious clinical samples.

University of Oklahoma Health Sciences Center Oklahoma City, OK

Research Associate Data Analyst 12/2015 - 01/2018

Cancer Gene Analysis:

While studying gene mutation profiles and the correlation between mutated genes and blood cancer, I identified a target gene, SHP2, and published a paper in Leukemia.

PROJECTS

Churn Predictive Model 01/2023 - 06/2023

Optimized several classification models (Logistic Regression, Random Forest, and XGBoost) to identify users who were likely to cancel their subscriptions. Developed marketing analytics platforms on top of these models to deliver business insights and actionable solutions for reducing churn. Achieved 90% prediction accuracy.

EDUCATION

Tsinghua University Beijing, China

Ph.D. in Biomedical Science 09/2009 - 01/2014


