Real world evidence Data Analyst

Location:
Newark, NJ
Posted:
October 22, 2023

Resume:

Lianlian (Jessie) Chen

**** * *** ******, ********, NJ, 07029

929-***-**** ad0j6z@r.postjobfree.com LinkedIn

EDUCATION

New York University New York, NY

Master of Science in Biostatistics Dec 2022

• GPA: 3.6/4.0

• Relevant Coursework: Analysis & Reporting, Biostatistics, Epidemiology, Machine Learning, Longitudinal Analysis, Survey Design, Probability and Statistics, Research Methods, Psychometric Analysis

Brandeis University Boston, MA

Bachelor of Arts in Psychology May 2019

• GPA: 3.5/4.0

SKILLS

• Programming: Python, R, MATLAB, SQL, SAS, SPSS, STATA

• Database & Platforms: Microsoft SQL Server, Azure SQL Database, Azure Synapse Workspace, Google BigQuery

• Visualization: Tableau, Microsoft Power BI, Shiny, ggplot2, tmap in R, matplotlib and Seaborn in Python

• ML Frameworks: PySpark, Scikit-Learn, Pandas, NumPy

• Other Skills: Microsoft Excel, Microsoft PowerPoint, Professional Writing

• Languages: English, French, Mandarin, Cantonese, Japanese

WORK EXPERIENCE

Lumanity New York, NY

Real World Evidence Data Analyst Apr 2023 – Present

• Collaborated with the PHARMO Institute to modernize legacy pharmacy data into the FHIR model, leveraging Azure Data Factory. Ran validation checks using Postman

• Conducted data science investigations and research projects on original pharmacy data. Utilized SQL to calculate the percentage of drug usage durations across various conditions. Enhanced this percentage by 30% through adjustments to the threshold, margin of difference in the original formula, and ATC codes

• Developed a procedure for creating a cohort study on Fibromyalgia in SQL Server. Established a workflow script in Azure Synapse Workspace using SparkR and Azure SQL Database

• Employed the “splink” package within Azure Synapse Workspace to perform record linkage and deduplicate pharmacy data. Utilized PySpark to execute logistic regression, random forest, and XGBoost to assess model accuracy

J&J Hisoftware Inc Atlanta, GA

Analyst Intern Jun 2022 – Aug 2022

• Created a comprehensive Neural Cognitive Diagnosis framework to assess students’ performance on test scores

• Implemented the aforementioned deep learning model in Python by incorporating both student-related and exercise-related factors through multiple neural layers

• Successfully deployed the neural network model to predict students’ knowledge scores. Demonstrated the effectiveness of this framework by establishing its accuracy and interpretability

NYU Langone New York, NY

Data Analyst Intern Jan 2022 – May 2022

• Designed a study to investigate the association between Hidradenitis Suppurativa (HS) and metabolic diseases. Data was collected from the Epic COSMO system, and odds ratios and 95% confidence intervals were calculated

• Authored reports on the development of Magnetic Resonance (MR) biomarkers for Neurodegeneration and the prevalence of HS using Machine Learning Analysis

• Explored the application of MR biomarkers in Alzheimer's patients and performed a comprehensive analysis using statistical methods, including Pearson correlation, jackknife analysis, linear regression, subgroup analysis, and one-way ANCOVA

• Applied multiple imputation with the mice package in R to fill in missing data and evaluated the results by correlation analysis

IQVIA Shanghai, CN

Consultant Intern May 2021 – Sept 2021

• Initiated an insurance coverage plan with clients to facilitate their application for local government permits and incentive plans, ultimately securing final approval

• Created a corporate database using Excel's PivotTable feature. This involved generating a frequency table for over 300 insurance companies from raw data

• Organized a catalog of anticarcinogenic drugs for clients and evaluated the average annual costs for various drugs

RESEARCH PROJECTS

Machine Learning on Heart Attack Project New York, NY

Research Assistant Jan 2022 – May 2022

• Quantified a dataset related to cardiovascular health to demonstrate that health-related features such as BMI, cholesterol (CHOL), and thalassemia (THAL) are viable predictors of heart attacks

• Compared model performance using error size, LOOCV, and 5-fold CV values, and applied forward/backward selection with AIC/BIC criteria to select significant predictors from the best-performing model


