Resume

Data Analyst

Location:

Alameda, CA

Posted:

March 26, 2021

Resume:

BENJAMIN NGUYEN

**** **** ****** • Berkeley, CA, 94704 • 714-***-**** • adk7h2@r.postjobfree.com

Education

UNIVERSITY OF CALIFORNIA, BERKELEY May 2020

B.A. Data Science

Work Experience

CLINICAL DATA ANALYST — PRINCIPIA, A SANOFI COMPANY June 2020 — Present

• Acting chief engineer of data pipelines, algorithms, and visualizations for Data Management and Clinical Research Development at Sanofi’s South San Francisco branch.

• Managed clinical trials by developing a resource allocation model that determines the number of full-time employees to assign to concurrent and new studies and predicts the durations of potential future studies.

• Led development of code that automatically generates Patient Profiles, generalizing the code infrastructure to run efficiently across any study.

DATA SCIENTIST — CORNERSTONE AI March 2021 — Present

• Act as a consultant to identify, triage, and solve data science problems in ETL and app deployment, using custom JavaScript and complex SQL queries.

• Develop cross-functional, multi-page apps with interactive data graphs in Dash and Plotly, using Python, SQL, HTML, CSS, and JavaScript.

BIO-STATISTICS INTERN — PRINCIPIA BIOPHARMA June 2019 — August 2019

• Created a pipeline to automate and solve data-management problems regarding new patient data coming in every week for the company’s three largest studies.

• Pioneered a standardized, automated system using Python and R to generate individual Patient Profiles for Pemphigus patients treated with Rilzabrutinib.

• Successfully pitched the data-visualization platform Tableau to Principia by conducting a country analysis of two major studies, finding significant correlations in both and getting Tableau added to the budget at a larger scale.

Project Work

ANALYSIS OF PRESIDENT TRUMP’S TWITTER TWEETS

• Manipulated Twitter API Data on President Trump to draw self-directed conclusions based on an NLP sentiment analysis of controversial tweets.

YELP RATING PREDICTIONS with DEEP NEURAL NETWORKS

• Implemented multiple Deep Neural Network Models (RNN, LSTM, BERT) to attain optimal efficacy in predicting Yelp Ratings (>83% efficacy).

NEW YORK TAXI-RIDE REGRESSION MODEL AND EDA

• Built a regression model using a processing pipeline with Haversine distances and other features to predict the duration of taxi rides in New York with a mean absolute error of under 300 seconds.

• Used SQL for data querying and cleaning, Seaborn for complex data visualization, and scikit-learn to complete the regression model with feature engineering, cross-validation, and Tikhonov regularization.

Relevant Skills

Languages: Python, SQL, R, Java, JavaScript, Spark, Scheme

Libraries: TensorFlow, PyTorch, Pandas, SciPy, NumPy, Matlab, scikit-learn, Seaborn, ggplot2

Skills: Tableau, Advanced Jupyter Notebooks, Interactive Data Visualization, Data Management, Machine Learning, Neural Networks, NLP, Algorithms, Databases, Sampling, Statistical Analyses
