Data Analyst

Location:
Rochester, NY
Posted:
November 03, 2020

Resume:

Trang Nguyen 616-***-**** https://trangun.com/ https://github.com/trangun adhiux@r.postjobfree.com

Education

Master of Science in Data Science, University of Rochester, Rochester, NY Expected December 2020

-Interests: Statistics, Artificial Intelligence, Machine Learning

Bachelor of Science in Statistics, Grand Valley State University (GVSU), Allendale, MI April 2019

Minors in Data Science and Computer Science

-Mu Sigma Rho - National Honorary Society

-Member of American Statistical Association

Work Experience

Data Analyst, School of Nursing, University of Rochester, Rochester, NY Oct 2019 – Present

-Perform data analysis tasks, such as analyzing the incomplete-applicants dataset for patterns that explain why these applicants never submitted their applications.

-Collect and clean admission data from multiple colleges in the U.S.

Data Science Intern, System (Stealth Mode), New York City, NY June 2020 – September 2020

-Worked on medical and public health data and performed exploratory data analysis to explain the relationships among attributes.

-Built and trained a variety of machine learning models suited to each dataset, such as Random Forest, Lasso Regression, and Linear Regression, to address complex global issues (COVID-19).

-Worked in an agile environment and gained experience with Jira

Machine Learning Intern, STEMAway, Remote June 2020 – July 2020

-Created a web crawler using Scrapy to extract data from forums, and performed data cleaning and analysis using packages such as pandas, matplotlib, and seaborn.

-Developed a forum classifier using the BERT natural language processing (NLP) model to predict the forum for any given post.

Math and Stats Tutor, Math and Stats Center, GVSU August 2018 – April 2019

-Assisted students in areas such as Intermediate Algebra, Calculus I & II, and Applied Statistics, and created tailored lesson plans and study guides on the subject matter.

-Collaborated with students to complete homework assignments, identified lagging skills, and helped them improve on weaknesses.

Coursework and Projects

Machine Learning, Machine Vision, Intermediate Statistical Method, Time Series, Artificial Intelligence, Computing in Statistics, Design of Experiments, Discrete Structure, Data Mining, Databases, Data Structures and Algorithms, Computer Science I & II

Action Actor Classification (Python) Spring 2020

-Built deep learning models on Google Cloud Platform to predict the actor and action classes in each frame of the A2D dataset, which contains 3782 videos from YouTube

-Used a pretrained ResNet-152 model, froze selected layers, and applied a stepwise learning rate

-Python libraries: torch, torchvision, utils, argparse, os

Time Series Project (Python) Fall 2019

-Developed a forecasting model using Seasonal ARIMA to accurately predict trends for future years from 7 years of monthly data on airline miles flown in the United Kingdom.

-Python libraries: statsmodels, pandas, numpy, matplotlib

Investigating Effective Techniques on Shared Economy Lodging in NYC (Python) Fall 2019

-Developed a model to determine popular keywords in comments and titles in Airbnb data using n-gram models and FP-growth.

-Python libraries: spacy, pyfpgrowth, gensim, nltk, pandas

Rapids Bus System (SQL) Fall 2018

-In a team of 4, designed a bus system for Grand Rapids, MI using an ER diagram and a relational schema.

-Used SQL DDL/DML to verify and test the designed system.

Clean the Fatal Accident Reporting System Data (SAS) Fall 2017

-Combined, transformed, and formatted data sets. Removed duplicated and irrelevant data. Created tables and charts from the cleaned data using SAS.

Competitions

University of Rochester Biomedical Data Science Hackathon Summer 2020

-First Place Winner in Open Division.

-Worked in a group of 4, using Python to clean data from a prospective multi-year clinical translational study and predict disease severity scores with Multivariate Imputation by Chained Equations (MICE) and Lasso Regression.

Data Fest 2019, Statistics Department, GVSU March 2019

-Competed in a group of 5. Used R to clean and analyze a large dataset provided by the Canada Women’s National Rugby Sevens Team, which contained daily health records from team members.

-Implemented Principal Component Analysis and Factor Analysis to estimate the win/loss rate based on predictors.

Data Fest 2018, Statistics Department, GVSU March 2018

-Recipient of Best Visualization Award.

-Competed in a group of 5; cleaned and analyzed a large dataset provided by Indeed that contained job information by U.S. state from 2015 to 2016.

-Using SAS, determined which jobs were most in demand by state and season, and developed and presented 8 choropleth maps of the U.S. showing the most in-demand jobs by state and season.

Skills

-Programming Languages: Python, R, SQL, Java, C++, SAS, Databricks Spark, SPSS

-Frameworks: Pandas, numpy, SciPy, sklearn, matplotlib, plotly, seaborn, pytorch, tensorflow, etc.

-Mathematical Models: Linear Regression, Logistic Regression, Random Forests, Statistical Analysis, etc.

-Platforms: Google Cloud, Microsoft Azure, Amazon Web Services, Jupyter Lab, Google Colab


