Hongxia Dai, Ph.D.
adb74r@r.postjobfree.com 540-***-**** Los Angeles, CA linkedin.com/in/hongxia-dai github.com/daihongxia
EDUCATION
Ph.D. in Physics
Virginia Tech
Dec 2019 Blacksburg, VA
B.Sc. in Physics
University of Science and Technology of China (USTC)
Jun 2014 Hefei, China
SKILLS
Python SQL C++ R
Bash Git Linux
Object-oriented Programming
Pandas Apache Spark
PyTorch Scikit-Learn
TensorFlow Keras
Computer Vision
Natural Language Processing
Data Visualization
Statistical Modeling
Signal Processing
Monte Carlo Simulation
Google Cloud Platform (GCP)
WORK EXPERIENCE
Research Assistant
Center for Neutrino Physics, Virginia Tech
May 2015 – Oct 2019 Blacksburg, VA
Published 6 papers in leading particle physics journals and presented on behalf of the collaboration at international conferences. See my Google Scholar: bit.ly/3a82aa5
Analyzed experimental data to extract nuclear cross-sections for different atoms, enabling accurate modeling of atomic nucleus structure;
Developed Monte Carlo simulation programs in C++ to predict the efficiency of particle detectors;
Mentored junior PhD researchers within the team.
Visiting Researcher
Thomas Jefferson National Accelerator Facility
Feb 2017 – May 2017 Newport News, VA
Participated in the experimental setup and took shifts monitoring data collection, working closely with hardware engineers to detect and report potential issues in the facility;
Developed and implemented a data processing workflow to analyze and visualize over 10 TB of raw data in Python and ROOT (a C++ data analysis framework).
CERTIFICATES
Deep Learning Specialization
deeplearning.ai, Coursera
CNN ResNets YOLO
RNN LSTM Word2Vec
Optimization And More
Advanced Data Science with IBM Specialization
IBM, Coursera
Spark Pipeline SystemML
DeepLearning4j And More
PROJECTS
Object Detection For Self-driving Car
Researched and trained CenterNet (a CNN-based model) using PyTorch on Google Cloud (GPU instances, cloud storage, etc.) for object detection and pose prediction of vehicles in photos;
Wrote a modular and portable Python pipeline for data preprocessing, training, and evaluation in computer vision tasks.
Judgements Legal Area Classification
Built long-document encoders with TF-IDF, LSTM, BERT, etc. Trained classifiers using Scikit-Learn and Hugging Face’s Transformers. Achieved a classification accuracy of 0.68 across 41 classes;
Packaged the project with PyScaffold.
311 NYC Service Requests Data Analysis And Web App
Analyzed the 311 requests dataset (12GB+) stored in AWS Redshift with SQL queries and Pandas. Visualized the data with Matplotlib, Seaborn, and Bokeh. Used a GloVe encoder to process texts, and clustered similar texts using K-Means and DBSCAN;
Built models to predict daily request counts (regression) and request types (50-class classification) using LightGBM, Logistic Regression, and Decision Tree, achieving an L1 loss of 7.7 and an accuracy of 0.63, respectively;
Built a web app for predicting daily 311 service requests, using Flask for the back end and React for the front end.