YU XIAO
adki9x@r.postjobfree.com 858-***-****
Herndon, VA 20171
SUMMARY
A results-driven, analytical Data Science Engineer who thinks outside the box. Avid Python developer. Passionate about using machine learning algorithms to solve real-world problems.
EDUCATION
Georgetown University September 2017 - May 2019
Master of Science, Computer Science
Relevant Coursework: Neural Nets and Deep Learning, Text Mining & Analysis, NLP for Data Analytics, Empirical Methods in NLP, Massive Data Foundation, Web Search and Sense Making, Streaming Algorithm
University of California, San Diego September 2013 - March 2017
Bachelor of Science, Mathematics-Computer Science
Relevant Coursework: Advanced Data Structures, Design & Analysis of Algorithm, Theory of Computation
PROFESSIONAL EXPERIENCE
Hitachi Vantara, Herndon, Virginia
Data Science Engineer, June 2019 – Present
• Developed a computer vision solution that automates defect detection in sewer pipeline inspection videos, taking it from the POC phase through commercialization.
• Designed and implemented customized evaluation metrics to analyze the performance of various models such as YOLO, which helped determine the optimal model for the commercial product.
• Built binary classification models and multiclass classification models to reduce false positive rates and increase overall prediction accuracy by 10%.
• Managed a team of five data annotators to prepare over 20,000 raw videos. Built a data pipeline by setting up the Computer Vision Annotation Tool (CVAT) on AWS EC2, which substantially increased team productivity. Extracted data and generated visualizations for internal reviews and client demos.
• Implemented various product features in Python: created an interactive Swagger API console to initiate the model inference pipeline; extracted timestamps from videos using AWS Textract and smoothed the data using isotonic regression; implemented business rules proposed by customers and generated customer-friendly reports; and automated the download and upload of results to AWS and Azure.
REAN Cloud, Herndon, Virginia
Data Engineer Intern, May 2018 – August 2018
• Developed a POC project that classifies chest X-ray images with 14 labels. Used Keras to build a CNN model for multiclass classification that reached state-of-the-art AUROC scores. The project was ultimately selected as one of the showcases demonstrating the team's machine learning capability.
• Implemented a new product feature that performs binary sentiment classification of tweets. Collected tweets from the Twitter Firehose API and preprocessed the data.
TECHNICAL SKILLS
• Programming Languages: Python, Java, SQL, Bash
• Machine Learning Frameworks and Libraries: Pandas, Scikit-learn, NumPy, OpenCV, Keras
• Data Visualization: Tableau, Plotly, Amazon QuickSight
• Cloud Services: AWS (S3, EC2, SageMaker, Textract, Lambda), Azure (Blob Storage, Computer Vision API)