Data Science Engineer

Location:
Herndon, VA
Posted:
February 27, 2021


Resume:

YU XIAO

adki9x@r.postjobfree.com 858-***-****

Herndon, VA 20171

SUMMARY

A results-driven and analytical Data Science Engineer who thinks “outside of the box”. Avid Python developer. Passionate about utilizing machine learning algorithms to solve real-life problems.

EDUCATION

Georgetown University September 2017 - May 2019

Master of Science, Computer Science

Relevant Coursework: Neural Nets and Deep Learning, Text Mining & Analysis, NLP for Data Analytics, Empirical Methods in NLP, Massive Data Foundation, Web Search and Sense Making, Streaming Algorithm

University of California, San Diego September 2013 - March 2017

Bachelor of Science, Mathematics-Computer Science

Relevant Coursework: Advanced Data Structures, Design & Analysis of Algorithm, Theory of Computation

PROFESSIONAL EXPERIENCE

Hitachi Vantara, Herndon, Virginia

Data Science Engineer, June 2019 – Present

• Developed a Computer Vision solution that automates defect detection in sewer pipeline inspection videos, taking it from the proof-of-concept (POC) phase through commercialization.

• Designed and implemented customized evaluation metrics to compare the performance of various models, such as YOLO, which helped identify the optimal model for commercial products.

• Built binary classification models and multiclass classification models to reduce false positive rates and increase overall prediction accuracy by 10%.

• Managed a team of five data annotators to prepare over 20,000 raw videos. Built a data pipeline by setting up the Computer Vision Annotation Tool (CVAT) on AWS EC2, which substantially increased team productivity. Extracted data and generated visualizations for internal reviews and client demos.

• Implemented various product features in Python: created an interactive Swagger API console to initiate the model inference pipeline; extracted timestamps from videos using AWS Textract and smoothed the data using isotonic regression; implemented business rules proposed by customers and generated customer-friendly reports; and automated the download and upload of results to AWS and Azure.

REAN Cloud, Herndon, Virginia

Data Engineer Intern, May 2018 – August 2018

• Developed a POC project that classifies chest X-ray images with 14 labels. Used Keras to build a CNN model for multiclass classification that reached state-of-the-art AUROC scores. This project was ultimately selected as one of the showcases demonstrating the team's machine learning capability.

• Implemented a new product feature that classifies the binary sentiment orientation of tweets. Collected tweets from the Twitter Firehose API and preprocessed the data.

TECHNICAL SKILLS

• Programming Languages: Python, Java, SQL, Bash

• Machine Learning Frameworks and Libraries: Pandas, Scikit-learn, NumPy, OpenCV, Keras

• Data Visualization: Tableau, Plotly, Amazon QuickSight

• Cloud Services: AWS (S3, EC2, SageMaker, Textract, Lambda), Azure (Blob Storage, Computer Vision API)


