
Data Engineer

Location:
Boston, MA
Salary:
80000
Posted:
June 19, 2025


Resume:

XIAOYANG FEI

+1-540-***-**** ************@*****.*** Boston, MA, USA linkedin.com/in/shawnfei/

PROFESSIONAL EXPERIENCE

Doggo Onboard Adventures LLC Boston, MA, USA

Data Engineer September 2023 - Present

• Led the design and implementation of a modular, event-driven data architecture with Airflow, AWS Lambda, and S3, automating travel data processing and powering 3 analytics products used in executive planning.

• Architected and governed a scalable data modeling layer using DBT and Redshift, turning fragmented data into reliable tables that powered self-serve dashboards across teams and sped up decision-making.

• Established a data quality assurance strategy using Great Expectations, integrated into GitHub Actions CI/CD pipelines and monitored through AWS CloudWatch, ensuring 100% reliability across production datasets and reducing time-to-detection of pipeline issues.

• Acted as a bridge between engineering and business, partnering with analytics and revenue teams to define key data domains, enabling segmentation strategies, demand forecasting, and optimization of underperforming cruise routes.

• Spearheaded the internal push for data observability and incident readiness, reducing pipeline downtime by 45% and enhancing stakeholder trust through transparency and accountability metrics.

Doggo Onboard Adventures LLC Boston, MA, USA

Data Engineer Intern July 2023 - September 2023

• Wrote and optimized MySQL queries to clean and transform raw cruise and passenger data for reporting and analysis in Amazon Redshift.

• Consolidated three disparate data sources into a unified data lake using Python scripting, streamlining data access for the Business Intelligence team and reducing data silos across the organization.

• Validated the accuracy of DBT-generated analytics tables by developing a suite of 20+ automated SQL-based tests, reducing data discrepancies in customer segmentation by 15%.

• Configured 15+ Airflow DAGs and AWS Lambda functions for daily and weekly data processing, ensuring 99.99% data availability and enabling real-time decision-making for stakeholders across three departments.

Yiming Technology Beijing, China

Machine Learning Engineer Intern May 2023 - July 2023

• Developed a predictive model for the company’s coffee roaster, optimizing roasting parameters using real-time IoT sensor data to control temperature, wind speed, and other factors for optimal flavor, consistency, and quality.

• Increased coffee roasting consistency by 34% and cupping scores (evaluated by professional coffee tasters) by 26.7% compared to human operation by deploying Temporal Fusion Transformer models for real-time quality control.

• Deployed models on AWS (S3, EC2, RDS, SageMaker, Lambda) with CI/CD pipelines, enabling seamless retraining, efficient resource management, and scalable deployment for continuous optimization and performance improvement.

• Forged collaborative relationships with engineers and product managers, crafting API designs and comprehensive documentation that supported 10+ successful integrations within six months.

Virginia Tech Transportation Institute Blacksburg, VA, USA

Data Scientist January 2023 - May 2023

• Constructed an AI-driven driver emotion recognition system, decreasing false positives by 22% using advanced analysis correlating crash reports with driver affect and surroundings in real-time.

• Integrated CNN-LSTM architecture within a real-time emotion detection model, processing 30 frames per second with a latency of less than 100ms, enabling immediate feedback and intervention strategies for distracted drivers.

• Delivered a data-backed presentation to Tesla using PowerBI, showcasing a 15% improvement in driver reaction time due to the model and leading to exploration of model integration in driver safety protocols.

EDUCATION

Northeastern University September 2023 - May 2025

Master's, Data Science

Virginia Tech August 2019 - May 2023

Bachelor's, Data Science

CERTIFICATIONS

AWS Certified Data Engineer - Associate

Google Cloud Professional Data Engineer

SnowPro® Advanced: Data Engineer

IBM Data Engineering Professional Certificate

Databricks Certified Data Engineer Associate

Terraform Associate

MySQL 8.0 Database Admin Professional

MySQL 8.0 Database Developer Oracle Certified Professional

SKILLS

Languages & Frameworks: Python (Pandas, NumPy, Scikit-learn, TensorFlow), SQL (MySQL, PostgreSQL, Redshift), R, Node.js, Bash

Data Engineering & Infrastructure: Airflow (ETL orchestration), DBT (data modeling & testing), Great Expectations, Redis, Pinecone, Data Lakes, Data Warehousing (Redshift, Snowflake), MongoDB, PostgreSQL

Machine Learning & Modeling: Supervised & Unsupervised Learning, Clustering, Regression, Classification, A/B Testing, Time Series Forecasting, Real-time Inference, Sensor Fusion, Deep Learning (CNNs, LSTMs, Transformers)

Cloud Platforms & MLOps: AWS (S3, EC2, RDS, SageMaker, Lambda, CloudWatch), Docker, API Deployment, GCP, Azure

Visualization & Tools: PowerBI, Jupyter, Git, Postman, JIRA


