Rick Benson
Data Engineering Strategy Chief Architect, Data Engineering
rickbenson.code@gmail.com New Jersey, US GitHub LinkedIn
Summary
Data Engineer with 9 years of experience building scalable data platforms, cloud-native analytics pipelines, and
real-time streaming systems across fintech, healthcare, retail, and logistics. Skilled in leading global teams and
delivering cost-efficient, high-performance, and compliant data ecosystems. Expertise in AWS, Azure, and GCP
with hands-on work in Apache Spark, Flink, Kafka, dbt, Delta Lake, and Iceberg. Strong in data warehousing, data
mesh, and automation. Focused on secure, reliable, high-quality solutions that drive business impact.
Professional Experience
Lead Data Engineer, Self-Employed 04/2022 - Present
Built and scaled real-time streaming pipelines on AWS and GCP with Kafka, Pub/Sub,
Kinesis, Flink, Beam, Kafka Streams, and Spark Structured Streaming to support
fraud detection and personalization with 99.99% uptime.
Designed and deployed a data mesh spanning six domains with Avro, Protobuf,
Schema Registry, and Confluent Kafka, integrating observability with DataDog,
OpenTelemetry, Great Expectations, Soda Core, and Monte Carlo.
Created modular dbt frameworks in Python and SQL, integrated Polars for
high-performance transformations, and automated delivery with GitLab CI/CD,
GitHub Actions, and Jenkins.
Orchestrated Spark, PySpark, and Flink workloads on Kubernetes using Karpenter,
Helm, and Argo Workflows, with autoscaling and cost-aware scheduling that
reduced compute costs by 38 percent.
Championed Iceberg, Delta Lake, and Hudi for lakehouse governance, schema
evolution, and ACID analytics on S3, GCS, and BigQuery.
Standardized IaC with Terraform, Pulumi, AWS CDK, Ansible, and Docker for Airflow,
Prefect, and dbt, adopted by 12+ teams.
Deployed advanced monitoring and ML-driven anomaly detection with Prometheus,
Grafana, and Datafold, ensuring 100 percent SLA compliance.
Mentored 25 engineers across four pods, leading code reviews, performance tuning,
and architectural planning for large-scale systems in Python, Scala, Java, and Go.
Senior Data Engineer, High Tech Labs 04/2020 - 04/2022
Built multi-cloud pipelines on AWS Glue, Lambda, Redshift, and S3, plus GCP
Dataflow and BigQuery, processing 1.2 TB daily.
Architected secure Delta Lake frameworks with schema enforcement, GDPR/CCPA
data masking, tokenization, and audit-ready datasets.
Automated ETL pipelines with Glue and Python, integrating lineage tracking and
encryption at rest and in transit.
Provisioned infrastructure with AWS CDK and Terraform, speeding up
deployments across environments.
Built and maintained 50+ validation suites in Great Expectations, Soda Core, and
Monte Carlo, reducing data quality incidents by 80 percent.
Introduced Unity Catalog and Lake Formation for fine-grained access control, IAM,
and governance.
Mentored junior engineers, led design sessions, and collaborated in Agile teams to
deliver secure and compliant analytics solutions.
Data Engineer, Expedia 08/2018 - 03/2020
Refactored and scaled ETL pipelines with Python, SQL, and Airflow to ingest
transactional and event-driven data into PostgreSQL and data warehouse systems.
Migrated workflows to cloud environments with EMR, Redshift, and Snowflake,
improving scalability and cost efficiency.
Integrated APIs, CSV, JSON, and XML data sources into centralized datasets with
Airflow, cron, and custom Python logic.
Delivered Tableau and Looker dashboards for KPIs like GMROI and sell-through, and
implemented role-based access control, PCI DSS compliance, and encryption
standards.
Optimized SQL and Polars-based transformations to accelerate visualization by 60
percent, while strengthening governance with audit logging and Ranger policies.
Junior Data Engineer, Confluent 02/2016 - 07/2018
Designed and implemented custom data integration solutions using Python and
SQL, reducing processing time and improving pipeline efficiency.
Supported daily data pipeline operations with cron jobs and early Airflow
deployments, moving toward cloud adoption.
Developed secure ingestion from APIs and FTP sources into central reporting layers,
using MongoDB and Elasticsearch for fast lookups.
Tuned SQL queries through indexing and join optimization, cutting dashboard latency by 60 percent,
and introduced access control, encryption, and audit logging.
Collaborated with senior engineers to introduce containerization with Docker and
automate delivery pipelines with Jenkins and GitHub Actions.
Education
Bachelor's Degree, Stockton University