Location/Remote: 100% remote within the United States; must be willing to work Mountain Time Zone hours
Employment Type: Permanent / Direct Hire / Full-time
Compensation: up to $120k base salary (depending on experience) + 15% annual bonus
Benefits:
Medical premiums covered at 100% for employees
Dependent coverage for medical, dental, vision, life, and supplemental insurance (e.g., critical illness)
Short- and Long-Term Disability (STD/LTD)
HSA & FSA options
Unlimited PTO
Up to 12 weeks paid parental leave
401(k) with 5% company match
Position Overview
We are looking for a Data Engineer with strong expertise in building ETL pipelines using AWS Glue and other native AWS services. This role is ideal for someone passionate about automating and optimizing data workflows, with deep experience in Python, SQL, and cloud-first data architecture. You will be responsible for developing and maintaining scalable data pipelines that support business-critical analytics and operational use cases.
Key Responsibilities
Design, develop, and maintain ETL pipelines using AWS Glue for batch processing of structured and semi-structured data (a minimal sketch follows this list).
Build event-driven workflows using AWS Lambda, Step Functions, and Amazon EventBridge (formerly CloudWatch Events).
Leverage Amazon S3 for data lake storage and ingestion, implementing optimized partitioning and lifecycle policies.
Use Amazon Redshift and Athena for querying and transforming large datasets.
Manage access and infrastructure securely using IAM roles and policies.
Monitor data workflows and job health using CloudWatch, CloudTrail, and custom alarms.
Develop reusable Python modules to automate and streamline data processing tasks.
Perform schema design, table creation, and stored procedure development in Redshift and PostgreSQL.
Support data migrations and schema changes across Redshift clusters and other database systems.
Collaborate closely with analytics and product teams to deliver high-quality, reliable data solutions.
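To give candidates a feel for the day-to-day work, below is a minimal sketch of the kind of Glue job these responsibilities describe: a PySpark-based job that reads raw JSON from S3, keeps the fields downstream analytics need, and writes date-partitioned Parquet back to the data lake. All bucket paths, field names, and the partition key are hypothetical placeholders, not details of our actual pipelines.

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job setup: resolve the job name and initialize contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw JSON landed in the ingestion prefix of the data lake.
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-data-lake/raw/events/"]},  # hypothetical path
    format="json",
)

# Keep only the fields downstream analytics need.
cleaned = raw.select_fields(["event_id", "event_type", "event_date", "payload"])

# Write Parquet partitioned by event_date so Athena and Redshift Spectrum
# can prune partitions at query time.
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={
        "path": "s3://example-data-lake/curated/events/",  # hypothetical path
        "partitionKeys": ["event_date"],
    },
    format="parquet",
)

job.commit()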
Required Skills & Experience
5+ years of experience developing with Python for ETL workflows, automation, and scripting
5+ years of experience working with SQL, including query optimization, stored procedure development, and data modeling
4+ years of hands-on experience working with AWS services including AWS Glue, Lambda, Step Functions, S3, Redshift, Athena, CloudWatch, and IAM
Strong understanding of building, maintaining, and optimizing data pipelines end-to-end
Experience working with semi-structured data (e.g., JSON, XML) and transforming it into structured formats (see the example after this list)
Comfortable working in fast-paced environments, owning pipelines from design to deployment
Proficient with version control tools such as Git or SVN
Excellent communication and collaboration skills, with a strong focus on data quality and reliability
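As a purely illustrative example of the semi-structured-to-structured work mentioned above, the following sketch flattens nested JSON claim records into a tabular form with pandas; every record and field name here is hypothetical.

import pandas as pd

# Hypothetical nested claim records of the sort that arrive as raw JSON.
raw_records = [
    {
        "claim_id": "c-001",
        "member": {"id": "m-42", "state": "CO"},
        "lines": [
            {"code": "99213", "amount": 125.00},
            {"code": "87880", "amount": 35.50},
        ],
    },
]

# json_normalize explodes the repeated claim lines into one row each and
# carries the claim- and member-level fields along as extra columns.
df = pd.json_normalize(
    raw_records,
    record_path="lines",
    meta=["claim_id", ["member", "id"], ["member", "state"]],
)
print(df)
# Resulting columns: code, amount, claim_id, member.id, member.state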
Preferred Qualifications
Experience with healthcare data (e.g., HL7, medical claims, Rx claims)
Experience with Apache Airflow or Amazon MWAA (see the sketch after this list)
Familiarity with building custom data ingestion tools using Pandas or similar Python libraries
Exposure to telemetry data or real-time event pipelines
Experience in gaming, media, or other high-volume data environments
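For candidates less familiar with MWAA, a minimal Airflow DAG that orchestrates a Glue job might look like the sketch below. The DAG id, Glue job name, and region are hypothetical; the operator comes from the apache-airflow-providers-amazon package.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="daily_events_etl",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Kick off the (hypothetical) Glue job that curates the day's events.
    run_glue_job = GlueJobOperator(
        task_id="run_events_glue_job",
        job_name="events-curation-job",  # hypothetical Glue job name
        region_name="us-west-2",
    )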
Education & Certifications
Bachelor’s degree in Computer Science, Engineering, Data Science, or a related technical field (or equivalent hands-on experience)
Relevant industry certifications in AWS, data engineering, or cloud technologies are a plus