AWS Cloud and Spark Architect

Company:
Lovefreedom Solution
Location:
Bloomfield, CT
Posted:
August 25, 2025

Description:

Primary Job Title: Lead AWS Cloud & Apache Spark Architect

Industry & Sector: Enterprise Cloud Data Engineering and Big Data Analytics. We design, deploy and operate high-scale AWS-native data platforms and analytics pipelines for enterprise customers—supporting batch and real-time ML/BI workloads across finance, healthcare, and adtech. This is an onsite U.S. role focused on architecting secure, cost-efficient Spark-based processing at scale.

Role & Responsibilities

Architect and deliver AWS-native big data platforms and data lake solutions using S3, EMR, Glue, Redshift and EKS—designing for performance, scale and resiliency.

Lead migration efforts from on-prem Hadoop/Cloudera ecosystems to AWS (EMR/EKS/Glue), defining cutover strategies, data validation, and rollback plans.

Optimize Apache Spark (PySpark/Scala) jobs and clusters for throughput, latency and cost—tuning shuffle, partitioning, memory/executor settings and job scheduling.

Implement IaC and production-grade CI/CD for data pipelines using Terraform/CloudFormation and CI tooling (Jenkins, GitLab CI), including automated testing and deployment safeguards.

Define and enforce security, governance and networking best practices (IAM, VPC design, encryption, data lineage, access controls) for enterprise workloads.

Mentor engineering teams, run architecture reviews, establish operational runbooks, and drive capacity planning and observability standards.

Skills & Qualifications

Must-Have: 7+ years of hands-on AWS experience (EMR, S3, Glue, Redshift, EC2) and deep Apache Spark expertise (PySpark and/or Scala), including production performance tuning and debugging.

Must-Have: Proven track record of migrating on-prem Hadoop or legacy ETL to AWS and operating Spark on EMR/EKS at enterprise scale.

Must-Have: Strong IaC & CI/CD skills (Terraform/CloudFormation, Jenkins/GitLab/GitHub Actions), containerization (Docker) and Kubernetes/EKS experience.

Preferred: Experience with streaming (Kafka/Kinesis), Spark Structured Streaming, Delta Lake or Iceberg, and event-driven architectures.

Preferred: Solid understanding of security & compliance (IAM, encryption, SOC2/HIPAA awareness), VPC/networking and observability tooling (CloudWatch, Prometheus, Grafana).

Preferred: Bachelor’s/Master’s in CS or a related field and a prior leadership or architect role on enterprise data platform projects.

Benefits & Culture Highlights

On-site U.S. role with ownership of high-impact modernization projects and visible cross-functional influence.

Engineering-first culture that values mentorship, technical excellence, and measurable business outcomes.

Learning & development support—conferences, certifications, and hands-on opportunities to build large-scale, production data systems.

Location & Work Type: United States — Onsite (candidate must be based in or willing to relocate to the U.S. and work from the office).

Keywords: AWS, Apache Spark, EMR, PySpark, Scala, Terraform, EKS, Kafka, Glue, Redshift, S3, data-lake, streaming, performance tuning, migration, IaC, CI/CD, security, observability.
