Job Title: Senior Data Engineer (AWS & Confluent Data/AI Projects)
Work Set-up: Remote/WFH (open to Filipino citizens or permanent visa holders only)
Shift Schedule: 10 am-6 pm SGT/PHT
Required Qualifications:
Bachelor's degree in Computer Science or a related quantitative field.
At least 3 years of experience in data engineering, with a significant focus on cloud-based solutions.
Strong expertise in AWS data services (S3, Glue, EMR, Redshift, Kinesis, Lambda, etc.).
Extensive hands-on experience with Confluent Platform/Apache Kafka for building real-time data streaming applications.
Proficiency in programming languages such as Python, PySpark, Scala, or Java.
Must hold at least one AWS certification.
Expertise in SQL and experience with various database systems (relational and NoSQL).
Solid understanding of data warehousing, data lakes, and data modeling concepts (star schema, snowflake schema, etc.).
Experience with CI/CD pipelines and DevOps practices (Git, Terraform, Jenkins, Azure DevOps, or similar).
Must have a working laptop.
Preferred Qualifications (Nice to Have):
AWS Certifications (e.g., AWS Certified Data Analytics - Specialty, AWS Certified Solutions Architect - Associate/Professional).
Experience with other streaming technologies (e.g., Flink).
Knowledge of containerization technologies (Docker, Kubernetes).
Familiarity with Data Mesh or Data Fabric concepts.
Experience with data visualization tools (e.g., Tableau, Power BI, QuickSight).
Understanding of MLOps principles and tools.
Responsibilities:
Architect and Design Data Solutions: Lead the design and architecture of scalable, secure, and efficient data pipelines for both batch and real-time data processing on AWS. This includes data ingestion, transformation, storage, and consumption layers.
Confluent Kafka Expertise: Design, implement, and optimize highly performant and reliable data streaming solutions using Confluent Platform (Kafka, ksqlDB, Kafka Connect, Schema Registry). Ensure efficient data flow for real-time analytics and AI applications.
AWS Cloud Native Development: Develop and deploy data solutions leveraging a wide range of AWS services, including but not limited to:
Data Storage: S3 (Data Lake), RDS, DynamoDB, Redshift, Lake Formation.
Data Processing: Glue, EMR (Spark), Lambda, Kinesis, MSK (for Kafka integration).
Orchestration: AWS Step Functions, Airflow (on EC2 or MWAA).
Analytics & ML: Athena, QuickSight, SageMaker (for MLOps integration).