Role Description
This role focuses on building and maintaining data pipelines and analytics infrastructure on AWS. You will work daily with S3, Glue, Redshift, Athena, Lake Formation, Airflow, SNS/SQS, and Postgres to make high-quality data available to analytics and ML teams.
Please note that the working hours for this role are 5 PM to 2 AM IST.
Develop and maintain ETL/ELT jobs using AWS Glue and SQL/Python.
Help manage an S3-based data lake, organizing data for efficient querying via Athena and Redshift.
Build, schedule, and monitor data workflows using Apache Airflow (or a similar tool).
Apply Lake Formation policies to secure and govern data access.
Work with PostgreSQL (Postgres) for operational and analytical use cases as needed.
Implement SNS/SQS-based notifications and event-driven flows within pipelines.
Collaborate with analytics and ML teams to understand data needs and deliver robust datasets.
Contribute to code reviews, documentation, and ongoing data quality checks.
Qualifications
2+ years of experience as a Data Engineer or in a similar data-focused role.
Hands-on experience with AWS data tools such as S3, Glue, Redshift, Athena, and Lake Formation.
Experience scheduling and managing pipelines with Airflow (or equivalent orchestration tool).
Solid SQL skills and familiarity with PostgreSQL (Postgres).
Understanding of data modeling, partitioning, and performance optimization.
Comfort working in a fully remote environment with Git-based workflows and CI/CD.
Nice to have:
Experience with data quality frameworks or monitoring tools.
Exposure to BI tools or ML/analytics workflows.