Role: Data Engineer
Experience: 3-5 yrs
Notice Period: Immediate
Work Mode: Hybrid
Job Description
Requirements:
• Proficient in Python (including popular packages such as Pandas and NumPy) and SQL
• Strong background in distributed data processing and storage (e.g., Apache Spark, Hadoop)
• Large-scale data engineering skills (terabytes of data): data modeling and building production-ready ETL pipelines
• Development experience with at least one cloud platform (Azure strongly preferred; AWS or GCP also considered)
• Knowledge of data lake and data lakehouse patterns
• Knowledge of ETL performance tuning and cost optimization
• Knowledge of data structures and algorithms, and sound software engineering practices
Nice to Have:
• Experience with Azure Databricks
• Knowledge of DevOps practices for ETL pipeline orchestration
o Tools: GitHub Actions, Terraform, Databricks Workflows, Azure DevOps
• Certifications in Databricks, Azure, AWS, or GCP highly preferred
• Knowledge of code version control (e.g., Git)