We are seeking a talented and detail-oriented Data Engineer to join our data team. In this role, you will design, build, and maintain reliable data pipelines and infrastructure to support analytics, machine learning, and business intelligence efforts. You’ll work closely with data analysts, data scientists, and software engineers to ensure high-quality, accessible, and well-organized data.
Key Responsibilities:
Design and develop scalable, robust ETL/ELT pipelines (a minimal sketch appears after this list).
Integrate data from a wide variety of sources (APIs, databases, third-party platforms).
Develop and optimize data models and storage solutions (e.g., data lakes, warehouses).
Ensure data integrity, security, and compliance across systems.
Automate data workflows to support real-time and batch processing.
Monitor and troubleshoot data pipelines, ensuring minimal downtime.
Collaborate with stakeholders to understand data requirements and deliver solutions.
Maintain clear documentation of data structures, pipelines, and processes.
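To make the first responsibility concrete, here is a minimal extract-transform-load sketch in Python. An in-memory SQLite database stands in for a production warehouse; the table name and record fields are illustrative only, not part of our actual stack.

```python
# A minimal ETL sketch. SQLite stands in for a real warehouse such as
# Snowflake or BigQuery; all field and table names are illustrative.
import sqlite3


def extract() -> list[dict]:
    # In production this would call an API or query a source system.
    return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": -1.0}]


def transform(rows: list[dict]) -> list[dict]:
    # Example data-quality rule: drop records with non-positive amounts.
    return [r for r in rows if r["amount"] > 0]


def load(rows: list[dict], conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (:order_id, :amount)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # prints 1
    conn.close()
```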
Qualifications:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Strong programming skills in Python and SQL; Scala or Java is a plus.
Hands-on experience with data pipeline and orchestration tools such as Apache Airflow (see the DAG sketch after this list), dbt, or Luigi.
Familiarity with cloud platforms (AWS, GCP, Azure) and their data services (e.g., S3, BigQuery, Redshift), as well as cloud warehouses such as Snowflake.
Solid understanding of data modeling, data warehousing, and distributed systems.
Knowledge of big data tools such as Spark, Kafka, and Hadoop is a plus.
Experience with version control systems (e.g., Git) and CI/CD workflows.
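For reference, this is roughly what a daily pipeline looks like in Apache Airflow (2.x, TaskFlow API), one of the orchestration tools listed above. The DAG name, schedule, and records are hypothetical placeholders rather than a description of our production setup.

```python
# A minimal daily ETL DAG using Airflow 2.x's TaskFlow API.
# DAG name, schedule, and data are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_orders_etl():
    @task
    def extract() -> list[dict]:
        # A real task would pull from an API or a source database.
        return [{"order_id": 1, "amount": 42.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Keep only records that pass a simple quality check.
        return [r for r in rows if r["amount"] > 0]

    @task
    def load(rows: list[dict]) -> None:
        # A real task would write to a warehouse (e.g., Redshift, BigQuery).
        print(f"loading {len(rows)} rows")

    load(transform(extract()))


daily_orders_etl()
```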
Preferred Qualifications:
Experience working with real-time data streams and event-driven architectures (a consumer sketch appears after this list).
Familiarity with infrastructure-as-code tools (Terraform, CloudFormation).
Knowledge of data governance, privacy, and security best practices.
Industry certifications (e.g., AWS Certified Data Engineer, Google Cloud Professional Data Engineer) are a plus.
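To illustrate the real-time side of the role, here is a minimal event consumer built on the kafka-python client; the topic name, broker address, and event fields are hypothetical.

```python
# A minimal Kafka consumer sketch (kafka-python client). Topic, broker,
# and event schema are hypothetical placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                            # hypothetical topic
    bootstrap_servers="localhost:9092",  # hypothetical broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

# An event-driven pipeline would validate, enrich, and route each event;
# here we simply print the fields we expect.
for message in consumer:
    event = message.value
    print(event.get("order_id"), event.get("amount"))
```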