Description
We are looking for a skilled Data Engineer to join our Digital Customer Solutions team. The ideal candidate should have experience in cloud computing and big data technologies. As a Data Engineer, you will be responsible for designing, building, and maintaining scalable data solutions that can handle large volumes of data. You will work closely with stakeholders to ensure that the data is accurate, reliable, and easily accessible.
Responsibilities
Design, build, and maintain scalable data pipelines that can handle large volumes of data.
Document the design of proposed solutions, including structuring data (data modelling using techniques such as 3NF and dimensional modelling) and optimising data for downstream consumption, working closely with Data Visualization Engineers, Front-end Developers, Data Scientists, and ML Engineers.
Develop and maintain ETL processes to extract data from various sources, including sensor, semi-structured, and unstructured data, as well as structured data stored in traditional databases or file stores, or exposed via SOAP and REST interfaces.
Develop data integration patterns for batch and streaming processes, including implementation of incremental loads.
Build quick prototypes and proofs of concept to validate assumptions and demonstrate the value of proposed solutions or new cloud-based services.
Define data engineering standards and develop data ingestion/integration frameworks.
Participate in code reviews and ensure all solutions are aligned with architectural and requirement specifications.
Develop and maintain cloud-based infrastructure to support data processing using Azure Data Services (ADF, ADLS, Synapse, Azure SQL DB, Cosmos DB).
Develop and maintain automated data quality pipelines.
Collaborate with cross-functional teams to identify opportunities for process improvement.
Manage a team of Data Engineers.
What You Will Bring
Technical and Industry Experience:
Bachelor's degree in Computer Science or related field.
7+ years of experience with big data technologies such as Hadoop, Spark, Hive, and Delta Lake.
7+ years of experience with cloud computing platforms such as Azure, AWS, or GCP.
Experience working with cloud data platforms, including a deep understanding of scaled data solutions.
Experience working with different data integration patterns (batch and streaming) and implementing incremental data loads.
Proficient in scripting with Java, Windows batch, and PowerShell.
Proficient in at least one programming language such as Python or Scala.
Expert in SQL.
Proficient in working with data services such as ADLS, Azure SQL DB, Azure Synapse, Snowflake, NoSQL stores (e.g. Cosmos DB, MongoDB), Azure Data Factory, and Databricks, or their equivalents on AWS/GCP.
Experience with ETL tools (such as Informatica IICS Data Integration) is an advantage.
Strong understanding of Data Quality principles and experience implementing them.
Skills: code reviews, Azure, Python, scripting, automated data quality pipelines, MongoDB, Spark, ETL tools, Databricks, architectural and requirement specifications, ADLS, incremental loads, data integration, Azure Synapse, PowerShell, cloud computing, data ingestion/integration frameworks, Cosmos DB, Scala, Delta Lake, AWS, Hive, big data technologies, Informatica IICS Data Integration, data engineering standards, scaled data solutions, Azure Data Factory, team management, Azure Data Services, GCP, Azure SQL DB, Snowflake, Data Quality principles, Hadoop, ETL processes, batch and streaming processes, cloud-based infrastructure, Windows, prototypes, cloud-based services, data services, SQL, NoSQL, data modelling, process improvement, Java
Full time