Python Data Engineer

Company:
FactEntry
Location:
Vellore, Tamil Nadu, India
Posted:
April 08, 2024

Description:

Python Data Engineer Job Description

Experience: 4-6 Years

About The Role

We are seeking a highly skilled Python Data Engineer to join our growing team. In this role, you will play a critical part in building, maintaining, and scaling our data infrastructure. You will be responsible for designing and implementing data pipelines, ensuring data quality, and collaborating with data scientists and analysts to unlock valuable insights from our data.

Responsibilities

Design, develop, and maintain data pipelines using Python, Apache Spark, and related libraries (an illustrative sketch follows this list).

Architect and implement scalable data infrastructure solutions.

Extract, transform, and load (ETL) data from various sources including databases, APIs, and cloud storage.

Optimize data pipelines for performance and efficiency.

Cleanse and validate data to ensure accuracy and consistency.

Build and maintain data warehouses and data lakes using cloud platforms (AWS, Azure, GCP).

Partner with data scientists and business stakeholders to define data requirements.

Develop and implement data quality checks and monitoring processes.

Design and schedule data workflows using Apache Airflow or similar orchestration tools.

Collaborate with data scientists and analysts to understand data requirements and build solutions.

Document data pipelines and processes for maintainability and knowledge sharing.

Stay up-to-date with the latest trends and technologies in the data engineering field.

Identify and implement new data technologies for the team.

Advocate for best practices and contribute to the data engineering roadmap.
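For illustration only, the sketch below shows the kind of pipeline work described above: extracting raw records with PySpark, applying a basic cleansing and data quality check, and loading the curated output. All paths, column names, and thresholds are hypothetical placeholders, not a description of FactEntry's actual stack.

    # Illustrative sketch: a minimal PySpark ETL step with a simple data quality check.
    # Paths, columns, and the 95% threshold are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example_etl").getOrCreate()

    # Extract: read raw records from a (hypothetical) source location.
    raw = spark.read.json("s3://example-bucket/raw/events/")

    # Transform: keep well-formed rows and normalise a few fields.
    clean = (
        raw
        .filter(F.col("event_id").isNotNull())           # drop rows missing a key
        .withColumn("event_date", F.to_date("event_ts"))  # derive a partition column
        .dropDuplicates(["event_id"])                      # basic de-duplication
    )

    # Data quality check: fail the run if too many rows were rejected.
    total, kept = raw.count(), clean.count()
    if total > 0 and kept / total < 0.95:
        raise ValueError(f"Data quality check failed: only {kept}/{total} rows passed")

    # Load: write the validated data, partitioned by date.
    clean.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/curated/events/"
    )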

Required Skills

4-6 years of experience as a data engineer or in a related role.

Proficiency in Python programming, including data structures, algorithms, and object-oriented programming.

Strong knowledge of SQL and experience working with relational databases.

Experience with Apache Spark, PySpark, and distributed data processing frameworks.

Familiarity with cloud platforms (AWS, Azure, GCP) for data storage and processing.

Experience with data pipeline orchestration tools like Apache Airflow (a minimal orchestration sketch follows this list).
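As a sketch of the orchestration experience listed above, the snippet below defines a minimal Airflow DAG that schedules one daily ETL task. The DAG id, schedule, and callable are hypothetical examples; the real workflows in this role will differ.

    # Illustrative sketch: a minimal Airflow 2.x DAG with one daily Python task.
    # DAG id, schedule, and the callable are hypothetical placeholders.
    # (Newer Airflow releases use the `schedule` argument instead of `schedule_interval`.)
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_daily_etl(**context):
        # Placeholder for the actual extract/transform/load logic.
        print("Running ETL for", context["ds"])

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        etl_task = PythonOperator(
            task_id="run_daily_etl",
            python_callable=run_daily_etl,
        )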

Desired Skills

Experience with data warehousing and data lake technologies.

Familiarity with data visualization tools (e.g., Tableau, Power BI).

Experience with data versioning and management tools (e.g., Git).

Excellent communication and collaboration skills.

Strong problem-solving and analytical skills.

Ability to work independently and as part of a team.

Full time
