Data Engineering (Python, PySpark)

Company:
Virtusa Corporation
Location:
Chennai, Tamil Nadu, India
Posted:
April 25, 2024

Description:

Experience in the pharma domain is preferred.

5 to 8 years of experience with PySpark and AWS Glue, with the ability to configure data pipelines.

Technical skills: AWS, SQL, Python, PySpark, Hive, Unix, ETL, and Control-M (or similar).

Data extraction from Teradata, Hive, Oracle, and flat files.

Ability to work independently on specialized assignments within the context of project deliverables.

Take ownership of providing solutions and tools that iteratively increase engineering efficiency.

Designs should help embed standard processes, systems, and operational models into the BAU (business-as-usual) approach for end-to-end execution of data pipelines.

Proven problem-solving and analytical abilities, including the ability to critically evaluate information gathered from multiple sources, reconcile conflicts, decompose high-level information into details, and apply sound business and technical domain knowledge.

Communicate openly and honestly.

Advanced oral, written, and visual communication and presentation skills; the ability to communicate efficiently at a global level is paramount.

Ability to deliver materials of the highest quality to management against tight deadlines.

Ability to work effectively under pressure with competing and rapidly changing priorities.

Excellent communication skills; team player.

Roles and responsibilities of a PySpark developer include:

Developing unit tests for all the Spark transformations and helper methods

Building Spark jobs for data aggregation and transformation

Designing and developing data processing pipelines

Collaborating with team members to determine best practices and client requirements for software

Developing intuitive software that meets or exceeds the needs of the company

Schedule: Full-time

Travel: No
