
Lead ETL Developer/Data Engineer

Location:
San Antonio, TX
Salary:
65/hr on C2C
Posted:
January 08, 2024


Resume:

Hariprasad Puli

Lead ETL Developer/Data Engineer

Email: ad2j20@r.postjobfree.com | Phone: 609-***-****

PROFESSIONAL SUMMARY:

• Over 13 years of data warehousing experience managing and delivering projects using ETL tools such as DataStage and Informatica PowerCenter.

• Extensive experience developing, executing, and managing projects within the defined scope and timeline.

• Proficient in working with Agile and Scrum methodologies, ensuring efficient project execution and collaboration within cross-functional teams.

• Proficient in understanding business processes and functional requirements, translating them into technical requirements for successful implementation.

• Developed jobs in DBT (Data Build Tool) and configured GitLab CI/CD pipelines for efficient data transformation and deployment (see the CI wrapper sketch after this summary).

• Migrated DataStage code logic into DBT Cloud for improved data transformation and management; also began analysis for migrating DataStage code to IBM Cloud Pak.

• Hands-on experience designing and developing IBM DataStage 11.7 jobs, including extracting, transforming, integrating, and loading data into data warehouses and data marts.

• Provided data models and data maps for data marts, supporting data aggregation efforts and ensuring effective data analysis.

• Extensive use of job sequences in DataStage to control job flow, incorporating various triggers (conditional and unconditional).

• Experienced in designing and developing ETL workflows in Informatica, utilizing pushdown optimization techniques and performance tuning.

• Migrated SAS code logic into ETL tools (DataStage) to leverage their capabilities for data integration and processing.

• Proficient in Unix commands and scripting, as well as Python scripting, for automating tasks and enhancing data processing workflows.

• Expertise in scheduling DataStage jobs and Unix scripts using scheduling tools like BMC Control-M, ensuring efficient job execution and monitoring.

• Exposure to Big Data (Hadoop) concepts, providing familiarity with modern data warehouse technologies.

• Effective communicator with a highly creative and motivated approach, aiming to achieve results and identify opportunities for improvement.

• Combines proven investigative skills with an analytical mindset, identifying opportunities and devising strategies to realize them.

• Reports team activities and progress towards goals to key stakeholders, ensuring effective communication and transparency.
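
As a concrete illustration of the DBT-plus-GitLab CI/CD work mentioned above, the following is a minimal sketch of a wrapper script a CI job could invoke to build and test a dbt project. The target name, environment variable, and project layout are assumptions for illustration, not details taken from this resume.

#!/usr/bin/env python3
"""Minimal CI wrapper for a dbt project (illustrative sketch only).

Assumes the dbt CLI is installed in the CI image and that a profiles.yml
pointing at the warehouse is supplied to the job (e.g. via DBT_PROFILES_DIR).
"""
import os
import subprocess
import sys

def run(cmd):
    # Echo the command, then fail the CI job if dbt returns a non-zero exit code.
    print("+ " + " ".join(cmd), flush=True)
    subprocess.run(cmd, check=True)

def main():
    target = os.environ.get("DBT_TARGET", "ci")   # hypothetical target name
    run(["dbt", "deps"])                          # install package dependencies
    run(["dbt", "run", "--target", target])       # build models in the CI schema
    run(["dbt", "test", "--target", target])      # run schema and data tests

if __name__ == "__main__":
    try:
        main()
    except subprocess.CalledProcessError as exc:
        sys.exit(exc.returncode)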

TECHNICAL SKILLS:

ETL Tools: IBM Infosphere DataStage 11.7, Informatica, DBT

Data Warehouse: Snowflake Cloud

Version Control Tools: GitHub, IBM UCD

Development Methodologies: Agile, Waterfall

Scheduling Tools: Control-M

Scripting Languages: Unix Shell, Python

Databases: Netezza, DB2, SQL Server, Oracle

Cloud Technologies: AWS EC2, S3, Snowflake, IBM Cloud Pak

Operating Systems: Linux, UNIX, Windows 10 Server

Data Profiling and Governance Tools: Information Analyzer (IA), IGC (Information Governance Catalog)

EDUCATION:

• M.C.A., Matrusri Institute of P.G. Studies, Osmania University, India, 2006

PROFESSIONAL EXPERIENCE:

Client: USAA/TCS, San Antonio, TX Nov 2019 – Present

Role: Lead ETL Developer/Data Engineer

Bank credit risk databases are designed to be sources of integrated data pulled from multiple disparate systems for operational reporting and processing needs. These data sources include servicing and origination systems. As such, the size of these databases could grow out of control without proper purge controls in place. The purpose of these databases is to provide operational processing and reporting capabilities for integrated data. Most operational and regulatory reporting and processing requires fairly recent data, so these limitations are not too restrictive. In the rare cases where older data is required, it has to be gathered and merged outside of these databases.

Responsibilities:

• Review of Functional and Non-functional Requirements: Participated in the review process of both functional and non-functional requirements for projects.

• Snowflake SQL Development: Wrote complex SnowSQL queries in the Snowflake cloud data warehouse to support business analysis and reporting.

• Design Discussion and Requirement Gathering: Played a key role in gathering requirements and identifying any gaps before designing DataStage jobs. Participated in design discussions to ensure the system meets the specified requirements.

• Agile Development: Designed and coded components using agile methodologies, which typically involve iterative development and collaboration within a cross-functional team.

• Review and Analysis of System Specifications: Reviewed and analyzed detailed system specifications related to DataStage ETL and other associated applications to ensure they align with business requirements.

• Impact Evaluation: Assessed the impact of proposed changes on existing DataStage ETL applications, processes, and configurations.

• Agile Ceremonies: Attended agile ceremonies, such as daily stand-ups, sprint planning, sprint reviews, and retrospectives, to collaborate with the development team.

• Data Extraction and Loading with DataStage: Utilized IBM Infosphere DataStage as an ETL tool to extract data from source systems and load it into a SQL Server database.

• Data Warehousing Principles: Demonstrated a strong understanding of data warehousing principles, including the use of fact tables, dimension tables, and star/snowflake schema modelling.

• Data Ingestion with Snowpipe: Utilized Snowpipe, a continuous data ingestion service, for loading data from an AWS S3 bucket into the target system.

• Zero-Copy Cloning: Implemented clone objects to facilitate zero-copy cloning, a technique for duplicating tables or databases without consuming additional storage up front (see the Snowflake sketch after this list, which also covers Snowpipe).

• Migration of SAS Code to ETL (DataStage): Converted SAS code logics into ETL processes using IBM Infosphere DataStage.

• Cloud Data Migration: Migrated on-premises data to the cloud using tools like DBT (Data Build Tool) and Snowflake; also began analysis for migrating DataStage code to IBM Cloud Pak.

• DataStage Environment and Configuration Management: Managed DataStage environment variables and entries in configuration files to ensure proper functioning of the system.

• Job Scheduling: Scheduled DataStage and DBT jobs using Control-M, a workload automation tool.

• Unix Scripting: Built Unix scripts based on project requirements to automate tasks or perform specific operations.

• Code Versioning and Collaboration: Utilized GitLab for the code check-in process, allowing version control and collaboration among team members.
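
The Snowpipe and zero-copy cloning items above can be illustrated with a short sketch using the snowflake-connector-python client. All object names (warehouse, database, stage, pipe, tables) are hypothetical, and credentials are assumed to come from environment variables rather than being details from this project.

"""Illustrative sketch of continuous ingestion (Snowpipe) and zero-copy cloning.
Object names are assumptions; credentials are read from the environment."""
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",          # hypothetical warehouse
    database="CREDIT_RISK_DB",   # hypothetical database
    schema="STAGING",
)
cur = conn.cursor()

# Continuous ingestion: a pipe that auto-loads files landing in an S3 external stage.
cur.execute("""
    CREATE PIPE IF NOT EXISTS loan_events_pipe
      AUTO_INGEST = TRUE
      AS COPY INTO loan_events
         FROM @s3_loan_stage
         FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
""")

# Zero-copy clone: a point-in-time copy of the table for testing; storage is shared
# with the source until the clone diverges. OR REPLACE recreates it on each run.
cur.execute("CREATE OR REPLACE TABLE loan_events_qa CLONE loan_events")

cur.close()
conn.close()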

Environment: DBT, AWS S3, EC2, Snowflake, JIRA, IBM Infosphere DataStage 11.7, DB2, Control-M, Python scripting, UNIX scripting, GitLab, IBM UCD.

Client: USAA/TCS, Mexico Jan 2016 – Oct 2019

Role: ETL Developer

Bank Data M&P: The objective of this project is to check customer eligibility for credit cards, loans, and many other products. Data is extracted from different source systems, such as flat files and a DB2 database, and loaded into target systems after transformation. The reporting tools provide on-demand reports to business users, who use the information to study service usage patterns, design more competitive, cost-effective services, and improve customer service. Maintenance work spans several types, including corrective, preventive, small modifications, and incidents.

Responsibilities:

• Workload Management: Managed assignment and workload distribution among team members to ensure project milestones and delivery goals are met.

• Big Room Planning: Participated in Big Room Planning sessions for Program Increment Planning, which involves collaborative planning and alignment of multiple teams working on a program or project.

• Data Extraction and Loading with DataStage: Utilized IBM Infosphere DataStage as an ETL tool to extract data from source systems and load it into a SQL Server database.

• Data Warehousing Principles: Demonstrated a strong understanding of data warehousing principles, including the use of fact tables, dimension tables, and star/snowflake schema modelling.

• Mapping and Transformation: Created mappings in DataStage using various transformations such as Source Qualifier, Joiner, Aggregator, Expression, Filter, Router, Lookup, Update Strategy, and Sequence Generator.

• Transformation Logic: Implemented various transformation logics using DataStage transformations such as Source Qualifier, Expression, Filter, Joiner, Lookup, Router, and Update Strategy.

• Database Configuration: Configured database connections in DataStage for Oracle, Netezza, and SQL Server.

• Migration of DataStage Jobs: Worked on migrating DataStage jobs from DB2 database connections to Netezza database connections.

• Migration of SAS Code to ETL (DataStage): Converted SAS code logics into ETL processes using IBM Infosphere DataStage.

• Used Snowpipe for continuous data ingestion from the S3 bucket.

• Created clone objects to maintain zero-copy cloning.

• Code Migration: Deployed DataStage code into production using the RTC tool.

• Interacting with Business Users and Analysts: Collaborated with business users and analysts to identify Critical Data Elements (CDEs) within tables and understand business requirements.

• Data Profiling and Analysis: Conducted data profiling activities, including column analysis, rule analysis, primary key analysis, natural key analysis, and foreign-key analysis using tools like IBM Information Analyzer (a pandas-based illustration of this kind of column profiling appears after this list).

• Information Analyzer Configuration: Configured projects, added data sources, wrote, configured, and executed rules or rulesets within IBM Information Analyzer for data analysis and quality assessment.

• Rule Scheduling and Monitoring: Provided scheduling details for rule sets to the IT team using Control-M tool, monitored job execution, and performed remediation in case of failures.

• Linking Technical Terms to Business Terms: Used Information Governance Catalogue (IGC) to link technical terms and concepts to business terms, facilitating understanding and alignment between technical and business stakeholders.

• Coordination and Collaboration: Collaborated with offshore and onshore teams, attended regular hand-off meetings, and shared requirements to ensure effective communication and coordination.

• Mentorship and Guidance: Assigned mentors or experienced team members to guide and support new team members, encouraging regular communication and feedback exchanges.
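
Information Analyzer performs column analysis through its own interface; purely as an illustration of the checks described above (null counts, distinct counts, primary-key candidates), here is a small pandas stand-in, not the tool itself. The file name and its columns are hypothetical.

"""Pandas stand-in (not IBM Information Analyzer) showing basic column analysis:
null counts, distinct counts, and a simple primary-key candidate check."""
import pandas as pd

df = pd.read_csv("customer_extract.csv")   # hypothetical source extract

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_count": df.isna().sum(),
    "null_pct": (df.isna().mean() * 100).round(2),
    "distinct_count": df.nunique(),
})
# A column is a primary-key candidate if it is fully populated and unique.
profile["pk_candidate"] = (profile["null_count"] == 0) & (profile["distinct_count"] == len(df))

print(profile.sort_values("null_pct", ascending=False))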

Environment: IBM Infosphere DataStage 9x, SAS, Unix Scripting, IBM Information Analyzer, Information Governance Catalogue, Netezza, SQL Server, Control-M tool, RTC (Rational Team Concert) tool.

Client: USAA/TCS, India Aug 2010 – Dec 2015

Role: ETL Developer

The goal of the Enterprise Governance Risk and Compliance (EGRC) support team is to strategically provide, enhance, and sustain a suite of leading IT solutions. It supports USAA's adherence to state regulations by enabling the business to harness the power of technology to streamline processes, strengthen risk mitigation, and ensure member confidence. EGRC is a collection of high-performance applications, including MetricStream (MS), which tracks and manages USAA member complaints, and Audit Control Language (ACL), which helps USAA financial and audit professionals with data extraction and analysis.

Responsibilities:

• Business and System Requirements Analysis: Analyzed business and system requirements to understand the project scope and objectives.

• Low-Level Design and Programming Specifications: Designed Low-Level Documents (LLD) and programming specifications based on the requirements, providing detailed guidance for implementation.

• ETL Tools: Extensively used IBM Infosphere DataStage 9x and Informatica 9x as ETL tools for data integration and transformation processes.

• Data Warehousing Principles: Demonstrated a strong understanding of data warehousing principles, including the use of fact tables, dimension tables, and star/snowflake schema modeling.

• Mapping and Transformation: Created mappings using various transformations such as Source Qualifier, Joiner, Aggregator, Expression, Filter, Router, Lookup, Update Strategy, and Sequence Generator.

• Transformation Logic: Implemented various transformation logics using transformations like Source Qualifier, Expression, Filter, Joiner, Lookup, Router, and Update Strategy.

• Development and Scheduling: Developed and scheduled mappings and sessions within ETL tools, such as Informatica, for executing complex business rules.

• Deployment: Deployed Informatica objects into different environments, including TEST, UAT, and Production, ensuring proper configuration and functionality.

• Development of Parallel Jobs: Developed parallel jobs in ETL tools, utilizing different processing stages like Transformer, Aggregator, Lookup, Join, Sort, Copy, Merge, Funnel, Change Apply, and Filter.

• Coordination with Onshore Team: Collaborated with the onshore team to gather requirements, seek clarifications, and ensure alignment between teams.

• Impact Analysis: Conducted impact analysis based on system requirements, assessing the potential effects on existing processes, systems, and data.

• Unix Scripting: Built Unix scripts based on project requirements to automate tasks, perform data operations, or integrate with the ETL process.

• ETL Framework Configuration: Configured the ETL framework to run DataStage jobs efficiently, optimizing performance and resource utilization.

• Code Version Control and Defect Management: Utilized Rational Team Concert (RTC) for code version control, code merging, and defect management, ensuring code integrity and collaboration.

• Job Scheduling: Scheduled ETL jobs using scheduler tools like Control-M, coordinating their execution and dependencies.

• Specification Documents: Created specifications for ETL processes, finalizing requirements, and preparing specification documents for reference and documentation.

• Test Case Preparation and Execution: Prepared unit test cases and system test case documents, including sample data, to validate the functionality and performance of ETL processes.

• Debugging and Error Handling: Debugged and resolved errors in DataStage jobs in the UNIX environment, employing various error-handling techniques (see the dsjob wrapper sketch after this list).

• Collaboration with Stakeholders: Interacted with business SMEs, OFSA SMEs, DBAs, system administrators, operations teams, and DI teams to gather inputs, address issues, and ensure smooth collaboration.
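
As an illustration of the kind of Unix-side wrapper that scheduling and debugging work like this typically involves, below is a hedged sketch that runs a DataStage job through the dsjob command-line interface and prints its log summary on failure. The project and job names are hypothetical, and the exit-code interpretation is an assumption about common dsjob -jobstatus conventions, not a detail from this project.

"""Illustrative wrapper (not the project's actual script) for running a DataStage
job from the command line -- the sort of step a Control-M task could invoke.
Assumes the dsjob CLI is on PATH on the engine tier."""
import subprocess
import sys

PROJECT = "EGRC_PROJ"       # hypothetical DataStage project
JOB = "jb_load_complaints"  # hypothetical job name

# Assumption: with -jobstatus, dsjob waits for completion and its exit code
# reflects the job status (commonly 1 = run OK, 2 = run with warnings).
OK_STATUSES = {1, 2}

result = subprocess.run(["dsjob", "-run", "-jobstatus", PROJECT, JOB])

if result.returncode not in OK_STATUSES:
    # On failure, surface the job's log summary to aid debugging, then exit non-zero.
    subprocess.run(["dsjob", "-logsum", PROJECT, JOB])
    sys.exit(1)

print(PROJECT + "." + JOB + " finished OK")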

Environment: IBM Infosphere DataStage 9x, Informatica 9x, SQL server, Oracle, DB2, Control-M tool, Unix scripting, RTC Tool.
