Data Engineer

Location:
Aurora, IL
Posted:
October 30, 2023

Resume:

Name: Damaries Ongole

Email: ad0p7v@r.postjobfree.com

PH: 304-***-****

LinkedIn: https://www.linkedin.com/in/damaries-ongole-090482285

Sr. Data Engineer

PROFESSIONAL SUMMARY

9+ years of experience in data engineering, with deep expertise in Apache Spark, PySpark, Kafka, Spark Streaming, Spark SQL, Hadoop, HDFS, Hive, Sqoop, Pig, MapReduce, Flume, and Beam.

Extensive experience with relational databases including Microsoft SQL Server, Teradata, Oracle, and PostgreSQL, and with NoSQL databases including MongoDB, HBase, Azure Cosmos DB, AWS DynamoDB, and Cassandra.

Hands-on experience with data modeling, physical data warehouse design, and cloud data warehousing technologies including Snowflake, Redshift, BigQuery, and Synapse.

Experience with major cloud providers & cloud data engineering services including AWS, Azure, GCP & Databricks.

Created and optimized Talend jobs for data extraction, data cleansing, and data transformation.

Designed & orchestrated data processing layer & ETL pipelines using Airflow, Azure Data Factory, Oozie, Autosys, Cron & Control-M.

Hands-on experience with AWS services including EMR, EC2, Redshift, Glue, Lambda, SNS, SQS, CloudWatch, Kinesis, Step Functions, managed Airflow instances, storage, and compute.

Hands-on experience with Azure services including Synapse, Azure Data Factory, Azure Functions, Event Hubs, Stream Analytics, Key Vault, storage, and compute.

Hands-on experience with GCP services including Dataproc, VM instances, BigQuery, Dataflow, Cloud Functions, Pub/Sub, Composer, secrets, storage, and compute.

Hands-on experience with Databricks services including Notebooks, Delta tables, SQL endpoints, Unity Catalog, secrets, and clusters.

Extensive experience with IT data analytics projects; hands-on experience migrating on-premises data and data processing pipelines to the cloud, including AWS, Azure, and GCP.

Experienced in dimensional modeling (star schema, snowflake schema), transactional modeling, and slowly changing dimensions (SCD).

Hands-on experience with MS SQL Server Business Intelligence: SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), and SQL Server Reporting Services (SSRS).

Experience in data governance and Master Data Management (MDM) using Collibra and Informatica, including standardization efforts to address common data management issues.

Strong expertise in working with multiple databases, including DB2, Oracle, SQL Server, Netezza, and Cassandra, for data storage and retrieval in ETL workflows.

Experience with Hadoop distributions such as Cloudera and Hortonworks.

Traced and catalogued data processes, transformation logic, and manual adjustments to identify data governance issues.

Analyzed data dependencies using metadata stored in the repository and prepared batches for existing sessions to facilitate scheduling of multiple sessions.

Linked data lineage to data quality and business glossary work within the overall data governance program.

Designed and maintained data integration solutions to extract, transform, and load (ETL) data from various sources into target systems.

Experienced in working with ETL tools such as Informatica, DataStage, or SSIS (SQL Server Integration Services).

Good knowledge of database creation and maintenance of physical data models with Oracle, Teradata, Netezza, DB2, MongoDB, HBase, and SQL Server.

Experienced in writing complex SQL queries, stored procedures, triggers, joins, and subqueries.

Interpret and solve business problems using data analysis, data mining, optimization tools, machine learning techniques, and statistics.

Built and supported large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.

Extensive experience in loading and analyzing large datasets with the Hadoop framework (MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Spark, Impala, Scala) and NoSQL databases like MongoDB, HBase, and Cassandra.

Expert in migrating SQL databases to Azure Data Lake Storage, Azure Data Factory (ADF), Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.

Integrated Kafka with Spark Streaming for real time data processing.

Skilled in data parsing, data manipulation, and data preparation, including methods for describing data contents.

Implemented data modeling and schema design for data-centric applications, ensuring optimal data storage and retrieval.

Integrated and transformed data from various sources, including databases, APIs, and file systems, to support data-driven applications.

Strong experience in the analysis, design, development, testing, and implementation of Business Intelligence solutions using data warehouse/data mart design, ETL, BI, and client/server applications, and in writing ETL scripts using regular expressions and custom tools (Informatica, Pentaho, and SyncSort).

Knowledge of processes followed in Data Migration project, End to End Data Warehouse Development Project and Data Warehouse Enhancement Project, Hadoop Cloudera environment.

Expert in designing Server jobs using various types of stages like Sequential file, ODBC, Hashed file, Aggregator, Transformer, Sort, Link Partitioner and Link Collector.

Hands-on experience with ETL, Hadoop, and data governance tools such as Tableau and Informatica Enterprise Data Catalog.

Solid experience implementing large-scale data warehousing programs and end-to-end data integration solutions on Snowflake Cloud, GCP, Redshift, Informatica Intelligent Cloud Services (IICS - CDI), and Informatica PowerCenter, integrated with multiple relational databases (MySQL, Teradata, Oracle, Sybase, SQL Server, DB2).

Experience in designing & developing applications using Big Data technologies HDFS, Map Reduce, Sqoop, Hive, PySpark & Spark SQL, HBase, Python, Snowflake, S3 storage, Airflow.

Experience in performance tuning for MapReduce jobs and complex Hive queries.

Experience building efficient ETL processes using Spark in-memory processing, Spark SQL, and Spark Streaming with the Kafka distributed messaging system.

Developed and implemented complex ETL workflows using Talend Data Integration.

Leveraged Alteryx caching and parallel processing techniques to improve workflow efficiency.

Extensive experience developing Bash, T-SQL, and PL/SQL scripts.

Understanding of structured data sets, data pipelines, ETL tools, and data reduction, transformation, and aggregation techniques; knowledge of tools such as DBT and DataStage.

Good knowledge of job orchestration tools such as Oozie, ZooKeeper, and Airflow.

Wrote PySpark jobs in AWS Glue to merge data from multiple tables and used crawlers to populate the AWS Glue Data Catalog with metadata table definitions.

Generated scripts in AWS Glue to transfer data and used AWS Glue to run ETL jobs and aggregations with PySpark.
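
A minimal illustrative sketch of this Glue pattern (database, table, key, and S3 path names are hypothetical placeholders, not the client's actual code):

import sys
from awsglue.transforms import Join
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Resolve the job name passed in by the Glue runtime
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read two tables registered in the Glue Data Catalog by a crawler (names are illustrative)
orders = glue_context.create_dynamic_frame.from_catalog(
    database="campaign_db", table_name="orders")
customers = glue_context.create_dynamic_frame.from_catalog(
    database="campaign_db", table_name="customers")

# Merge the two tables on a shared key
merged = Join.apply(orders, customers, "customer_id", "customer_id")

# Write the merged result back to S3 as Parquet
glue_context.write_dynamic_frame.from_options(
    frame=merged,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/merged/"},
    format="parquet")

job.commit()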

Skilled in building and publishing customized interactive reports and dashboards with custom parameters, tables, graphs, listings, and user filters using Tableau and Power BI.

Experience developing MapReduce programs using Apache Hadoop to analyze big data as per requirements. Practical understanding of data modeling (dimensional and relational) concepts such as star schema modeling, snowflake schema modeling, and fact and dimension tables.

TECHNICAL SKILLS

Hadoop/Spark Ecosystem

Hadoop, MapReduce, Pig, Hive/impala, YARN, Kafka, Flume, Oozie, Zookeeper, Spark, Airflow, MongoDB, Cassandra, HBase, and Storm.

Hadoop Distribution

Cloudera and Hortonworks distributions

Programming Languages

Scala, Hibernate, JDBC, JSON, HTML, CSS, SQL, R, Shell Scripting

Script Languages:

JavaScript, jQuery, Python.

Databases

Oracle, SQL Server, MySQL, Cassandra, Teradata, PostgreSQL, MS Access, Snowflake, NoSQL, HBase, MongoDB

Cloud Platforms

AWS, Azure, GCP, Terraform

Distributed Messaging System

Apache Kafka

Data Visualization Tools

Tableau, Power BI, SAS, Excel, ETL

Batch Processing

Hive, MapReduce, Pig, Spark

Operating System

Linux (Ubuntu, Red Hat), Microsoft Windows

Reporting Tools/ETL Tools

Informatica PowerCenter, Tableau, Pentaho, SSIS, SSRS, Power BI

PROFESSIONAL EXPERIENCE

Client: DaVita, Colorado (2022 Aug to Present)

Role: Sr. Data Engineer

Responsibilities:

Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats.
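
A condensed sketch of the extract-transform-aggregate pattern described above, assuming hypothetical S3 paths and column names:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("multi-format-etl").getOrCreate()

# Extract: read two different file formats (paths are illustrative)
orders = spark.read.parquet("s3://example-bucket/raw/orders/")
customers = spark.read.option("header", "true").csv("s3://example-bucket/raw/customers/")

# Transform: normalize types and join
orders = orders.withColumn("order_date", F.to_date("order_date"))
joined = orders.join(customers, on="customer_id", how="left")

# Aggregate with Spark SQL
joined.createOrReplaceTempView("orders_enriched")
daily = spark.sql("""
    SELECT order_date, region, SUM(amount) AS total_amount
    FROM orders_enriched
    GROUP BY order_date, region
""")

# Load: write the aggregate back out as Parquet
daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_sales/")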

Designed and implemented a unified data processing layer in Spark on AWS EMR to consolidate data from a wide variety of sources.

Assisted in troubleshooting and resolving data-related issues, ensuring the stability and reliability of data-centric applications.

Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC/Parquet/text files) into AWS Redshift.

Used Spark Streaming to receive real-time data from Kafka and store the streaming data in HDFS and NoSQL databases such as HBase and Cassandra using Python.

Collected data from AWS S3 buckets in near real time using Spark Streaming, performed the necessary transformations and aggregations on the fly to build the common learner data model, and persisted the data in HDFS.
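
A simplified Structured Streaming sketch of the Kafka-to-HDFS flow (using Structured Streaming rather than the older DStream API; broker, topic, and path names are placeholders):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Requires the spark-sql-kafka connector on the Spark classpath
spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

# Subscribe to a Kafka topic (broker and topic names are placeholders)
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "learner-events")
          .load())

# Kafka values arrive as bytes; cast to string before parsing downstream
parsed = events.select(F.col("value").cast("string").alias("json_value"),
                       F.col("timestamp"))

# Persist the stream to HDFS as Parquet, with checkpointing for fault tolerance
query = (parsed.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/learner_events/")
         .option("checkpointLocation", "hdfs:///checkpoints/learner_events/")
         .outputMode("append")
         .start())

query.awaitTermination()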

Developed and maintained data catalogs and data lineage to track data sources, transformations, and usage.

Responsible for installing, configuring, supporting, and managing Cloudera Hadoop clusters.

Involved in file movements between HDFS and AWS S3 and extensively worked with S3 bucket in AWS.

Experience with setting up and managing Databricks clusters for data processing and analysis.

Proficient in working with Databricks notebooks to develop, test, and deploy ETL pipelines using languages such as Python, Scala, or SQL.

Experienced in developing scalable and efficient data pipelines using Java frameworks like Apache Spark, Apache Flink, or Spring Batch.

Skilled in utilizing Java-based technologies for data ingestion, transformation, and loading processes.

Strong proficiency in Scala, with experience using it to build data processing pipelines, ETL workflows, and other data engineering tasks.

Leveraged Talend to orchestrate and schedule ETL jobs in AWS Data Pipeline or AWS Glue for automated data integration processes.

Experience with AWS services related to data processing, such as Amazon EMR, Amazon S3, and AWS Glue, as well as proficiency in scripting languages such as Python and Bash for data processing and automation tasks.

Developed and maintained data transformation pipelines using DBT (Data Build Tool), enabling the creation of structured, reliable, and maintainable SQL-based data models and transformations in Snowflake.

Designed and implemented Snowflake data warehousing solutions, including schema design, data modeling, and optimization for scalability and performance.

Created snapshots of EBS volumes, monitored AWS EC2 instances using CloudWatch, and worked on AWS security groups and their rules.

Developed calculated columns and measures using complex DAX functions and Power Pivot.

Validated the test data in DB2 tables on Mainframes and on Teradata using SQL queries.

Knowledge of configuring Databricks jobs and scheduling them to run on a regular basis to ensure the timely processing of data.

Expertise in integrating Databricks with other tools such as Apache Spark, Apache Hadoop, and cloud services like AWS.

Authored Python (PySpark) scripts with custom UDFs for row/column manipulations, merges, aggregations, stacking, data labeling, and all cleaning and conforming tasks.
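
An illustrative example of this kind of UDF-based cleaning and conforming step; the column names and normalization rule are hypothetical:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-cleaning").getOrCreate()

# Custom UDF: normalize free-text state codes (rule is illustrative)
@F.udf(returnType=StringType())
def normalize_state(value):
    if value is None:
        return None
    return value.strip().upper()[:2]

df = spark.createDataFrame(
    [("  co ", 10.0), ("Colorado", 12.5), (None, 3.0)],
    ["state", "amount"])

cleaned = (df.withColumn("state", normalize_state(F.col("state")))
             .na.fill({"state": "NA"}))

# Conforming step: aggregate on the cleaned column
cleaned.groupBy("state").agg(F.sum("amount").alias("total")).show()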

Worked on AWS CLI for Auto Scaling and on creating and updating CloudWatch monitoring.

Used AWS management tools such as CloudWatch and CloudTrail.

Stored log files in AWS S3 and enabled versioning on S3 buckets where highly sensitive information is stored.

Integrated AWS DynamoDB with AWS Lambda to store item values and back up DynamoDB streams.
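
A hedged sketch of how such a Lambda handler might look, assuming a DynamoDB stream trigger and a hypothetical backup bucket:

import json
import boto3

s3 = boto3.client("s3")
BACKUP_BUCKET = "example-dynamodb-backups"   # hypothetical bucket name


def lambda_handler(event, context):
    """Triggered by a DynamoDB stream; writes each changed record to S3."""
    records = event.get("Records", [])
    for record in records:
        event_id = record["eventID"]
        # NewImage is only present for INSERT/MODIFY events
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        s3.put_object(
            Bucket=BACKUP_BUCKET,
            Key=f"stream-backups/{event_id}.json",
            Body=json.dumps(new_image))
    return {"backed_up": len(records)}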

Designed and implemented data integration solutions using FiveTran to extract data from various sources, including databases, SaaS applications, and APIs, and load it into Snowflake data warehouse.

Leveraged Snowflake's capabilities, such as virtual warehouses, clustering, and materialized views, to optimize query performance and enable efficient data retrieval for analytics and reporting.

Collaborated with data analysts and business stakeholders to understand data requirements and translate them into Snowflake schemas and DBT models, ensuring data accuracy, consistency, and accessibility.

Implemented data quality checks and validation processes in Snowflake using Snowflake SQL, DBT tests, or custom scripts, ensuring data integrity and adherence to data quality standards.
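
A simplified example of running such checks from Python with the Snowflake connector; the connection parameters, table, and check thresholds are placeholders (credentials would normally come from a secrets store):

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC")

CHECKS = {
    "row_count": "SELECT COUNT(*) FROM orders",
    "null_customer_ids": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
}

try:
    cur = conn.cursor()
    for name, sql in CHECKS.items():
        cur.execute(sql)
        value = cur.fetchone()[0]
        # Simple rules: row_count must be positive, null checks must be zero
        ok = value > 0 if name == "row_count" else value == 0
        print(f"{name}: {value} -> {'PASS' if ok else 'FAIL'}")
finally:
    conn.close()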

Prepared scripts to automate the Ingestion process using Pyspark and Scala as needed through various sources such as API, AWS S3, Teradata and Redshift.

Designed and implemented complex data integration pipelines using Alteryx Designer.

Used SQL Server Management Tool to check the data in the database as compared to the requirement given.

Worked with AWS Athena and Talend to enable ad-hoc querying and analysis of data stored in Amazon S3.

Played a lead role in gathering requirements, analysis of entire system and providing estimation on development, testing efforts.

Involved in designing different components of system like Sqoop, Hadoop process involves map reduce & hive, Spark, FTP integration to down systems.

Migrated an existing on-premises application to AWS. Used AWS services like EC2 and S3 for small data sets processing and storage, Experienced in Maintaining the Hadoop cluster on AWS EMR

Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka, and Flume Configured Spark streaming to get ongoing information from the Kafka and store the stream information to HDFS.

Implemented serverless data warehousing solutions on AWS using Talend and Amazon Redshift for efficient data storage and retrieval.

Utilized Alteryx tools for data cleansing, including data validation, standardization, and de-duplication.

Automated data processing workflows and orchestration using AWS Step Functions, AWS Lambda, or Apache Airflow, ensuring reliable and scalable execution of data pipelines in Snowflake.
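
A minimal Airflow DAG sketch for this kind of orchestration, using plain PythonOperator tasks rather than any specific provider package; the DAG name, schedule, and task logic are illustrative:

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_s3(**context):
    # Placeholder: pull data from a source system and land it in S3
    print("extracting to S3")


def load_into_snowflake(**context):
    # Placeholder: COPY the staged files into Snowflake
    print("loading into Snowflake")


with DAG(
    dag_id="daily_snowflake_load",        # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="load_into_snowflake", python_callable=load_into_snowflake)

    extract >> load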

Integrated Snowflake with other AWS services, such as S3, Glue, and Athena, to enable data ingestion, data lake integration, and seamless cross-service data querying and analysis.

Collaborated with DevOps teams to define and implement infrastructure as code (IaC) using AWS CloudFormation, Terraform, or CDK, ensuring consistent and reproducible Snowflake and data engineering environments.

Successfully completed a POC on GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage, demonstrating the ability to quickly learn and work with new cloud platforms.
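
A small sketch of the kind of BigQuery interaction such a POC exercises, using the google-cloud-bigquery client; the project, dataset, and query are hypothetical:

from google.cloud import bigquery

# The client picks up credentials from the environment (e.g. a service account)
client = bigquery.Client(project="example-poc-project")   # hypothetical project

query = """
    SELECT region, COUNT(*) AS order_count
    FROM `example-poc-project.sales.orders`
    GROUP BY region
    ORDER BY order_count DESC
"""

# Run the query and iterate over the result rows
for row in client.query(query).result():
    print(row["region"], row["order_count"])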

Leveraged existing data engineering experience in AWS to identify similarities and differences between AWS and GCP services and make informed decisions regarding the most appropriate GCP services for the POC.

Enforced standards and best practices around data catalog, data governance efforts.

Created DataStage jobs using different stages like Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, Row Generator, Etc.

Expertise in Creating, Debugging, Scheduling and Monitoring jobs using Airflow for ETL batch processing to load into Snowflake for analytical processes.

Hands-on experience with Connect Direct and MQFTE for secure and reliable file transfers in data integration processes.

Excellent problem-solving skills to identify and resolve issues related to data pipelines, performance, and scalability on Databricks.

Responsible for estimating cluster size, monitoring, and troubleshooting of the Spark Databricks cluster.

Created Unix Shell scripts to automate the data load processes to the target Data Warehouse.

Responsible for implementing monitoring solutions in Ansible, Terraform, Docker, and Jenkins.

Environment: Python, Spark, AWS EC2, AWS S3, AWS EMR, AWS Redshift, AWS Glue, AWS RDS, AWS SNS, AWS SQS, AWS Athena, Snowflake, Data warehouse, Airflow, Data Governance, Kafka, ETL, Terraform, Docker, SQL, Tableau, Git, REST, Bitbucket, Jira.

Client: Citi Bank, NC (2021 Dec to 2022 Aug)

Role: Sr Data Engineer

Responsibilities:

Designed and implemented end-to-end data solutions on Azure, leveraging services such as Azure Data Factory, Azure Databricks, Azure Data Lake Storage, and Azure SQL Database.

Developed data pipelines using Azure Data Factory to orchestrate data movement and transformations across diverse data sources and destinations.

Built and optimized scalable data processing workflows using Azure Databricks, leveraging Spark for data ingestion, transformation, and analysis.

Proficient in writing Spark jobs using Scala and Python in Azure Databricks notebooks for data cleansing, feature engineering, and advanced analytics.
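
An illustrative PySpark cleansing snippet of the sort run in a Databricks notebook (where the `spark` session is provided by the runtime); the Delta paths and column names are placeholders:

from pyspark.sql import functions as F

# In a Databricks notebook, `spark` is provided by the runtime
raw = spark.read.format("delta").load("/mnt/raw/transactions")   # placeholder path

cleansed = (raw
            .dropDuplicates(["transaction_id"])                  # remove exact duplicates
            .na.fill({"channel": "unknown"})                     # default missing categories
            .withColumn("amount", F.col("amount").cast("double"))
            .filter(F.col("amount") > 0))

# Simple feature-engineering example
cleansed = cleansed.withColumn("is_weekend",
                               F.dayofweek("transaction_date").isin(1, 7))

cleansed.write.format("delta").mode("overwrite").save("/mnt/curated/transactions")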

Implemented real-time data processing solutions using Azure Event Hubs, Azure Stream Analytics, and Databricks Structured Streaming.

Designed end-to-end data pipelines in Azure Data Factory for data extraction, transformation, and loading.

Integrated Teradata as a data source within Azure Data Factory pipelines for seamless data movement.

Extensive experience in implementing data lake architectures using Azure Data Lake Storage for storing and processing large volumes of structured and unstructured data.

Designed and developed data warehousing solutions using Azure Synapse Analytics, including data modeling, SQL script development, and query optimization.

Implemented machine learning workflows on Azure using Azure Databricks and Azure Machine Learning, including data preparation, model training, and deployment.

Demonstrated experience in handling large-scale data processing using Teradata.

Integrated and leveraged Azure Cosmos DB, a globally distributed NoSQL database, to handle large-scale and highly responsive applications with low latency requirements.

Developed data ingestion and extraction processes, ensuring data quality and integrity using Azure Data Factory and related data integration technologies.

Implemented data security and access controls in Azure data products, ensuring compliance with regulatory requirements and protecting sensitive data.

Proficient in Teradata database design, development, optimization, and maintenance.

Utilized Azure Monitor to proactively monitor and troubleshoot data pipelines and services, identifying and resolving performance issues and bottlenecks.

Integrated Azure Application Insights to monitor the performance and usage of data engineering solutions, identifying areas for optimization and improvement.

Proficient in using control structures, loops, and conditional logic within Teradata stored procedures.

Utilized Azure Log Analytics to analyze and visualize log data, facilitating effective troubleshooting and monitoring of data pipelines and systems.

Proficient in implementing and managing big data clusters on Azure Databricks, including cluster provisioning, monitoring, and optimization.

Implemented data governance practices in Azure, ensuring data security, privacy, and compliance with industry standards and regulations.

Experienced in implementing data replication and synchronization using Azure Data Factory and Azure Databricks to enable hybrid data integration scenarios.

Developed automated data quality checks and monitoring solutions using Azure services such as Azure Monitor and Azure Log Analytics.

Collaborated with cross-functional teams to gather requirements, design data solutions, and deliver projects on time and within budget.

Implemented data partitioning and optimization techniques to improve data processing performance and reduce costs in Azure Databricks.

Implemented data encryption and implemented security controls in Azure to ensure data protection and compliance with organizational policies.

Experience in implementing Azure Databricks auto-scaling and auto-termination policies to optimize resource utilization and cost management.

Implemented CI/CD pipelines using Azure DevOps to automate deployment of data pipelines and workflows.

Implemented data lineage and metadata management solutions to track and document data transformations and lineage using Azure services.

Experience in optimizing and fine-tuning Spark jobs and SQL queries in Azure Databricks to improve performance and resource utilization.

Implemented data archival and data retention policies using Azure services such as Azure Blob Storage and Azure Data Lake Storage.

Developed and deployed machine learning models using Azure Machine Learning, integrating them into production data workflows and Databricks pipelines.

Implemented data security measures, including role-based access control (RBAC), encryption, and data masking techniques in Azure environments.

Proficient in troubleshooting and resolving issues related to Azure services, Databricks clusters, and data pipelines.

Implemented data cataloging and metadata management solutions using Azure Data Catalog and Databricks Delta Lake.

Implemented data streaming and real-time analytics solutions using Azure Event Hubs, Azure Stream Analytics, and Azure Databricks.

Experience in migrating on-premises data and workloads to Azure, including re-architecting and optimizing data processes for the cloud.

Implemented data-driven insights and visualizations using Azure services such as Azure Data Explorer, Azure Synapse Studio, and Power BI.

Implemented data access controls and auditing mechanisms to ensure data governance and compliance with regulatory requirements.

Experience in working with Azure Cognitive Services and integrating them with data pipelines for text analytics, image recognition, and natural language processing.

Stayed up to date with the latest developments in Azure and Databricks, exploring new features.

Environment: Azure SQL, Azure Storage Explorer, Azure Storage, Azure Blob Storage, Azure Backup, Azure Files, Azure Data Lake Storage, SQL Server Management Studio 2016, Teradata, Visual Studio 2015, VSTS, Azure Blob, Power BI, PowerShell, C# .Net, SSIS, DataGrid, ETL (Extract, Transform, Load), Business Intelligence (BI).

Client: Mindtree, Hyderabad, India (2018 July to 2021 Sep)

Role: Senior Data Engineer

Responsibilities:

Actively Participated in all phases of the Software Development Life Cycle (SDLC) from implementation to deployment.

Involved in importing and exporting the data from RDBMS to HDFS and vice versa using Sqoop.

Worked on developing ETL processes (DataStage Open Studio) to load data from multiple sources into HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive.

Migrated data from on-premises systems to AWS storage buckets.

Developed a Python script to transfer data from on-premises systems to AWS S3.

Developed a Python script to call REST APIs and extract data to AWS S3.
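
A condensed sketch of such a script, using requests and boto3; the API URL, bucket, and object key are placeholders:

import json
import boto3
import requests

API_URL = "https://api.example.com/v1/records"   # placeholder endpoint
BUCKET = "example-landing-bucket"                # placeholder bucket


def extract_to_s3():
    # Call the REST API and fail fast on HTTP errors
    response = requests.get(API_URL, timeout=30)
    response.raise_for_status()
    payload = response.json()

    # Land the raw payload in S3 as a JSON object
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=BUCKET,
        Key="raw/records/latest.json",
        Body=json.dumps(payload))


if __name__ == "__main__":
    extract_to_s3()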

Worked on ingesting data through cleansing and transformation steps, leveraging AWS Lambda, AWS Glue, and Step Functions.

Created YAML files for each data source, including Glue table stack creation.

Worked on a Python script to extract data from Netezza databases and transfer it to AWS S3.

Developed Lambda functions with IAM roles to run Python scripts, triggered by SQS, EventBridge, and SNS.

Created a Lambda Deployment function and configured it to receive events from S3 buckets.
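
A minimal sketch of a Lambda handler wired to S3 event notifications; the per-object handling logic is illustrative:

import urllib.parse
import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    """Invoked by S3 event notifications; logs basic metadata for each new object."""
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 events are URL-encoded
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        head = s3.head_object(Bucket=bucket, Key=key)
        print(f"new object s3://{bucket}/{key} ({head['ContentLength']} bytes)")
    return {"processed": len(records)}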

Wrote UNIX shell scripts to automate jobs and scheduled cron jobs for job automation using crontab.

Developed various Mappings with the collection of all Sources, Targets, and Transformations using Informatica Designer.

Developed mappings using transformations like Expression, Filter, Joiner, and Lookup for better data massaging and to migrate clean and consistent data.

Extracted the data from Teradata into HDFS using Sqoop.

Exported the patterns analyzed back to Teradata using Sqoop.

Experience monitoring system metrics and logs for problems while adding, removing, or updating Hadoop cluster nodes.

Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs, and used Oozie workflows for batch processing and dynamic workflow scheduling.

Involved in requirement analysis, design, coding, and implementation phases of the project.

Designed and implemented Sqoop incremental jobs to read data from DB2 and load it into Hive tables, and connected Tableau to HiveServer2 for generating interactive reports.

Used Sqoop to move data between HDFS and RDBMS sources.

Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats.

Used Spark Streaming to receive real-time data from Kafka and store the streaming data in HDFS and NoSQL databases such as HBase and Cassandra using Python.

Collected data from AWS S3 buckets in near real time using Spark Streaming, performed the necessary transformations and aggregations on the fly to build the common learner data model, and persisted the data in HDFS.

Used Apache NiFi to copy data from local file system to HDP.

Worked on Dimensional and Relational Data Modeling using Star and Snowflake Schemas, OLTP/OLAP system, Conceptual, Logical and Physical data modeling using Erwin.

Automated data processing with Oozie, including data loading into the Hadoop Distributed File System (HDFS).

Environment: Big Data, Hadoop, Oracle12c, PL/SQL, Scala, Spark-SQL, PySpark, Python, Kafka, SAS, MDM, Oozie, SSIS, T-SQL, ETL, HDFS, Cosmos, Pig, Sqoop, MS Access.

Client: Aganitha, Hyderabad, India (2016 Jan to 2018 June)

Role: Senior Data Engineer

Responsibilities:

Involved in understanding the Requirements of the End Users/Business Analysts and Developed Strategies for ETL processes.

Proficient in designing and implementing data integration solutions using Informatica PowerCenter, IICS, Edge, EDC, and IDQ.

Developed mappings/Reusable Objects/Transformation by using mapping designer, transformation developer in Informatica Power Center.

Designed and developed ETL Mappings to extract data from flat files, and Oracle to load the data into the target database.

Extensive experience in designing, developing, and implementing ETL processes using Informatica PowerCenter.

Proven ability to analyze complex data requirements and design efficient and scalable ETL workflows to meet business objectives.

Expertise in data profiling, data cleansing, and data quality management using Informatica Data Quality (IDQ) to ensure data accuracy and consistency.

Skilled in utilizing Tivoli Workload Scheduler (TWS) for job scheduling and automation in data integration processes.

Proficient in creating mappings, workflows, and sessions in Informatica PowerCenter for data extraction, transformation, and loading.

Developed Informatica mappings for complex business requirements using different transformations such as Normalizer, SQL Transformation, Expression, Aggregator, Joiner, Lookup, Sorter, Filter, Router, and so on.

Used ETL to load data using PowerCenter/Power Connect from source systems like Flat Files and Excel Files into staging tables and load the data into the target database.

Developed complex mappings using multiple sources and targets across different databases and flat files.

Developed SQL queries to develop the Interfaces to extract the data in regular intervals to meet the business requirements.

Extensively used Informatica client tools: Source Analyzer, Warehouse Designer, Transformation Developer, Mapping Designer, Mapplet Designer, and the Informatica Repository.

Worked on creating Informatica mappings with different transformations like lookup, SQL, Normalizer, Aggregator, SQ, Joiner, Expression, Router etc.

Designed and developed Informatica ETL Interfaces to load data incrementally from Oracle databases and Flat files into staging schema.

Used various transformations like Unconnected/Connected Lookup, Aggregator, Expression, Joiner, Sequence Generator, Router, etc.

Responsible for the development of Informatica mappings and tuning them for better performance.

Created transformations like Expression, Lookup, Joiner, Rank, Update Strategy and Source Qualifier transformation using the Informatica designer.

Environment: Informatica Power Center 10.4/10.2, Oracle, Flat Files, SQL, and Windows.

Client: CYIENT, India (2013 Mar to 2015 Dec)

Role: Data Analyst

Responsibilities:

Extensive experience in designing, developing, and implementing ETL processes using SSIS for efficient data extraction, transformation, and loading.

Proficient in creating SSIS packages to integrate data from various sources into SQL Server databases, data warehouses, and data marts.

Strong understanding of data modeling concepts and relational databases, ensuring data consistency, integrity, and performance in SSIS solutions.

Experienced in designing and implementing complex data transformations and business logic using SSIS data flow components and expressions.

Proficient in deploying and scheduling SSIS packages using SQL Server Agent, ensuring reliable and timely data processing.

Skilled in troubleshooting and resolving issues related to data quality, performance, and error handling in SSIS packages.


