KRISHNA
SENIOR INTEGRATION ENGINEER / BIG DATA ENGINEER
Email: *******************@*****.*** | Phone: 469-***-****
LinkedIn: www.linkedin.com/in/krishna-r-28719b182
PROFESSIONAL SUMMARY:
Senior Integration Engineer and Senior Data Engineer with over 9 years of professional IT experience, specializing in designing and implementing scalable, maintainable, and secure integration solutions.
Developed, tested, and maintained robust backend services and APIs using C# and .NET Core. Ensured high performance and scalability for various enterprise applications, utilizing advanced features such as asynchronous programming, dependency injection, and Entity Framework Core.
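Representative sketch of this backend pattern (illustrative only; the entity, DbContext, and service names below are hypothetical placeholders rather than client code):

    using System.Threading.Tasks;
    using Microsoft.EntityFrameworkCore;

    // Hypothetical EF Core entity and context, used only to illustrate the pattern.
    public class Order { public int Id { get; set; } public decimal Total { get; set; } }

    public class OrdersContext : DbContext
    {
        public OrdersContext(DbContextOptions<OrdersContext> options) : base(options) { }
        public DbSet<Order> Orders => Set<Order>();
    }

    // Service resolved through dependency injection and consumed by API controllers.
    public interface IOrderService { Task<Order?> GetOrderAsync(int id); }

    public class OrderService : IOrderService
    {
        private readonly OrdersContext _db;
        public OrderService(OrdersContext db) => _db = db;   // constructor injection

        // Asynchronous query keeps request threads free under load.
        public Task<Order?> GetOrderAsync(int id) =>
            _db.Orders.AsNoTracking().FirstOrDefaultAsync(o => o.Id == id);
    }

    // Registration in Program.cs:
    //   builder.Services.AddDbContext<OrdersContext>(o => o.UseSqlServer(connectionString));
    //   builder.Services.AddScoped<IOrderService, OrderService>();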
Designed and implemented integration solutions for diverse systems such as ERP, CRM, and other enterprise applications. Leveraged the power of C# and .NET to build seamless interconnectivity and data flow, using technologies such as ASP.NET Core, Web API, and WCF (Windows Communication Foundation).
Designed and implemented RESTful APIs and managed them using API gateways. Utilized C# and .NET to ensure secure and efficient access to backend services, incorporating best practices for API security, versioning, and documentation with tools like Swagger and API Management (APIM).
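A minimal ASP.NET Core sketch of this kind of setup, assuming the Swashbuckle.AspNetCore package and a hypothetical, URL-versioned "Orders" controller:

    using Microsoft.AspNetCore.Mvc;
    using Microsoft.OpenApi.Models;

    var builder = WebApplication.CreateBuilder(args);
    builder.Services.AddControllers();
    builder.Services.AddEndpointsApiExplorer();
    builder.Services.AddSwaggerGen(o =>
        o.SwaggerDoc("v1", new OpenApiInfo { Title = "Orders API", Version = "v1" }));

    var app = builder.Build();
    app.UseSwagger();     // exposes the OpenAPI document
    app.UseSwaggerUI();   // interactive API documentation page
    app.UseHttpsRedirection();
    app.MapControllers();
    app.Run();

    // URL-segment versioning keeps breaking changes behind a new route, e.g. /api/v1/orders/42.
    [ApiController]
    [Route("api/v1/[controller]")]
    public class OrdersController : ControllerBase
    {
        [HttpGet("{id:int}")]
        public IActionResult GetById(int id) => Ok(new { id });
    }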
Designed, developed, and optimized data pipelines and workflows for processing large volumes of data using Azure Data Factory, Azure Synapse Analytics, and Azure Databricks. Ensured data integrity, quality, and security across all data integration processes with tools such as Azure Data Lake Storage and Azure SQL Database.
Implemented data transformation and cleansing processes using Azure Data Flow and Azure Functions. Utilized Azure Monitor and Azure Log Analytics to monitor data pipeline performance and troubleshoot issues. Developed and maintained CI/CD pipelines using Azure DevOps to automate the build, test, and deployment processes.
Worked with Informatica PowerCenter, offering a robust ETL toolset to extract, transform, and load data from various sources to target systems. Utilized Informatica Cloud for seamless integration of cloud-based applications and data sources with on-premises systems, facilitating real-time data synchronization.
Leveraged Informatica Intelligent Cloud Services (IICS) to automate data integration and management tasks, improving efficiency and reducing errors. Employed Informatica Data Quality to assess data quality, cleanse, and standardize data, ensuring accuracy and consistency.
Collaborated with data scientists, analysts, and other stakeholders to understand data requirements and provide appropriate solutions using Azure Machine Learning and Azure Cognitive Services. Integrated on-premises systems with cloud-based solutions, ensuring seamless data flow and interaction. Implemented event-driven architectures for real-time data processing and analytics.
Troubleshot and resolved integration issues to maintain system stability and performance. Worked closely with cross-functional teams, including software engineers, data engineers, DevOps, and product managers, to understand integration requirements and deliver effective solutions.
Utilized Power BI and Tableau for creating insightful dashboards and interactive reports. Proficient in DAX for data analysis and reporting. Developed custom data quality rules and metrics to proactively identify and resolve issues, enforcing data governance policies using Master Data Management (MDM).
Worked extensively with various programming languages, including Python, SQL, Scala, Java, R, Pig, HiveQL, C, and C++. Experienced with NoSQL databases such as MongoDB, DynamoDB, HBase, and Cassandra, as well as relational databases including SQL Server and MySQL, with strong PL/SQL skills. Utilized monitoring and reporting tools such as Power BI, Tableau, and custom shell scripts for effective data visualization and analysis.
EDUCATIONAL DETAILS (Bachelor's): GUDLAVALLERU ENGINEERING COLLEGE, VIJAYAWADA, INDIA
TECHNICAL SKILLS:
Cloud Platforms: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP)
Big Data Technologies: Hadoop ecosystem, Cloudera CDP, Hortonworks HDP, Spark, Kafka
Programming/Scripting Languages: Python, SQL, Scala, Java, .NET Core, TypeScript, JavaScript, R, C, C++, C#
NoSQL Databases: Azure Cosmos DB, MongoDB, DynamoDB, HBase, Cassandra
Integration Platforms: Informatica, MuleSoft, Apache Kafka, RabbitMQ, Apache NiFi, Oracle EBS, SAP ECC, S/4HANA, SAP OCM, SOM
Web Services: RESTful APIs, SOAP, Webhooks
Relational Databases: Azure SQL Database, SQL Server, Oracle, MySQL, PL/SQL
Development Environments: Local development setups (Windows, macOS, Linux), Virtual Machines, Dockerized environments
CI/CD: Jenkins, Azure DevOps, GitHub Actions, TeamCity
Monitoring/Reporting Tools: Power BI, Tableau, Azure Monitor, Log Analytics
Java Technologies: Spring, Hibernate, Struts
Web Development Technologies: HTML, CSS, JavaScript, jQuery
Development/Build Tools: PyCharm, Anaconda, Eclipse, IntelliJ, Maven, Gradle
Operating Systems: Linux, Unix, Windows, macOS
Version Control Systems: Git, Bitbucket, SVN
Methodologies: Agile, Waterfall
PROFESSIONAL EXPERIENCE:
Client: CIGNA JAN 2023 – Present
Role: Senior Integration Engineer / Big Data Engineer
Responsibilities:
Developed, tested, and maintained backend services and APIs using C# and .NET Core. Ensured high performance and responsiveness of backend systems. Designed and implemented real-time integration solutions to support various business processes. Utilized message brokers like Azure Service Bus or Kafka for real-time data processing.
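For the Azure Service Bus path, a minimal sketch of publishing a business event (the queue name and connection string are placeholders, not actual client configuration):

    using System.Threading.Tasks;
    using Azure.Messaging.ServiceBus;

    public static class OrderEventPublisher
    {
        public static async Task PublishAsync(string connectionString, string payloadJson)
        {
            // Placeholder queue name; real-time consumers subscribe on the other side.
            await using var client = new ServiceBusClient(connectionString);
            ServiceBusSender sender = client.CreateSender("order-events");

            var message = new ServiceBusMessage(payloadJson) { ContentType = "application/json" };
            await sender.SendMessageAsync(message);
        }
    }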
Applied strong understanding of Informatica to support data integration tasks, ensuring data quality and efficiency. Collaborated with data engineering teams to integrate Informatica workflows with backend services. Developed scalable backend solutions that could handle high loads and ensure efficient data processing.
Optimized code and database queries for performance and efficiency. Ensured seamless data flow between systems in real-time, minimizing latency and maximizing data accuracy. Implemented and maintained data synchronization mechanisms to keep data consistent across systems.
Designed, implemented, and optimized complex SQL queries for efficient data retrieval and manipulation. Performed database tuning and troubleshooting to ensure optimal performance. Implemented unit tests, integration tests, and performance tests to ensure the reliability and quality of backend services.
Developed and managed data pipelines using Azure Data Factory, Azure Databricks, and Azure Synapse Analytics. Implemented data storage solutions using Azure SQL Database, Azure Data Lake, and Cosmos DB. Leveraged Informatica tools to support ETL processes, ensuring data quality and consistency.
Collaborated with data analysts and developers to integrate Informatica workflows into Azure data pipelines. Designed and implemented real-time data ingestion and processing solutions using Azure Stream Analytics and Event Hubs. Ensured seamless data flow and integration across various data sources and destinations.
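A simplified sketch of the Event Hubs ingestion side, assuming the in-process Azure Functions model with the Event Hubs extension (hub and connection setting names are placeholders):

    using Azure.Messaging.EventHubs;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Extensions.Logging;

    public static class TelemetryIngestFunction
    {
        // Fires for each batch of events arriving on the (placeholder) hub.
        [FunctionName("TelemetryIngest")]
        public static void Run(
            [EventHubTrigger("telemetry-hub", Connection = "EventHubConnection")] EventData[] events,
            ILogger log)
        {
            foreach (EventData e in events)
            {
                // EventBody holds the raw payload; a real pipeline would deserialize and route it.
                log.LogInformation("Received event: {Body}", e.EventBody.ToString());
            }
        }
    }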
Architected scalable data solutions to handle large volumes of data efficiently. Optimized data storage and retrieval processes to improve overall system performance. Ensured compliance with data governance policies and security standards. Implemented data encryption, masking, and anonymization techniques as needed.
Integrated various backend systems and third-party services to support business operations. Developed and maintained integration documentation and diagrams. Worked closely with front-end developers, data engineers, and business stakeholders to deliver integrated solutions. Provided technical support and troubleshooting for integration issues.
Environment: Microsoft Azure – Azure Data Lake Storage, Azure Data Factory (ADF), Azure SQL Database, Cosmos DB, Azure Function Apps, Azure Logic Apps, Azure Web Apps, Azure Blob Storage, Azure DevOps, Git, Azure Databricks, C#, Microsoft .NET Core Framework, Informatica, MuleSoft, Apache Kafka, RabbitMQ, Apache NiFi, Oracle EBS, SAP ECC, S/4HANA, SAP OCM, SOM.
Client: JOHN DEERE SEP 2019 – DEC 2022
Role: Senior Integration Engineer / Big Data Engineer
Responsibilities:
Developed, tested, and maintained backend services and APIs using C# and .NET Core, ensuring high performance and responsiveness of backend systems. Designed and implemented real-time integration solutions to support various business processes. Utilized message brokers like Google Cloud Pub/Sub or Apache Kafka for real-time data processing.
Applied strong understanding of Informatica to support data integration tasks, ensuring data quality and efficiency. Collaborated with data engineering teams to integrate Informatica workflows with backend services. Developed scalable backend solutions that could handle high loads and ensure efficient data processing.
Optimized code and database queries for performance and efficiency. Ensured seamless data flow between systems in real-time, minimizing latency and maximizing data accuracy. Implemented and maintained data synchronization mechanisms to keep data consistent across systems.
Designed, implemented, and optimized complex SQL queries for efficient data retrieval and manipulation. Performed database tuning and troubleshooting to ensure optimal performance. Implemented unit tests, integration tests, and performance tests to ensure the reliability and quality of backend services.
Developed and managed data pipelines using Google Cloud Dataflow, Dataproc, and BigQuery. Implemented data storage solutions using Google Cloud Storage, Bigtable, and Firestore. Leveraged Informatica tools to support ETL processes, ensuring data quality and consistency.
Collaborated with data analysts and developers to integrate Informatica workflows into GCP data pipelines. Designed and implemented real-time data ingestion and processing solutions using Google Cloud Pub/Sub and Dataflow. Ensured seamless data flow and integration across various data sources and destinations.
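A minimal sketch of publishing to Google Cloud Pub/Sub from C#, assuming the Google.Cloud.PubSub.V1 client library (project and topic identifiers below are placeholders):

    using System;
    using System.Threading.Tasks;
    using Google.Cloud.PubSub.V1;

    public static class PubSubPublisher
    {
        public static async Task PublishAsync(string payloadJson)
        {
            // Placeholder project and topic identifiers.
            TopicName topic = TopicName.FromProjectTopic("example-project", "integration-events");
            PublisherClient publisher = await PublisherClient.CreateAsync(topic);

            // Returns the server-assigned message ID once the message is accepted.
            string messageId = await publisher.PublishAsync(payloadJson);

            // Flush any outstanding messages before the process exits.
            await publisher.ShutdownAsync(TimeSpan.FromSeconds(15));
        }
    }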
Architected scalable data solutions to handle large volumes of data efficiently. Optimized data storage and retrieval processes to improve overall system performance.
Integrated various backend systems and third-party services to support business operations. Developed and maintained integration documentation and diagrams. Worked closely with front-end developers, data engineers, and business stakeholders to deliver integrated solutions. Provided technical support and troubleshooting for integration issues.
Environment: Google Cloud Platform (GCP) – Google Cloud Storage, Google Cloud Dataflow, BigQuery, Bigtable, Firestore, Google Cloud Functions, Google Cloud Run, Google Kubernetes Engine (GKE), Google Cloud Pub/Sub, Apache Kafka, RabbitMQ, Apache NiFi, Oracle EBS, SAP ECC, S/4HANA, SAP OCM, SOM, Azure DevOps, Git, C#, Microsoft .NET Core Framework, Informatica, MuleSoft.
Client: LEIDOS SEP 2017 – AUG 2019
Role: Big Data Engineer / Junior API Integration Engineer
Responsibilities:
Developed batch and stream processing applications, implementing functional pipelining through Spark APIs. Created Databricks notebooks for data preparation, including cleansing, validation, and transformations. Contributed to building a data pipeline and conducting analytics using the AWS stack (EMR, EC2, S3, RDS, Lambda, SQS, Redshift).
Developed and managed data pipelines using AWS Data Pipeline, EMR, and Redshift. Implemented data storage solutions using Amazon S3, DynamoDB, and RDS. Leveraged Informatica tools to support ETL processes, ensuring data quality and consistency. Implemented unit tests, integration tests, and performance tests to ensure the reliability and quality of backend services
Architected scalable data solutions to handle large volumes of data efficiently. Optimized data storage and retrieval processes to improve overall system performance. Ensured compliance with data governance policies and security standards. Implemented data encryption, masking, and anonymization techniques as needed.
Developed, tested, and maintained backend services and APIs using C# and .NET Core, ensuring high performance and responsiveness of backend systems. Designed and implemented real-time integration solutions to support various business processes, utilizing message brokers like AWS SNS and SQS or Apache Kafka for real-time data processing.
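For the SNS/SQS route, a minimal sketch of sending a message to an SQS queue with the AWS SDK for .NET (the queue URL is a placeholder; credentials and region come from the standard SDK configuration chain):

    using System.Threading.Tasks;
    using Amazon.SQS;
    using Amazon.SQS.Model;

    public static class QueuePublisher
    {
        public static async Task SendAsync(string payloadJson)
        {
            using var sqs = new AmazonSQSClient();
            var request = new SendMessageRequest
            {
                // Placeholder queue URL for illustration only.
                QueueUrl = "https://sqs.<region>.amazonaws.com/<account-id>/integration-events",
                MessageBody = payloadJson
            };
            SendMessageResponse response = await sqs.SendMessageAsync(request);
        }
    }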
Optimized code and database queries for performance and efficiency. Implemented and maintained data synchronization mechanisms to keep data consistent across systems. Designed, implemented, and optimized complex SQL queries for efficient data retrieval and manipulation. Performed database tuning and troubleshooting to ensure optimal performance.
Applied strong understanding of Informatica to support data integration tasks, ensuring data quality and efficiency. Collaborated with data engineering teams to integrate Informatica workflows with backend services. Developed scalable backend solutions that could handle high loads and ensure efficient data processing.
Collaborated with data analysts and developers to integrate Informatica workflows into AWS data pipelines. Designed and implemented real-time data ingestion and processing solutions using AWS Kinesis and Lambda. Ensured seamless data flow and integration across various data sources and destinations.
Integrated various backend systems and third-party services to support business operations. Developed and maintained integration documentation and diagrams. Worked closely with front-end developers, data engineers, and business stakeholders to deliver integrated solutions. Provided technical support and troubleshooting for integration issues.
Environment: Amazon Web Services (AWS) – Amazon S3, AWS Data Pipeline, Amazon Redshift, DynamoDB, RDS, AWS Lambda, AWS Kinesis, AWS EMR, AWS SNS, AWS SQS, Apache Kafka, RabbitMQ, Apache NiFi, Oracle EBS, SAP ECC, S/4HANA, SAP OCM, SOM, Azure DevOps, Git, C#, Microsoft .NET Core Framework, Informatica.
Client: DIRECTV JUNE 2015 – JULY 2017
Role: Big Data Engineer
Responsibilities:
Enhanced the performance and scalability of data processing jobs by optimizing Spark configuration parameters, including executor memory, executor cores, and shuffle partitions. Implemented data partitioning and bucketing techniques in Spark to distribute and organize data across clusters, facilitating parallel processing and reducing latency.
Improved query performance and minimized resource consumption by fine-tuning Spark SQL queries, optimizing joins, aggregations, and data filtering operations. Utilized Spark's caching mechanisms, such as RDD or DataFrame caching, to persist intermediate results in memory, preventing redundant computations and expediting data processing.
Developed a centralized data engineering framework using technologies like Apache Airflow or Luigi, providing a standardized approach for defining, scheduling, and monitoring ETL workflows across teams. Designed and implemented reusable data transformation modules or libraries to promote code reusability and streamline development across multiple ETL projects.
Integrated version control systems like Git into the data engineering workflow for proper code management, collaboration, and versioning across different teams and projects. Collaborated with Data Architects to establish data modeling best practices and guidelines, ensuring consistency and scalability in data structures across various ETL processes.
Gained experience working with the Microsoft .NET Core Framework (version 6 or higher), Angular, Entity Framework, TypeScript, JavaScript, and Bootstrap.
Explored and implemented cloud-based big data technologies such as AWS EMR, Google Cloud Dataproc, and Azure Databricks to leverage scalability and elasticity for large-scale data processing. Utilized containerization technologies like Docker and Kubernetes to create portable and scalable data processing environments, ensuring efficient resource utilization and easy deployment across different clusters. Gained familiarity with Apache NiFi.
Implemented data quality monitoring and validation mechanisms, ensuring the integrity and accuracy of processed data using tools like Apache Griffin or custom validation scripts. Kept abreast of emerging big data technologies and frameworks, continuously exploring and evaluating new tools and techniques to enhance the scalability and performance of data pipelines.
Environment: Apache Spark, Apache Kafka, AWS Glue, AWS EMR, Amazon S3, Amazon Redshift, AWS Lambda, AWS CloudWatch, AWS CloudFormation, Microsoft .NET Core Framework (version 6 or higher), Angular, Entity Framework, SAP ECC, S/4HANA, SAP OCM, SOM, Apache NiFi, SSIS, CUI, Oracle EBS, TypeScript, JavaScript, Bootstrap, Apache Airflow, Terraform, Teradata, Snowflake