
Merlin Abraham

ad4kbj@r.postjobfree.com

+1-980-***-****

LinkedIn: www.linkedin.com/in/merlin-abraham-513750293

Data Engineer

Professional Summary:

●10 years of experience in IT, including Big Data technologies, the Hadoop ecosystem, Data Warehousing, and SQL-related technologies.

●Experience in Big Data analytics using various Hadoop ecosystem tools and the Spark framework; currently working extensively on Spark and Spark Streaming, using Scala as the main programming language.

●Experience in BI application design with MicroStrategy, Tableau, and Power BI.

Core Competencies:

●Proficient in Hive, Oracle, SQL Server, SQL, PL/SQL, T-SQL and in managing very large databases

●Experience writing in-house UNIX shell scripts for Hadoop big data development

●Good experience working with Azure Blob and Data Lake Storage and loading data into Azure Synapse Analytics (SQL DW).

●Proficient in writing complex Spark (PySpark) user-defined functions (UDFs), Spark SQL, and HiveQL; a minimal UDF sketch appears after this list.

●Experience working on Azure services such as Data Lake, Data Lake Analytics, SQL Database, Synapse, Databricks, Data Factory, Logic Apps, and SQL Data Warehouse, and GCP services such as BigQuery, Dataproc, and Pub/Sub.

●Experience in developing data pipelines using Pig, Sqoop, and Flume to extract data from weblogs and store it in HDFS; developed Pig Latin scripts and used HiveQL for data analytics

●Extensively worked with Spark Streaming and Apache Kafka to process live streaming data

●Designed, developed, and maintained Spring Boot applications for Kafka data processing, including both producers and consumers.

●Experience in AWS Computing services, such as EC2, S3, EBS, VPC, ELB, Route53, Cloud Watch, Security Groups, EKS, IAM, CloudFront, RDS and Glacier.

●Proficient in ETL tools such as Oracle Data Integrator (ODI) for data extraction, transformation, and loading processes.

●Experience in dimensional modelling using snowflake schema methodologies on data warehouse and integration projects.

●Expertise in automating builds and deployment processes using Bash, Python and Shell scripts with focus on CI/CD, AWS Cloud Architecture.

●Developed and documented Disaster Recovery (DR) environment requirements for critical data engineering systems, ensuring the organization's ability to recover in the event of a disruption.

●Experience in developing Spark applications using PySpark and Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats (structured and unstructured), analysing and transforming the data to uncover insights into customer usage patterns.

●Experience in Data Warehousing, Data Mart, Data Wrangling using Azure Synapse Analytics

●Worked in container-based technologies like Docker, Kubernetes and Openshift.

●Created multiple MapReduce Jobs using Java API, Pig and Hive for data extraction

●Proficient in Oracle Data Integrator (ODI) with over 9 years of hands-on experience in designing, developing, and implementing ETL solutions using ODI.

●Experience in managing and securing custom AMIs and AWS account access using IAM.

●Experienced in Extract, Transform, and Load (ETL) processing of large datasets of different forms, including structured, semi-structured, and unstructured data

●Hands-on experience with Google Cloud Platform (GCP) across its big data products: BigQuery, Cloud Dataproc, Google Cloud Storage, and Composer (Airflow as a service)

●Experience with CI/CD pipelines using Jenkins, Bitbucket, GitHub, etc.

●Hands-on experience with SQL and NoSQL databases such as Snowflake, HBase, Cassandra, and MongoDB.

●Extensive experience in agile software development methodology.

●Enthusiastic learner and excellent problem solver.

●Demonstrated expertise in managing the entire ETL lifecycle, including requirement gathering, data modeling, ETL design, development, testing, deployment, and maintenance using ODI.

●Strong expertise in troubleshooting and performance fine-tuning Spark, Map Reduce and Hive applications

●Good experience working with the Amazon EMR framework for processing data on EMR and EC2 instances

●Hands-on experience in developing Spark applications using RDD transformations, Spark Core, Spark Streaming, and Spark SQL

●Hands-on experience working with GCP services such as BigQuery, Cloud Storage (GCS), Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, gsutil, Dataproc, and Operations Suite (Stackdriver).

●Extensive experience in developing applications that perform data processing tasks using Teradata, Oracle, SQL Server, and MySQL databases

●Successfully led and executed numerous complex data integration projects utilizing ODI, involving large volumes of structured and unstructured data from diverse sources.

●Worked on data warehousing and ETL tools like Informatica, Tableau, and Qlik Replicate.

●Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure

●Acquaintance with Agile and Waterfall methodologies. Responsible for handling several client-facing meetings with strong communication skills
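
As a brief illustration of the PySpark UDF and Spark SQL work mentioned above, the following is a minimal, self-contained sketch; the column and view names (status_raw, orders_view) are hypothetical placeholders, not taken from any client project.

# Minimal PySpark UDF sketch: normalize free-text status codes and call the UDF from Spark SQL.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

def normalize_status(value):
    # Map messy upstream values to a small, consistent set of labels.
    if value is None:
        return "UNKNOWN"
    v = value.strip().upper()
    return {"OPN": "OPEN", "CLSD": "CLOSED"}.get(v, v)

# Register once for DataFrame use and once for Spark SQL use.
normalize_status_udf = udf(normalize_status, StringType())
spark.udf.register("normalize_status", normalize_status, StringType())

df = spark.createDataFrame([("opn",), (" clsd ",), (None,)], ["status_raw"])
df.withColumn("status", normalize_status_udf("status_raw")).show()

df.createOrReplaceTempView("orders_view")
spark.sql("SELECT status_raw, normalize_status(status_raw) AS status FROM orders_view").show()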

Technical Skills:

Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Zookeeper, Kafka, Cassandra, Apache Spark, Spark Streaming, HBase, Impala

Hadoop Distribution: Cloudera, Hortonworks, Apache

Languages: Java, SQL, PL/SQL, Python, Pig Latin, HiveQL, Scala, Regular Expressions, NoSQL

Web Technologies: HTML, CSS, JavaScript, XML, JSP, RESTful, SOAP

Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS

Portals/Application Servers: WebLogic, WebSphere Application Server, WebSphere Portal Server, JBoss, Tomcat

Build Automation Tools: SBT, Ant, Maven

Version Control: Git

IDE & Build Tools, Design: Eclipse, Visual Studio, JUnit, IntelliJ, PyCharm

Databases: MS SQL Server 2016/2014/2012, Azure SQL DB, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB

Cloud: Amazon Web Services (AWS), MS Azure, GCP, AWS Redshift

Professional Experience:

Senior Data Engineer

Truist Bank, Charlotte, NC November 2021 to Present

Responsibilities:

●Involved in the complete big data flow of the application, from data ingestion from upstream systems into HDFS through processing and analysing the data in HDFS

●Conducted quantitative analysis on cash products, ETD, and OTD to identify market trends, patterns, and opportunities for optimization.

●Developed and maintained reporting and analytics solutions to monitor trade lifecycle metrics, track trade processing KPIs, and generate insights for investment decision-making.

●Led a team of 10 data engineers in the successful implementation of a cloud-based data lake, resulting in a 30% increase in data processing efficiency.

●Experience in analysing data from Azure data storage services using Databricks, deriving insights with Spark cluster capabilities.

●Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, Databricks, PySpark, Spark SQL, and U-SQL (Azure Data Lake Analytics). Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.

●Developed scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity.

●Skilled in performance tuning and optimization of GraphQL APIs, including minimizing query complexity, reducing response times, and implementing caching strategies.

●Wrote, compiled, and executed programs as necessary using Apache Spark in Scala to perform ETL jobs with ingested data.

●Developed and deployed automated trade processing solutions leveraging data engineering tools and technologies such as Apache Spark, Python, and SQL.

●Developed Spark APIs to import data into HDFS from Teradata and created Hive tables

●Improved the performance of queries against tables in the enterprise data warehouse in Azure Synapse Analytics by using table partitions

●Involved in running Hive scripts through Hive, Impala, and Hive on Spark, and some through Spark SQL

●Expertise in managing logs through Kafka with Logstash

●Monitored and evaluated the performance of cash and derivative portfolios using statistical measures, ensuring alignment with investment objectives.

●Integrated SAP HANA with other data sources and systems, such as enterprise applications, data warehouses, and external data providers, to enable seamless data flow and interoperability.

●Developed Python scripts for data processing tasks, including cleaning, normalization, and validation, ensuring data quality.

●Utilized PL/SQL for Extract, Transform, Load (ETL) processes, ensuring seamless data flow between various systems and databases.

●Engineered data pipelines to ingest, cleanse, and transform trade data from multiple sources, enabling real-time or batch processing of trades for timely execution and settlement.

●Migrated data warehouses to Snowflake Data warehouse.

●Integrated Python with big data technologies such as Apache Spark or Hadoop for scalable and distributed data processing.

●Implemented Azure AD to enforce role-based access control (RBAC) policies, defining granular permissions and roles for data engineers, analysts, and administrators based on their responsibilities and requirements.

●Worked with big data technologies: Hadoop (MapReduce and Hive), Spark (SQL, Streaming), Azure Cosmos DB, SQL data warehouses, Azure DMS, Azure Data Factory, Athena, Lambda, Step Functions, and SQL.

●Implemented best practices for data governance, resulting in a 20% reduction in data errors.

●Implemented data ingestion pipelines using Snowpipe for seamless, automated loading of data into Snowflake, enhancing overall data processing efficiency.

●Used Teradata, Databricks, Azure Storage accounts, etc., for source stream extraction, cleansing, consumption, and publishing across multiple user bases.

●Proficient in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Flume, YARN, HBase, Sqoop, Spark, Storm, Kafka, Oozie, and Zookeeper.

●Collaborated with business stakeholders and subject matter experts to understand trade lifecycle workflows, identify data requirements, and develop scalable solutions to streamline trade processing operations.

●Extracted data from HDFS using Hive and Presto, performed data analysis using Spark with Scala and PySpark, carried out feature selection, and created nonparametric models in Spark.

●Developed and managed data models using dbt, defining relationships and hierarchies that reflect business logic for analytical purposes.

●Integrated Jest with continuous integration (CI) pipelines such as Jenkins or GitLab CI to automate the execution of test suites and ensure code quality throughout the development lifecycle.

●Designed and implemented automated data ingestion pipelines using Snowpipe, reducing manual intervention and improving data loading efficiency.

●Developed data mapping, data governance, transformation and cleansing rules involving OLTP and OLAP.

●Designed BI applications with MicroStrategy, Tableau, and Power BI.

●Oversaw end-to-end ETL pipeline development, ensuring data accuracy and timely delivery of projects.

●Transformed and copied data from JSON files stored in Data Lake Storage into an Azure Synapse Analytics table using Azure Databricks

●Enforced security measures within Datamarts, defining access controls and ensuring compliance with privacy regulations.

●Leveraged Azure AD reporting features such as Azure AD Sign-Ins and Azure AD Audit Logs to analyze user authentication trends, identify anomalies, and investigate security incidents in data engineering environments

●Created pipelines in ADF using linked services, datasets, and pipelines to extract, transform, and load data to and from different sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.

●Developed Spark applications using Pyspark and Spark-SQL for data extraction, transformation, and aggregation from multiple file formats for analyzing & transforming the data to uncover insights into the customer usage patterns.

●Created Databricks notebooks using Python (PySpark), Scala, and Spark SQL to transform data stored in Azure Data Lake Storage Gen2 from the raw zone to the stage and curated zones; a minimal PySpark sketch of this pattern appears at the end of this list.

●Deployed Apache Spark clusters on Kubernetes for distributed data processing.

●Managed dependencies between different dbt models, ensuring proper execution order and maintaining a logical flow in the data transformation pipeline.

●Implemented and maintained metadata management practices within Snowflake, ensuring comprehensive documentation of data structures, transformations, and lineage.

●Utilized Oracle 19c's in-memory capabilities to enable real-time analytics on large datasets.

●Strong Experience in Data Migration from RDBMS to Snowflake cloud data warehouse

●Performed data profiling and transformation on the raw data using Python.

●Worked with Azure BLOB and Data lake storage and loading data into Azure SQL Synapse Analytics (DW).

●Worked on the creation of custom Docker container images, tagging and pushing the images, and Docker consoles for maintaining the application life cycle.

●Implemented custom SQL transformations in dbt for complex calculations or transformations that go beyond standard dbt functionality.

●Experience building microservices and deploying them into Kubernetes cluster as well as Docker Swarm.

●Orchestrated a number of Sqoop and Hive scripts using Oozie workflows and scheduled them using the Oozie coordinator

●Implemented real-time data integration solutions using IICS to enable the seamless flow of data in near real-time, supporting timely decision-making and analytics.

●Implemented RDD/Dataset/DataFrame transformations in Scala through SparkContext and HiveContext

●Used Jira for bug tracking and Bitbucket to check in and check out code changes.

●Integrated dbt with popular data warehouses such as Snowflake, BigQuery, or Redshift to perform transformations directly in the data warehouse environment.

●Designed column families in Cassandra, ingested data from RDBMS, performed data transformations, and then exported the transformed data to Cassandra per the business requirements.

●Experienced in version control tools like GIT and ticket tracking platforms like JIRA.
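
The Databricks notebook bullet above refers to a raw-to-stage transformation pattern; the following is a minimal PySpark sketch of that pattern, assuming a Databricks workspace with ADLS Gen2 access already configured. The storage account, container names, and columns are hypothetical placeholders.

# Minimal raw-to-stage PySpark sketch for ADLS Gen2 (in Databricks, `spark` already exists).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("raw-to-stage-sketch").getOrCreate()

raw_path = "abfss://raw@examplestorageacct.dfs.core.windows.net/sales/"
stage_path = "abfss://stage@examplestorageacct.dfs.core.windows.net/sales/"

# Read raw JSON files, standardize column names, drop duplicates,
# and stamp each batch before writing to the stage zone as Parquet.
raw_df = spark.read.json(raw_path)

stage_df = (
    raw_df
    .select([F.col(c).alias(c.strip().lower().replace(" ", "_")) for c in raw_df.columns])
    .dropDuplicates()
    .withColumn("load_date", F.current_date())
)

stage_df.write.mode("overwrite").partitionBy("load_date").parquet(stage_path)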

Environment: HDFS, DevOps, NoSQL, YARN, MapReduce, Hive, Sqoop, Flume, Oozie, Qlik Replicate, Azure Data Factory, HBase, Kafka, Impala, Spark SQL, Spark Streaming, Eclipse, Jira, Scala, JSON, Oracle, Teradata, CI/CD, PL/SQL, UNIX shell scripting, Cloudera, MS Azure.

Senior Data Engineer

Ascena Retail Group, Patskala, Ohio May 2019 to October 2021

Responsibilities:

●Created S3 buckets, managed policies for S3 buckets, and utilized S3 buckets for storage and backup

●Extensive knowledge in migrating applications from internal data center to AWS

●Used JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive tables

●Led the design and implementation of data management solutions to support the entire trade lifecycle, including trade capture, validation, enrichment, and settlement processes.

●Utilized root cause analysis to identify bottlenecks and inefficiencies in data processing pipelines, databases, or ETL workflows, leading to performance optimizations and improved throughput.

●Designed and deployed personalized recommendation engines powered by Gen AI Analytics, enhancing user experiences and driving engagement through predictive content delivery and product suggestions.

●Involved in migrating ETL processes from Oracle to Hive to enable easier data manipulation.

●Implemented reprocessing of failed messages in Kafka using offset IDs

●Led cross-functional teams in the successful implementation of SAP HANA projects, delivering high-quality solutions on time and within budget.

●Increased data processing efficiency by 30% through the optimization of Java-based algorithms.

●Extensive experience in integrating ODI with various data sources, databases, applications, and third-party systems, including Oracle Database, SQL Server, SAP, Salesforce, and more.

●Utilized AWS Glue as a serverless data integration service, eliminating the need for infrastructure management and allowing seamless scaling based on workload demands.

●Designed and maintained data repositories and data models to store and manage investment data, including trade details, market data, reference data, and counterparty information.

●Integrated data warehouse seamlessly with business intelligence tools such as Tableau, MicroStrategy, or Power BI, providing users with intuitive dashboards and reports.

●Implemented robust security measures in IBM DB2 databases to ensure compliance with industry regulations.

●Implemented real-time data integration solutions using SSIS to capture and process streaming data, enabling timely insights and decision-making for operational requirements.

●Used HiveQL to analyze the partition and bucket data, Execute Hive Queries on Parquet tables stored in Hive to perform data analysis to meet the business specification logic

●Leveraged SAP BusinessObjects (SAP BOBJ) or SAP Analytics Cloud (SAC) for data visualization, dashboarding, and reporting purposes, providing actionable insights to stakeholders.

●Implemented and optimized data processing workflows on AWS EMR, leveraging Hadoop, Spark, and other big data technologies to handle large-scale data sets efficiently.

●Designed and implemented optimized data models for trade and transaction data within Investment Data Management, facilitating efficient storage, retrieval, and reporting capabilities.

●Led the implementation and management of Amazon Aurora as a key component of the data infrastructure, ensuring reliable and scalable relational databases.

●Developed Sqoop Jobs to load data from RDBMS to external systems like HDFS and HIVE

●Utilized Node.js to implement ETL processes for handling data extraction, transformation, and loading, particularly in scenarios where lightweight data processing is needed.

●Implemented data security measures and access controls in SAP HANA to ensure data confidentiality, integrity, and compliance with regulatory requirements.

●Implemented unit tests, integration tests, and end-to-end tests using Jest to validate data transformations, aggregations, and loading processes within the data ecosystem.

●Created custom Python scripts for specific data transformations, adapting to unique project requirements.

●Engineered complex data transformations and data cleansing routines within PL/SQL procedures for improved data quality.

●Implemented AWS Organization to centrally manage multiple AWS accounts including consolidated billing and policy-based restrictions.

●Implemented change management processes for Datamarts, accommodating modifications in business requirements while maintaining data integrity.

●Led the implementation of Databricks for big data processing, leveraging Spark clusters for scalable and distributed data transformation.

●Worked on converting dynamic XML data for ingestion into HDFS

●Developed data transformation scripts using PySpark and Spark SQL within AWS Glue, enabling complex transformations on large datasets; a minimal Glue-style PySpark sketch appears at the end of this list.

●Leveraged Amazon SQS for workload distribution and load balancing in data processing workflows, ensuring that tasks are efficiently distributed across multiple workers or processing nodes.

●Responsible for managing data coming from different sources and for implementing MongoDB to store and analyze unstructured data

●Implemented various Hive queries for analytics and called them from a Java client engine to run on different nodes. Worked on writing APIs to load the processed data into HBase tables.

●Involved in writing PySpark user-defined functions (UDFs) for various use cases and applied business logic wherever necessary in the process.

●Led the migration of on-premises big data processing infrastructure to the cloud, architecting and deploying scalable data processing solutions using AWS EMR, resulting in reduced operational costs and improved performance

●Developed and optimized data processing logic using PySpark and Spark SQL within Databricks notebooks, ensuring efficient and performant ETL processes.

●Responsible for architecting and implementing very large-scale data intelligence solutions around Snowflake Data Warehouse

●Wrote Spark SQL and Spark scripts (PySpark) in the Databricks environment to validate the monthly account-level customer data stored in S3.

●Leveraged Dbt to define and manage data models, transformations, and business logic in SQL-based code, promoting code reusability and maintainability.

●Experience in connecting Tableau to various data sources, including databases, data warehouses, and cloud platforms, for seamless data integration.

●Employed SSIS for data cleansing and transformation tasks, standardizing data formats, and enforcing quality standards to enhance data accuracy and reliability.

●Utilized pandas to merge and join datasets from different sources. This helped ensure that the integrated data was coherent and could be efficiently loaded into the target database.

●Incorporated GraphQL's input validation capabilities to ensure that only valid and expected data is processed, enhancing the overall quality and integrity of the data.

●Utilized Kubernetes to orchestrate complex data workflows, coordinating the execution of different tasks and dependencies within a distributed environment.

●Hands-on experience working with Amazon Web Services (AWS), using Elastic MapReduce (EMR), Redshift, and EC2 for data processing.

●Expertise in automating builds and deployment processes using Bash, Python and Shell scripts with focus on CI/CD, AWS Cloud Architecture.

●Implemented Spark scripts using Scala, Spark SQL, and NoSQL sources to load Hive tables into Spark for faster processing of data.

●Implemented error handling mechanisms and reconciliation processes to detect and resolve discrepancies in trade data, minimizing operational risks and ensuring data integrity throughout the trade lifecycle.

●Contributed to the design and development of data warehousing solutions, using PL/SQL to create and maintain data warehouse structures.

●Integrated Amazon SNS and SQS with AWS Lambda functions to enable serverless data processing workflows, allowing for automatic scaling and cost-effective execution based on demand.

●Integrated Software as a Service (SaaS) applications seamlessly using IICS, enabling the flow of data between cloud-based applications and on-premises systems.

●Collaborated with business units to define data models and ETL strategies specific to Dynamics AX, ensuring the integration of this critical business data into the broader analytics framework.

●Developed applications for real-time data streaming using Node.js, such as handling data from IoT devices or other streaming sources.

●Leveraged the Spark operator or native Kubernetes support for managing Spark applications.

●Regularly reviewed and updated business continuity plans to align with evolving business requirements, technological advancements, and potential risks.

●Responsible for converting row-oriented regular Hive external tables into columnar, Snappy-compressed Parquet tables with key-value pairs

●Loaded the data into Spark RDDs and performed in-memory computation to generate the output response.
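
As a brief illustration of the AWS Glue PySpark transformation work referenced above, here is a minimal sketch of a Glue-style job script; the catalog database, table, and S3 output path are hypothetical placeholders, not real resources.

# Minimal AWS Glue-style PySpark job sketch.
import sys
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a cataloged source table, apply a simple transformation with Spark SQL
# functions, and write the result back to S3 as Parquet.
source_dyf = glue_context.create_dynamic_frame.from_catalog(
    database="example_db", table_name="example_orders"
)
df = source_dyf.toDF()

transformed_df = (
    df.filter(F.col("order_status") == "COMPLETED")
      .withColumn("order_year", F.year(F.col("order_date")))
)

output_dyf = DynamicFrame.fromDF(transformed_df, glue_context, "output_dyf")
glue_context.write_dynamic_frame.from_options(
    frame=output_dyf,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/orders/"},
    format="parquet",
)
job.commit()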

Environment: Spark SQL, HDFS, Hive, Pig, DevOps, Apache Sqoop, Java (JDK SE 6, 7), Scala, shell scripting, Linux, MySQL, Oracle Enterprise DB, PostgreSQL, IntelliJ, CI/CD, Oracle, Subversion, Control-M, Teradata, Agile methodologies, AWS, AWS Redshift.

Data Engineer

Mayo Clinic, Rochester, MN November 2017 to April 2019

Responsibilities:

●Developed Spark applications using Spark - SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats for analyzing & transforming the data to uncover insights into the customer usage patterns.

●Involved in building and architecting multiple Data pipelines, end to end ETL and ELT process for Data ingestion and transformation in GCP and coordinating tasks among the team.

●Built, created, and configured enterprise-level Snowflake environments; maintained, implemented, and monitored Snowflake environments.

●Proficient in utilizing Databricks Unified Analytics Platform for large-scale data processing and analytics tasks.

●Ensured compliance with healthcare regulations such as HIPAA (Health Insurance Portability and Accountability Act) by implementing data security measures, access controls, and data anonymization techniques to safeguard patient privacy and confidentiality.

●Developed Spark programs using Scala to compare the performance of Spark with Hive and SparkSQL.

●Designed and optimized data warehouses on GCP using BigQuery, facilitating efficient analytics and reporting.

●Executed advanced SQL queries in GCP BigQuery, enhancing data processing capabilities.

●Developed a detailed project plan and helped manage the data conversion and migration from the legacy system to the target Snowflake database.

●Proficient in utilizing SSIS for designing, developing, and implementing robust ETL processes to extract data from heterogeneous sources, transform it based on business rules or requirements, and load it into target databases or data warehouses efficiently.

●Optimized SQL queries and performance tuning techniques to enhance the speed and efficiency of data retrieval for Medicare Advantage claims analysis and reporting.

●Designed and implemented data pipelines to ingest, process, and analyze healthcare claims data from multiple sources, ensuring data accuracy, integrity, and compliance with industry standards

●Experienced in integrating GraphQL APIs with data lake technologies such as Apache Hadoop, Apache Spark, and Amazon S3 to streamline data ingestion, processing, and retrieval processes.

●Collaborated with SAP functional teams to understand business requirements and translate them into technical solutions using SAP technologies.

●Leveraged PL/SQL for complex data analysis tasks, supporting the generation of insightful reports and analytics.

●Collaborated with clinicians and healthcare professionals to design and implement data solutions that provide real-time clinical decision support, leveraging predictive analytics and machine learning models to identify potential health risks and optimize treatment strategies.

●Developed and implemented IAM policies to manage access controls and permissions for data engineering infrastructure, ensuring least privilege access.

●Proficient in optimizing ETL processes for performance and scalability using ODI performance tuning techniques, parallel processing, and efficient data loading strategies.

●Developed scalable data architectures and infrastructure solutions to support the growing volume and complexity of healthcare data, optimizing performance and resource utilization for data processing tasks.

●Created Kafka applications that monitor consumer lag within Apache Kafka clusters.

●Responsible for importing real-time data from source systems into Kafka clusters.

●Introduced performance-enhancing techniques, such as bulk processing, to optimize PL/SQL code execution.

●Developed a PySpark script to mask raw data by applying hashing algorithms to client-specified columns; a minimal sketch of this appears at the end of this list.

●Used Spark Streaming APIs to perform the necessary transformations and actions on the data received from Kafka.

●Involved in migrating the on-premises Hadoop system to GCP (Google Cloud Platform).

●Migrated MapReduce jobs into Spark jobs and used Spark SQL and the DataFrames API to load structured data into Spark clusters.

●Worked on DirectQuery in Power BI to compare legacy data with current data, and generated reports and dashboards.

●Built multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP and coordinated tasks among the team.

●Implemented Python scripts for real-time data processing, handling streaming data from sources like Kafka or Apache Flink.

●Utilized Spring Boot's lightweight architecture to create scalable and modular data processing components.

●Worked on analyzing Hadoop clusters and different big data analytic tools including Pig, Hive.

●Created Kinesis Data Streams, Kinesis Data Firehose, and Kinesis Data Analytics applications to capture and process streaming data and then output the results into S3 and DynamoDB.

●Used the Cloud Shell SDK in GCP to configure services such as Dataproc, Storage, and BigQuery.

●Migrated previously written cron jobs to Airflow/Composer in Google Cloud Platform.

●Led client operations involving data integration, ensuring seamless communication and collaboration between client requirements and data engineering solutions.

●Developed complex SQL queries, stored procedures, and functions in SAP HANA to perform data transformations, aggregations, and calculations.

●Deployed Dbt models and transformations to production environments using version control systems and continuous integration/continuous deployment (CI/CD) pipelines, enabling seamless and reliable deployment of data artifacts.

●Supported existing GCP data management implementations.

●Utilized Spring Boot's logging capabilities for effective troubleshooting and performance analysis.

●Created GCP BigQuery authorized views for row-level security and for exposing data to other teams.

●Used ETL to implement Slowly Changing Dimension transformations to maintain historical data in the data warehouse.

●Performed ETL testing activities such as running jobs, extracting data from the databases with the necessary queries, transforming it, and uploading it into the data warehouse servers.

●Utilized GitHub Actions for managing dependencies
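
The column-hashing bullet above refers to column-level masking; the following is a minimal PySpark sketch of that idea, where the input and output paths and the list of client-specified columns are hypothetical placeholders.

# Minimal PySpark sketch for hashing client-specified columns in a raw dataset.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("column-hashing-sketch").getOrCreate()

raw_df = spark.read.parquet("gs://example-bucket/raw/claims/")

# Columns designated by the client to be masked before downstream use.
columns_to_hash = ["member_id", "ssn", "email"]

masked_df = raw_df
for col_name in columns_to_hash:
    # SHA-256 hash of the column value; null inputs remain null.
    masked_df = masked_df.withColumn(col_name, F.sha2(F.col(col_name).cast("string"), 256))

masked_df.write.mode("overwrite").parquet("gs://example-bucket/masked/claims/")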


