SUSHMA REDDY
Houston, TX
********************@*****.***
PROFESSIONAL SUMMARY
Data Engineer with over 3 years of experience building, testing, integrating, managing, and optimizing data from multiple sources.
• Expert in designing and implementing end-to-end ETL processes that make effective use of technologies such as Snowflake, SnowSQL, and Snowpipe for data loading and transformation.
• Extensive experience with data warehousing principles, including OLTP, OLAP, dimensions, facts, and data modeling, ensuring optimal performance and scalability.
• Practical experience configuring Snowflake integration with MS Azure, including the optimizer, metadata management, and job scheduling for efficient data processing.
• Expert in creating and managing data integration workflows using SSIS, with MS SQL and T-SQL for database querying and manipulation.
• Comfortable using AWS services such as Glue, Redshift, RDS, S3, Data Pipeline, AppSync, and Lambda for seamless data processing.
• Strong understanding of ETL processes, data governance, and data modeling principles, ensuring data accuracy, consistency, and compliance with regulatory requirements.
• Expertise in complex data processing and analytics using Azure SQL Server, Databricks (including the Databricks API and row-level security), and Azure Data Lake Storage Gen2.
• Experienced in designing and developing relational and non-relational databases, data warehouses, and data lakes on platforms such as SQL Server, PostgreSQL, MySQL, AWS Aurora, AWS RDS, AWS DynamoDB, Azure SQL, Azure Cosmos DB, AWS Redshift, AWS Lake Formation, and Azure Synapse Analytics.
• Expert in using tools like Tableau, Power BI, and Apache Superset to streamline data entry, exploration, and visualization, producing actionable insights that support decision-making.
• Well-versed in CI/CD practices for automating deployments and ensuring software quality and reliability.
• Proficient in Python, Java, Scala, and SQL for data analysis, transformation, and manipulation.
TECHNICAL SKILLS
SQL Server / Azure SQL Database
MySQL, Oracle
Azure Synapse Analytics
Performance Tuning & Optimization
Data modeling, Data Processing, OLTP/OLAP
Azure Data Factory / ETL / ELT / SSIS
Azure Data Lake Storage
Azure Databricks
Azure Storage / Azure Synapse
SSRS / Power BI / Tableau
Apache Spark
Python, SQL, Scala
Methodology: Waterfall model
PROFESSIONAL EXPERIENCE
Baylor Scott & White Health, Texas Oct 2023 – Present
Azure Data Engineer
Designed and implemented data models for various business units using Power BI, resulting in improved data accuracy and accessibility.
Wrote complex Hive, HBase, and DataStage queries to load and process data in the Hadoop File System, and performed performance tuning.
Loaded data into Spark DataFrames and used Spark SQL to explore data insights.
Read files from the source (S3) and made code changes in job modules and their dependent tables per business-user needs.
Led the development of a centralized data warehouse using SAS, consolidating data from multiple sources for unified reporting.
Optimized ETL processes in SAS to improve data processing efficiency and reduce load times by 25%.
Designed and implemented data pipelines using Azure Synapse Pipelines, improving data ingestion and transformation efficiency.
Optimized data warehousing solutions on Azure Synapse, leading to a [specific percentage] improvement in query performance and cost reduction.
Developed and executed complex T-SQL queries to analyze and manage large datasets within Synapse.
Leveraged Apache Spark within Synapse to process and analyze big data, enhancing data processing speed and accuracy.
Managed and monitored data workflows using Synapse Studio, ensuring seamless data integration and high data quality.
Designed and implemented data solutions using OCI services like Oracle Autonomous Data Warehouse, Oracle Big Data Service, Oracle Data Integration, and Oracle Analytics Cloud.
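The Spark DataFrame and Spark SQL exploration described above can be sketched roughly as follows (a minimal, hypothetical example: the file path, table name, and columns are illustrative, not from an actual engagement):

```python
def top_n_query(table: str, metric: str, n: int = 10) -> str:
    """Build a simple exploratory Spark SQL aggregation query."""
    return (
        f"SELECT {metric}, COUNT(*) AS cnt FROM {table} "
        f"GROUP BY {metric} ORDER BY cnt DESC LIMIT {n}"
    )

if __name__ == "__main__":
    # PySpark is only needed when actually running on a cluster.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("explore").getOrCreate()
    # Load a source file into a DataFrame and register it for SQL access.
    df = spark.read.option("header", True).csv("s3://bucket/encounters.csv")  # hypothetical path
    df.createOrReplaceTempView("encounters")
    spark.sql(top_n_query("encounters", "dept")).show()
```

Registering the DataFrame as a temp view keeps exploration in plain SQL while the heavy lifting stays in Spark.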
Goldman Sachs, Texas Aug 2022 – Oct 2023
Azure Data Engineer
Designed and implemented data storage solutions using Azure services such as Azure SQL Database, Azure Cosmos DB, and Azure Data Lake Storage.
Developed PySpark scripts to ingest data from source systems such as Azure Event Hub into Delta tables in Databricks using reload, append, and merge modes.
Gained familiarity with Azure Data Explorer (ADX) for real-time analytics and monitoring of streaming data sources in Azure environments.
Optimized performance on both Teradata and Snowflake platforms as an SME, applying advanced techniques such as query tuning, indexing strategies, and resource allocation to improve query performance and reduce latency.
Wrote PySpark and Spark SQL transformations in Azure Databricks to implement complex business rules.
Implemented data integration pipelines between Snowflake and various source systems, orchestrating data ingestion and transformation processes for real-time and batch processing scenarios.
Performed data modeling, indexing, and tuning in Db2 to support business analytics and reporting needs.
Integrated Cognos with various data sources to create comprehensive business intelligence solutions.
Deployed and managed data solutions on IBM Cloud, ensuring scalability and reliability for cloud-based applications.
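The merge-mode Delta ingestion described above can be sketched as a MERGE upsert (a minimal illustration, assuming an incoming micro-batch registered as a view; table, view, and key names are hypothetical):

```python
def build_merge_sql(target: str, source_view: str, key: str) -> str:
    """Compose a Delta Lake MERGE statement for upsert-style ("merge mode") loads."""
    return (
        f"MERGE INTO {target} AS t USING {source_view} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

if __name__ == "__main__":
    # PySpark/Databricks is only needed at runtime; the path below is hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    # Incoming micro-batch (e.g. parsed Event Hub payloads) registered as a view.
    updates = spark.read.json("/mnt/raw/eventhub/batch.json")
    updates.createOrReplaceTempView("updates")
    spark.sql(build_merge_sql("delta_db.customers", "updates", "customer_id"))
```

MERGE matches on the business key, so re-delivered events update existing rows instead of creating duplicates.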
Capgemini, India July 2020 – July 2021
Azure Data Engineer
Leveraged Databricks extensively in conjunction with Azure Data Factory (ADF) to process large volumes of data efficiently.
Executed ETL operations within Azure Databricks, establishing connections to diverse relational database source systems through JDBC connectors.
Developed Python scripts within Databricks for file validations and orchestrated automation of these processes using ADF.
Engineered an automated Azure cloud process for daily data ingestion from web services, seamlessly loading it into Azure Data Lake Gen2.
Conducted data analysis directly within its residing environment by mounting Azure Data Lake and Blob storage to Databricks.
Employed Logic Apps to facilitate decision-making actions within the workflow.
Engineered custom alerts utilizing Azure Data Factory, Azure SQL DB, and Logic Apps.
Created and maintained documentation in Confluence.
Used Git and Bitbucket for version control and collaborative software development.
Constructed Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting.
Formulated intricate SQL queries involving stored procedures, common table expressions (CTEs), and temporary tables to underpin Power BI reports.
Collaborated with the enterprise Data Modeling team to craft Logical models.
Demonstrated proficiency in Microsoft Azure, furnishing data movement and scheduling functionalities for cloud-based technologies like Azure Blob Storage and Azure SQL Database.
Crafted JSON Scripts for deploying Pipelines in Azure Data Factory (ADF) to process data efficiently.
Independently managed the development of ETL processes from inception to delivery.
Engineered Pipelines in ADF utilizing Linked Services, Datasets, and Pipelines to Extract, Transform, and Load data from various sources such as Azure SQL, Blob storage, Azure SQL Data Warehouse, and write-back tools.
Migrated on-premises data (Oracle/SQL Server/DB2/MongoDB) to Azure Data Lake Store (ADLS) utilizing Azure Data Factory (ADF V1/V2).
Designed and deployed numerous ETL workflows via Azure Data Factory (ADF) and SSIS packages, facilitating the extraction, transformation, and loading of data from SQL Server databases, Excel, and flat file sources into Data Warehouses.
Proficient with Oracle versions 11gR2, 12c, and 19c in a Linux environment.
Experienced in Oracle Real Application Cluster (RAC), Automatic Storage Management (ASM), Active Data Guard, Oracle Enterprise Manager (OEM), and Golden Gate replication.
Automated workflows and processes using various tools and scripts.
Collaborated with cross-functional teams to deliver robust data solutions.
Recreated existing application logic and functionality within the Azure Data Lake, Data Factory, SQL Database, and SQL Data Warehouse environment.
Created Notebooks in Azure Databricks and seamlessly integrated them with ADF to automate processes efficiently.
Utilized Azure Data Factory to orchestrate Databricks data preparation processes and subsequently load them into SQL Data Warehouse.
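The Databricks file-validation scripts mentioned above amount to checks like the following (a minimal pure-Python sketch; the required-column set is illustrative, and in practice ADF would trigger the notebook that runs these checks):

```python
import csv
import io

# Hypothetical required schema for an incoming delimited file.
REQUIRED_COLUMNS = {"id", "event_date", "amount"}

def validate_header(text: str) -> list:
    """Return a list of validation errors for a delimited file's header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        return ["file is empty"]
    present = {h.strip().lower() for h in header}
    missing = REQUIRED_COLUMNS - present
    return [f"missing column: {c}" for c in sorted(missing)]
```

An empty error list means the file passes and the downstream load can proceed; otherwise the orchestrator can route the file to a quarantine path.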
EDUCATION
Master’s in Computer Information Systems, Lindsey Wilson College.