Sr. Data Engineer
Name: Sravani M
Email: ************@*****.***
Phone: 928-***-****
Professional Summary:
Seasoned Data Engineer with over 10 years of experience in designing, building, and maintaining scalable and robust data solutions across GCP, AWS, and Azure platforms.
Expertise in big data technologies including Hadoop, Spark, and EMR; proficient in deploying large-scale data processing pipelines that improve data accessibility and utility (a representative PySpark sketch follows this summary).
Skilled in cloud services such as AWS Redshift, S3, and GCP BigQuery, with a strong track record of optimizing storage and enhancing data retrieval processes for business intelligence.
Proficient in data integration tools such as Informatica PowerCenter 8.1 and Talend, specializing in complex ETL processes and data workflow automation.
Advanced knowledge of database management using SQL Server, Oracle 11g, and Azure SQL, ensuring data integrity and performance in high-volume environments.
Experienced in data modeling and schema design with tools such as Erwin Data Modeler and Excel, providing a solid foundation for analytics and reporting.
Developed and managed real-time data streaming applications using technologies like Spark Streaming, Kinesis, and Cloud Pub/Sub, enhancing operational efficiency and decision-making speed.
Implemented comprehensive data security measures using Azure AD, Key Vault, and SSH, safeguarding sensitive information in compliance with industry standards.
Utilized modern data visualization tools such as Power BI and Tableau, delivering compelling insights that drive business strategies and outcomes.
Collaborative team leader and mentor, training teams in best practices for data handling, scripting in Python and shell, and using Oracle SQL Developer for optimal results.
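For illustration, a minimal PySpark batch transformation of the kind summarized above; it is a sketch only, and the storage paths, column names, and aggregation logic are hypothetical placeholders rather than details from any engagement listed below.

    # Representative PySpark rollup (illustrative; all names are placeholders).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily-sales-rollup").getOrCreate()

    # Read raw CSV files landed in object storage.
    sales = spark.read.option("header", True).csv("s3://example-bucket/raw/sales/")

    # Cast, aggregate per store and day, and write a partitioned Parquet rollup.
    daily = (sales
             .withColumn("amount", F.col("amount").cast("double"))
             .groupBy("store_id", "sale_date")
             .agg(F.sum("amount").alias("total_amount"),
                  F.count("*").alias("txn_count")))

    daily.write.mode("overwrite").partitionBy("sale_date").parquet(
        "s3://example-bucket/curated/daily_sales/")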
Technical Skills:
Google Cloud Platform (GCP): GCS, GCP BigQuery, GCP Dataprep, GCP Dataflow, Cloud Composer, Cloud Pub/Sub, Cloud Storage Transfer Service, Cloud Spanner, Cloud SQL, Data Catalog, GCP Databricks, GCP Dataproc, PySpark
Amazon Web Services (AWS): AWS Redshift, AWS S3, AWS Data Pipelines, Snowflake, EC2, RDS, DynamoDB, AWS Glue, EMR, Kinesis
Azure: Azure VMs, ACR, Azure Function App, Azure WebApp, Azure SQL, Azure SQL MI, Azure DevOps, Azure AD, Azure Service Bus, Cosmos DB, Log Analytics, AKS, Event Hub, Key Vault, App Insights
Data Analysis & ETL: SQL, Hive, Sqoop, Teradata, Hadoop, Spark, Spark Streaming, Scala, Informatica PowerCenter 8.1, Talend, Fivetran, Control-M, Oozie
Database Management: SQL Database, SQL Server, Oracle 12c, DynamoDB, Erwin Data Modeler
Scripting & Development: Python, Shell scripting, YAML, Git, Maven, Jira, Jenkins, Ansible
Data Visualization & Reporting: Power BI, Tableau, Excel
Security & Operations: SSH, Azure AD, Log Analytics, Key Vault, Red Hat Linux
Tools & Utilities: Toad for Oracle, SQL Developer
Professional Experience
Truist Bank, Charlotte, NC
GCP Data Engineer March 2023 to Present
Responsibilities:
Implemented GCP BigQuery solutions for complex data analysis and reporting, enhancing data-driven decision-making processes.
Engineered robust data pipelines using GCP Dataflow and Cloud Pub/Sub, streamlining data ingestion and real-time analytics (see the pipeline sketch at the end of this section).
Developed and maintained scalable data architectures, utilizing GCP technologies including GCS and Cloud Spanner, ensuring high availability and performance.
Configured and managed GCP Databricks environments to perform complex data processing tasks, significantly reducing processing times.
Optimized data storage and retrieval by integrating Cloud SQL and Snowflake, improving query performance and system scalability.
Designed data integration solutions with GCP Dataprep and Dataflow, automating data cleansing and transformation processes.
Led migration projects from on-premises Hadoop clusters to GCP, leveraging GCP DataProc for seamless data transfer and integration.
Utilized Python and PySpark for scripting and data manipulation tasks, enhancing automation and reducing manual intervention.
Created data visualization tools using Power BI, providing actionable insights through dynamic dashboards and reports.
Administered Cloud Storage Transfer Service to efficiently manage cross-platform data synchronization and backups.
Designed and implemented security measures for data protection using GCP’s Data Catalog and IAM policies, ensuring compliance with financial regulations.
Automated workflows with Cloud Composer, optimizing scheduling and management of batch and streaming data jobs.
Conducted performance tuning of SQL databases and GCP environments, ensuring optimal use of resources and faster data access.
Developed custom ETL frameworks using Sqoop and SAS, facilitating efficient data extraction, transformation, and loading.
Collaborated with cross-functional teams to identify and resolve data integration issues, improving data quality and operational efficiency.
Documented all data engineering processes and updates in project documentation, maintaining clear and accurate records for compliance and auditing purposes.
Environment: GCP, GCS, GCP BigQuery, GCP Dataprep, GCP Dataflow, Cloud Composer, Cloud Pub/Sub, Cloud Storage Transfer Service, Cloud Spanner, Cloud SQL, Data Catalog, GCP Databricks, GCP Dataproc, PySpark, SAS, Hive, Sqoop, Teradata, Hadoop, Python, Snowflake, Power BI, SQL Database.
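To make the Dataflow/Pub/Sub bullet above concrete, here is a minimal Apache Beam sketch of a Pub/Sub-to-BigQuery streaming pipeline; the project, topic, table, and bucket names are hypothetical placeholders, not values from the Truist engagement.

    # Pub/Sub -> Dataflow (Apache Beam) -> BigQuery streaming sketch.
    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # All resource names below are illustrative placeholders.
    options = PipelineOptions(
        streaming=True, runner="DataflowRunner",
        project="example-project", region="us-east1",
        temp_location="gs://example-bucket/tmp")

    with beam.Pipeline(options=options) as p:
        (p
         | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
               topic="projects/example-project/topics/txn-events")
         | "ParseJson" >> beam.Map(lambda m: json.loads(m.decode("utf-8")))
         | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
               "example-project:analytics.txn_events",
               create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))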
Ascena Retail Group, Pataskala, Ohio
AWS Data Engineer June 2020 to February 2023
Responsibilities:
Configured AWS Data Pipelines for automated data transfer and transformation, significantly reducing manual efforts and improving data flow efficiency.
Integrated Snowflake with AWS services to leverage cloud scalability and performance, improving data querying and reporting functionality.
Managed big data processing using Hadoop YARN, ensuring efficient resource allocation and job scheduling across the cluster.
Designed and maintained SQL Server databases, optimizing structures and indexes for improved query performance and system responsiveness.
Implemented real-time data processing using Spark Streaming and Kinesis, facilitating instant data analysis and decision-making.
Engineered data warehousing solutions on AWS Redshift, optimizing data aggregation and retrieval processes, enhancing analytical capabilities.
Developed and managed AWS S3 buckets for scalable data storage, implementing security measures to ensure data integrity and accessibility.
Automated data processes using AWS Glue, simplifying ETL operations and maintaining high data quality through continuous synchronization (a representative Glue job sketch follows this section).
Developed applications in Scala on Spark, enhancing data processing capabilities and providing robust back-end solutions.
Administered AWS EC2 instances, configuring and optimizing computing resources to meet project demands and cost efficiency.
Optimized database management on AWS RDS and DynamoDB, ensuring high availability and performance for transactional data.
Managed Oracle 12c databases, implementing best practices for database design and performance tuning.
Scripted automation tools in Python, improving operational workflows and reducing downtime through effective maintenance scripts.
Utilized Hive and Sqoop for data manipulation and integration, enhancing data accessibility across Hadoop and relational databases.
Developed data visualization solutions using Tableau, providing compelling insights through interactive dashboards and reports.
Implemented data integration projects using Talend and Informatica, aligning data sources with business needs to support strategic initiatives.
Environment: AWS Redshift, AWS S3, AWS Data Pipelines, Snowflake, Hadoop YARN, SQL Server, Spark, Spark Streaming, Scala, Kinesis, EC2, RDS, DynamoDB, Oracle 12c, AWS Glue, Python, Hive, Linux, Sqoop, Informatica, Tableau, Talend, Cassandra, Oozie, Control-M, Fivetran, EMR.
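As a companion to the AWS Glue bullet above, a simplified Glue ETL job in the standard Python shape; the catalog database, table, column mappings, and S3 path are hypothetical placeholders.

    # Simplified AWS Glue ETL job (illustrative; all names are placeholders).
    import sys
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read raw order data registered in the Glue Data Catalog.
    orders = glue_context.create_dynamic_frame.from_catalog(
        database="example_raw", table_name="orders")

    # Keep and retype only the columns needed downstream.
    mapped = ApplyMapping.apply(
        frame=orders,
        mappings=[("order_id", "string", "order_id", "string"),
                  ("order_ts", "string", "order_ts", "timestamp"),
                  ("amount", "double", "amount", "double")])

    # Land curated Parquet in S3 for Redshift to consume.
    glue_context.write_dynamic_frame.from_options(
        frame=mapped, connection_type="s3",
        connection_options={"path": "s3://example-bucket/curated/orders/"},
        format="parquet")

    job.commit()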
Mayo Clinic, Rochester, MN
Azure Data Engineer February 2018 to May 2020
Responsibilities:
Deployed Azure VMs for scalable healthcare data processing applications, enhancing system reliability and performance.
Configured Azure Container Registry (ACR) to manage and scale Docker container images, improving deployment processes and version control.
Developed Azure Function Apps to automate data processing tasks, reducing processing time and improving system responsiveness.
Managed Azure WebApps for hosting patient data dashboards, ensuring high availability and secure data access.
Administered Azure SQL and Azure SQL Managed Instance, optimizing database performance and security for sensitive healthcare data.
Implemented secure SSH configurations to ensure safe data transfers and system access, adhering to healthcare compliance standards.
Authored YAML scripts for configuration management, facilitating consistent deployments across development and production environments.
Maintained WebLogic servers, configuring and tuning environments to support enterprise healthcare applications.
Utilized Python for data scripting and automation, enhancing data workflows and analytical processes.
Integrated Azure DevOps pipelines for CI/CD, streamlining code deployment and quality assurance.
Managed source control with Git, ensuring version control and collaboration across distributed development teams.
Configured Jenkins and Maven for build and release management, automating code compilation and deployment processes.
Implemented Ansible for automation of configuration and orchestration tasks, reducing manual setup and maintenance.
Secured applications using Azure AD and Key Vault, safeguarding sensitive data and managing secrets efficiently (illustrated in the sketch following this section).
Monitored applications using Azure App Insights and Log Analytics, providing real-time analytics to track performance and system health.
Architected solutions using Azure Service Bus and Event Hub, facilitating reliable message queuing and real-time data streaming for operational insights.
Environment: Azure VMs, ACR, Azure Function App, Azure WebApp, Azure SQL, Azure SQL MI, SSH, YAML, WebLogic, Python, Azure DevOps, Git, Maven, Jira, Red Hat Linux, Jenkins, Ansible, Shell Scripting, Azure AD, Azure Service Bus, Cosmos DB, Log Analytics, AKS, Event Hub, Key Vault, App Insights.
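As a sketch of the Azure AD / Key Vault pattern referenced above, here is a timer-triggered Azure Function in the Python v2 programming model that pulls a connection string from Key Vault before doing its work; the vault URL, secret name, and schedule are hypothetical placeholders.

    # Timer-triggered Azure Function reading a secret from Key Vault (sketch).
    import azure.functions as func
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    app = func.FunctionApp()

    @app.timer_trigger(arg_name="timer", schedule="0 0 * * * *")  # hourly
    def refresh_dashboard_data(timer: func.TimerRequest) -> None:
        # Uses the managed identity in Azure (or developer credentials locally).
        credential = DefaultAzureCredential()
        vault = SecretClient(
            vault_url="https://example-vault.vault.azure.net",
            credential=credential)
        conn_str = vault.get_secret("sql-connection-string").value
        # ... connect to Azure SQL with conn_str and refresh the dataset ...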
OJAS Innovative Technologies
Data Analyst May 2015 to October 2017
Responsibilities:
Developed complex SQL queries using Toad for Oracle, enhancing data retrieval processes and supporting advanced data analysis.
Automated repetitive data processes with UNIX shell scripting, improving efficiency and accuracy of daily data operations.
Managed database schema and data integrity using Erwin Data Modeler, ensuring alignment with business requirements and data standards.
Created detailed reports and dashboards in Excel, providing actionable insights through advanced data visualization techniques.
Optimized data transformations using Informatica PowerCenter 8.1, streamlining ETL processes and enhancing data warehouse reliability.
Administered Oracle 11g databases, performing maintenance and updates to support business continuity and data accuracy.
Developed and maintained documentation for database designs and changes, ensuring clarity and compliance with software development standards.
Collaborated with development teams to integrate database changes into application development, enhancing data-driven functionalities.
Conducted data quality audits, identifying and correcting discrepancies, thus ensuring the reliability of business reports (a simplified audit sketch follows this section).
Trained junior analysts on SQL Developer and SQL best practices, fostering a knowledgeable and efficient team environment.
Environment: Toad for Oracle, SQL, UNIX, Shell scripting, SQL Developer, Erwin Data Modeler, Excel, Informatica PowerCenter 8.1, Oracle 11g.
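To illustrate the data-quality audits mentioned above, a simplified check written in Python (the scripting language used elsewhere in this resume) against Oracle via cx_Oracle, in the spirit of the SQL and shell work described in this role; the credentials, table, and audit rules are hypothetical placeholders.

    # Hypothetical daily data-quality audit against Oracle (illustrative only).
    import cx_Oracle

    # Placeholder credentials and DSN.
    conn = cx_Oracle.connect("audit_user", "password", "db-host/ORCL")
    cur = conn.cursor()

    # Example check: flag rows with missing keys or negative amounts.
    cur.execute("""
        SELECT COUNT(*) FROM orders
        WHERE order_id IS NULL OR amount < 0
    """)
    bad_rows = cur.fetchone()[0]
    print(f"orders failing audit checks: {bad_rows}")

    cur.close()
    conn.close()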