GAYATHRI M
Email: Mobile: +*
PROFESSIONAL SUMMARY
●5+ years of experience as a Data Engineer specializing in building scalable data pipelines, architectures, and analytics solutions across the healthcare, finance, and insurance industries.
●Proficient in AWS and Azure, designing cloud-based solutions with tools such as Glue, Redshift, Lambda, Athena, Data Factory, and Synapse to streamline data processing and enable real-time analytics.
●Expert in ETL pipeline design using AWS Glue, Azure Data Factory, Databricks, and PySpark, optimizing data flow, ensuring quality, and improving integration efficiency.
●Skilled in Big Data and streaming technologies such as Hadoop, Spark, Kafka, and Kinesis, processing large datasets and enabling real-time insights for critical business applications.
●Experienced in database management and optimization, with a focus on data modeling and performance tuning across Redshift, SQL Server, and Azure SQL to enhance performance and scalability.
●Strong background in machine learning integration, using AWS SageMaker and Azure ML to build predictive models that inform business decisions and improve operational efficiency.
TECHNICAL SKILLS
Cloud Platforms
AWS (S3, Redshift, Lambda, Glue, Athena, Kinesis, CloudWatch, Step Functions, EC2, RDS, SageMaker, IAM, EMR, VPC, EKS, QuickSight), Azure (Data Factory, Synapse Analytics, SQL, ML, Blob Storage, Functions, Active Directory, VM, VNet, AKS, Event Hubs, Data Lake, Azure DevOps)
Data Engineering & ETL Tools
AWS Glue, Azure Data Factory, SSIS, SSAS, SSRS, Databricks, PySpark, DBT, Hadoop, Hive, Spark, Change Data Capture (CDC)
Big Data & Streaming
Hadoop, Spark, Kafka, Kinesis, Azure Event Hubs, Azure Stream Analytics
Database & Data Warehousing
AWS Redshift, Azure SQL, SQL Server, NoSQL (DynamoDB), Data Lakes, Data Warehouses, Data Models (Star, Snowflake), OLAP Cubes
Machine Learning & AI
AWS SageMaker, Azure ML, Databricks, TensorFlow, Scikit-learn, PyTorch
CI/CD & DevOps
Jenkins, Git, Docker, Azure DevOps, AWS CodePipeline, GitHub Actions, Terraform, CloudFormation
Security & Compliance
AWS IAM, Azure Active Directory, VPC, Encryption (at Rest & in Transit), SOC 2, HIPAA, Data Masking, Data Security Policies
Business Intelligence & Reporting
Tableau, Power BI, AWS QuickSight, Power BI Embedded
Monitoring & Troubleshooting
AWS CloudWatch, Azure Monitor, Datadog, Sumo Logic, CloudTrail, Log Analytics
API Design & Integration
RESTful APIs, GraphQL, API Management, API Gateways, Lambda Functions, API Security (OAuth)
PROFESSIONAL EXPERIENCE
Sun Life | Boston, MA
Data Engineer | May 2024 - Present
●Designed and implemented scalable ETL solutions using AWS Glue, Lambda, and Redshift, processing 5TB+ of daily data for seamless integration and analytics across business units.
●Optimized Glue job performance, cutting execution times by 40% through partitioning and parallel data transformations to accelerate large-scale data processing.
●Engineered data pipelines with AWS S3, Redshift, and Athena, streamlining the process for real-time and historical analytics, resulting in reduced latency and enhanced performance.
●Built real-time data streaming solutions using AWS Kinesis, processing 2M+ events per minute, enabling real-time insights into customer behavior and operational data.
●Architected and maintained a high-performance Redshift warehouse, tuning schema design, distribution styles, and sort keys to achieve a 30% improvement in query response times for reporting.
●Implemented robust security protocols leveraging IAM, VPC, and encryption, ensuring SOC 2 and HIPAA compliance for secure data storage and processing.
●Troubleshot and resolved production data issues, reducing downtime by 25% with proactive monitoring through AWS CloudWatch, improving overall system reliability.
●Automated deployment pipelines using AWS CodePipeline and Jenkins, speeding up release cycles by 35% and ensuring continuous integration for better operational agility.
●Developed comprehensive data lineage frameworks to ensure full visibility of data flow from ingestion through transformation to reporting, improving traceability and auditability.
●Enhanced Redshift query performance, boosting daily report generation efficiency by 50% through advanced query optimization, including distribution key design and sort key strategies.
●Integrated machine learning models with SageMaker, increasing predictive analytics capabilities for customer engagement and improving personalization strategies by 20%.
●Streamlined large-scale data migration from on-premises systems to AWS, reducing transfer times by 45% with AWS Database Migration Service (DMS) and optimizing data movement efficiency.
●Optimized ETL workflows, reducing processing times of complex data jobs by 25% through fine-tuning AWS Glue, leveraging parallel processing and optimized data partitions.
●Implemented monitoring and alerting systems through CloudWatch, enabling faster issue detection and resolution, reducing operational disruptions by 30%.
●Reduced operational costs by 20% through effective use of S3 lifecycle policies and Redshift Spectrum, cutting down unnecessary storage and computing expenditures.
Cigna Health | Bloomfield, CT
Data Engineer | May 2021 - Aug 2023
●Built and maintained ETL pipelines with AWS Glue, Lambda, and S3, processing 3TB+ of healthcare data monthly, optimizing data flow between various health systems for analytics.
●Designed optimized data models in AWS Redshift, enabling real-time reporting on 150+ healthcare metrics and reducing query retrieval times by 35% for more efficient analytics.
●Enhanced Redshift performance, increasing query speed by 50% through strategic distribution and sort key design, ensuring fast, reliable access to data across healthcare departments.
●Developed real-time data processing solutions with AWS Kinesis, handling 1.5M+ events daily to support timely decision-making in patient care and operational management.
●Leveraged Athena for serverless querying, reducing costs by 40% while enabling fast, ad-hoc analysis of 10TB+ of patient data stored in S3 for data-driven decision-making.
●Managed RDS instances, ensuring 99.99% uptime for critical transactional workloads such as patient registration and billing, improving reliability and system performance.
●Automated CI/CD deployment pipelines using AWS CodePipeline, reducing the time for updates and new releases by 40%, ensuring faster access to critical fixes and features.
●Ensured HIPAA compliance by implementing encryption protocols, IAM roles, and VPC configurations across AWS services, safeguarding patient data and meeting regulatory requirements.
●Migrated legacy healthcare systems to AWS, reducing operational costs by 30% while ensuring scalability and security for data storage and processing in the cloud.
●Implemented real-time data pipelines using AWS Kinesis, enabling up-to-the-minute insights into patient data for clinicians, improving operational decision-making speed.
●Improved batch ETL jobs by optimizing AWS Glue jobs, reducing job execution times by 25%, allowing more efficient processing of data for reporting and analytics.
●Implemented cost-saving strategies for Redshift and S3 storage, decreasing storage costs by 18% through better data compression and optimizing data retrieval processes.
●Developed machine learning models using SageMaker, predicting patient health outcomes with 90% accuracy, enabling more effective health interventions.
●Streamlined ETL workflows with AWS Step Functions, reducing manual interventions by 20% and improving operational efficiency by automating complex data processes.
●Automated data quality checks within ETL pipelines, decreasing data errors by 15% and ensuring data consistency for reporting and analytics.
Yes Bank | Hyderabad, India
Data Engineer | Jul 2019 - Apr 2021
●Designed and implemented data pipelines using Azure Data Factory and Databricks, processing 4TB+ of financial data monthly and enabling seamless integration for real-time analytics across departments.
●Optimized data storage and performance within Azure Synapse Analytics, reducing query execution times by 40% and enhancing the bank’s reporting capabilities for financial data insights.
●Developed real-time data streaming solutions with Azure Event Hubs and Stream Analytics, processing 500K+ events per day to monitor transaction flows and detect fraud patterns instantly.
●Built serverless architectures using Azure Functions, reducing infrastructure overhead by 25% and improving the scalability and agility of data processing workflows.
●Implemented data security and compliance controls with Azure Active Directory, encryption, and role-based access control, protecting sensitive banking information and meeting financial industry standards.
●Automated data integration and migration from legacy on-premises systems to Azure, cutting down data transfer times by 30% and reducing operational costs by 20% while ensuring data integrity.
●Engineered and optimized batch processing pipelines with Azure Data Factory, reducing ETL job runtimes by 35% and improving operational efficiency for daily transaction processing.
●Improved database performance by tuning Azure SQL Database, boosting transaction processing speeds by 30%, and enhancing real-time analytics for customer transaction data.
●Collaborated with cross-functional teams to design data solutions that supported the bank’s core systems, improving decision-making and customer insights through advanced reporting and predictive analytics.
●Developed and deployed machine learning models in Azure ML, forecasting customer behaviors and financial trends, enabling more targeted marketing campaigns with a 90% accuracy rate.
●Implemented monitoring and alerting solutions with Azure Monitor, reducing troubleshooting times by 25% and ensuring the bank’s systems met high availability and performance standards.
●Optimized cost management strategies, leveraging Azure Cost Management to reduce cloud spending by 20% while maintaining high performance and scalability across the bank’s data infrastructure.
●Integrated APIs and microservices for smooth data exchange between internal banking systems and external partners, enhancing customer service and data accessibility.
●Led the migration of legacy data systems to Azure cloud services, improving scalability, data availability, and security, and reducing infrastructure-related downtime by 40%.
●Automated regular reporting with Azure Logic Apps, saving 25% in manual effort and ensuring timely delivery of financial reports to stakeholders.
EDUCATION
Belhaven University | Jackson, MS
Master of Science in Computer Science