Snowflake Developer

Location:
Dover, DE, 19904
Posted:
March 24, 2025


Anil Potru

Senior Snowflake Data Engineer

***********@*****.***

757-***-****

www.linkedin.com/in/anil-potru

Professional Summary:

Experienced Senior Snowflake Data Engineer with 10+ years of expertise in designing and optimizing scalable Snowflake-based data pipelines, ETL/ELT workflows and cloud-native data architectures across industries such as finance, retail and healthcare. Proficient in Snowflake, dbt, Python (Pandas, PySpark) and cloud platforms (AWS, Azure) with strong knowledge of distributed computing, security best practices and AI/ML-driven automation for enterprise data solutions.

Snowflake Data Engineering & ELT Development – Expertise in building real-time and batch data pipelines using Snowflake Snowpark, Snowpipe, dbt, AWS Glue, Apache Airflow and SQL UDFs, ensuring efficient data transformation, cost optimization and seamless integration across cloud environments.

Snowflake Performance Optimization & Database Management – Extensive experience managing Snowflake warehouses, multi-cluster architecture, Materialized Views, Query Acceleration Service and Amazon RDS (PostgreSQL, MySQL) with a focus on query optimization, indexing, partitioning and result caching to enhance performance.

Data Warehousing & Real-Time Analytics – Proficient in designing modern cloud-based data warehouses using Snowflake, Redshift and BigQuery, supporting real-time analytics, predictive modeling and AI-driven insights for healthcare, finance and supply chain operations.

Real-Time Data Streaming & Event Processing – Skilled in Kafka, Snowpipe, Snowflake Streams, Kinesis and AWS Lambda for low-latency, high-throughput real-time data ingestion and processing, enabling seamless event-driven architectures.

Cloud Infrastructure & Security – Hands-on experience managing Snowflake environments in AWS and Azure, ensuring scalability, reliability and compliance with RBAC, column masking, row-level security (RLS), multi-factor authentication (MFA) and data encryption (AWS KMS, Snowflake Dynamic Data Masking).

API Development & Data Integration – Expertise in developing secure and scalable API-based data integrations using FastAPI, AWS API Gateway, Snowflake Connector for Python and PyODBC, ensuring data accessibility, governance and microservices architecture.

CI/CD & Infrastructure Automation – Implementing automated Snowflake deployments using GitHub Actions, dbt Cloud, Terraform and AWS CodePipeline, streamlining schema evolution, testing and secure data migrations for enterprise data platforms.

Machine Learning & AI-Driven Analytics – Applying Snowflake AI Functions, Snowpark for Python, Amazon SageMaker and LLMs (Snowflake Cortex) to enhance predictive analytics, fraud detection, anomaly detection and intelligent automation across data-driven systems.

Performance Monitoring & Cost Optimization – Utilizing Snowsight, Snowflake Resource Monitors, AWS CloudWatch, Prometheus and Grafana for proactive query performance tuning, warehouse scaling and cost-efficient resource allocation.

Data Governance & Compliance – Ensuring HIPAA, PCI-DSS, SOC 2 and GDPR compliance through Snowflake’s governance frameworks, data lineage tracking, audit logging, secure data sharing and access controls.

Data Visualization & Business Intelligence – Designing and delivering real-time dashboards and reporting solutions using Amazon QuickSight, Tableau and Power BI, enabling actionable insights and strategic decision-making for business stakeholders.

Collaboration & Agile Methodologies – Experienced in Agile, Scrum and Kanban environments, working with cross-functional teams to drive data initiatives, manage large-scale cloud migrations and implement Snowflake best practices for enterprise data solutions.

Education:

Bachelor of Engineering, India (May 2010).

Certifications:

AWS Certified Solutions Architect - Associate.

Microsoft Certified: Azure Data Engineer Associate.

Snowflake SnowPro Core Certification.

Professional Track Record:

Client: JPMorgan Chase, New York, NY Jan 2024 to Present

Senior Snowflake Data Engineer

Functional Role Details:

Led enterprise cloud migration initiatives for JPMorgan, modernizing legacy on-premise data platforms by transitioning to Snowflake on AWS, optimizing virtual warehouses, data sharing, and auto-clustering to enhance performance and scalability in the financial sector.

Designed and developed Snowflake ETL/ELT pipelines using Python (Pandas, PySpark), dbt, AWS Glue, and Snowflake Snowpark, ensuring efficient data ingestion, transformation, and orchestration with machine learning-driven anomaly detection for financial data integrity.

Migrated and optimized relational databases to Snowflake on AWS, leveraging Query Acceleration Service, Materialized Views, and Result Caching to improve query performance and reduce compute costs for financial transactions and reporting.

Redefined data warehousing strategies by shifting from legacy on-prem solutions to Snowflake’s multi-cluster architecture, enabling predictive analytics, time travel, and zero-copy cloning for efficient data versioning and rollback within financial systems.

Implemented real-time data streaming into Snowflake using AWS Kinesis, Snowpipe, and Snowflake Streams, incorporating AI-based event filtering to enhance real-time financial insights and trading data analytics.
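
The following is an illustrative sketch of the Streams-plus-Tasks pattern this kind of ingestion typically relies on, issued through the Snowflake Connector for Python; the account, warehouse, table, and stream names are hypothetical placeholders rather than artifacts of the engagement.

    import snowflake.connector  # pip install snowflake-connector-python

    # Connection parameters are placeholders, not real credentials.
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="***",
        warehouse="INGEST_WH", database="TRADES_DB", schema="RAW",
    )
    cur = conn.cursor()

    # Capture rows landed by Snowpipe into a change stream.
    cur.execute("CREATE STREAM IF NOT EXISTS trades_stream ON TABLE raw_trades")

    # Task that merges captured changes every minute, only when new data exists.
    cur.execute("""
        CREATE TASK IF NOT EXISTS merge_trades
          WAREHOUSE = INGEST_WH
          SCHEDULE = '1 MINUTE'
          WHEN SYSTEM$STREAM_HAS_DATA('TRADES_STREAM')
        AS
          INSERT INTO curated.trades (trade_id, amount, trade_ts)
          SELECT trade_id, amount, trade_ts FROM trades_stream
    """)
    cur.execute("ALTER TASK merge_trades RESUME")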

Orchestrated data workflows using Apache Airflow (MWAA), dbt, and Snowflake Tasks, automating financial data transformations and scheduling for optimized pipeline execution.

Developed serverless data processing using Snowflake UDFs (User-Defined Functions) and UDTFs (User-Defined Table Functions), integrating AI-based automation to optimize execution paths for financial data processing.
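
A minimal Snowpark sketch of registering a Python UDF so it can be called from SQL, in the spirit of the work described above; the session settings and the toy flagging rule are assumptions for illustration only.

    from snowflake.snowpark import Session
    from snowflake.snowpark.types import BooleanType, FloatType

    # Connection parameters are placeholders.
    session = Session.builder.configs({
        "account": "my_account", "user": "etl_user", "password": "***",
        "warehouse": "ANALYTICS_WH", "database": "TRADES_DB", "schema": "CURATED",
    }).create()

    def is_outlier(amount: float) -> bool:
        # Toy rule standing in for a real anomaly-detection model.
        return amount is not None and amount > 1_000_000.0

    # Register the function so SQL can call IS_OUTLIER(...).
    session.udf.register(
        func=is_outlier,
        name="is_outlier",
        return_type=BooleanType(),
        input_types=[FloatType()],
        replace=True,
    )

    session.sql(
        "SELECT trade_id, amount FROM curated.trades WHERE is_outlier(amount)"
    ).show()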

Implemented role-based access control (RBAC), row-level security (RLS), and column-masking in Snowflake, ensuring enterprise-grade security and compliance with financial regulations such as PCI-DSS, SOX, and GDPR.
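
A minimal sketch of the kind of dynamic masking policy and row access policy described above, executed from Python; the role, table, and column names are hypothetical.

    import snowflake.connector  # connection parameters as in the earlier sketch

    conn = snowflake.connector.connect(account="my_account", user="admin_user", password="***")
    cur = conn.cursor()

    # Column masking: only the ANALYST role sees clear-text account numbers.
    cur.execute("""
        CREATE MASKING POLICY IF NOT EXISTS mask_account_number AS (val STRING)
        RETURNS STRING ->
          CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE '***MASKED***' END
    """)
    cur.execute("""
        ALTER TABLE curated.accounts
          MODIFY COLUMN account_number SET MASKING POLICY mask_account_number
    """)

    # Row-level security: a toy rule mapping region values to role names.
    cur.execute("""
        CREATE ROW ACCESS POLICY IF NOT EXISTS region_rls AS (region STRING)
        RETURNS BOOLEAN ->
          CURRENT_ROLE() = 'ADMIN' OR region = CURRENT_ROLE()
    """)
    cur.execute("ALTER TABLE curated.accounts ADD ROW ACCESS POLICY region_rls ON (region)")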

Optimized CI/CD pipelines for Snowflake using GitHub Actions, dbt, and Liquibase, integrating AI-based anomaly detection to enhance deployment stability and prevent schema drift in financial reporting systems.

Enhanced metadata management by centralizing datasets in Snowflake Information Schema, incorporating AI-based data classification models for improved data governance in financial data management.

Designed and implemented Snowflake Data Sharing and Data Exchange strategies, enabling secure cross-cloud data collaboration within JPMorgan’s ecosystem without traditional ETL processes, ensuring seamless integration with financial applications.

Automated infrastructure provisioning for Snowflake environments using Terraform and AWS CloudFormation, integrating AI-powered cost prediction models to optimize compute credits and storage allocation for financial workloads.

Migrated and modernized BI platforms by integrating Snowflake with QuickSight, Tableau, and Power BI, enabling AI-powered analytics and automated insights for business stakeholders in the financial domain.

Developed Snowflake-native ML models using Snowpark and Python UDFs, leveraging Snowflake’s processing power for data-driven decision-making in areas like fraud detection, risk assessment, and portfolio management.

Integrated monitoring and optimization tools such as Snowflake Resource Monitors, Snowsight, Prometheus, and Grafana, using AI-driven predictive analytics to identify query bottlenecks and optimize compute usage for financial queries and analytics.

Facilitated cross-functional collaboration by documenting Snowflake migration strategies, best practices, and architectural decisions using Confluence and Jira, leveraging AI-powered knowledge retrieval for financial project teams.

Streamlined batch and real-time data integrations by developing API-based connectivity between Snowflake and enterprise applications using Snowflake Connector for Python, PyODBC, and AWS API Gateway, incorporating AI-based data validation techniques for financial transactions.
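
A short sketch of API-style access to Snowflake with the Snowflake Connector for Python, using bound parameters and a pandas result set; the connection details, table, and column names are placeholders.

    import snowflake.connector  # pip install "snowflake-connector-python[pandas]"

    # Placeholder connection; in practice credentials come from a secrets manager.
    conn = snowflake.connector.connect(
        account="my_account", user="api_user", password="***",
        warehouse="API_WH", database="TRADES_DB", schema="CURATED",
    )

    def get_transactions(customer_id: str):
        """Fetch one customer's transactions with a bound parameter (no string concatenation)."""
        with conn.cursor() as cur:
            cur.execute(
                "SELECT txn_id, amount, txn_ts FROM transactions WHERE customer_id = %s",
                (customer_id,),
            )
            return cur.fetch_pandas_all()

    df = get_transactions("C-1001")
    print(df.head())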

Established disaster recovery strategies for Snowflake environments, integrating failover configurations, cross-region replication, and AI-driven risk assessment models for proactive security monitoring of financial data assets.

Led cost optimization for Snowflake workloads, leveraging Auto-suspend, Auto-resume, Compute Scaling, and Reserved Capacity, while integrating AI-powered cost optimization strategies to reduce expenses in JPMorgan’s financial data ecosystem.
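
An illustrative sketch of the warehouse and resource-monitor settings such cost controls typically rely on; the credit quota, warehouse name, and thresholds are examples, not the values used at the client.

    import snowflake.connector  # connection parameters as in the earlier sketches

    conn = snowflake.connector.connect(account="my_account", user="admin_user", password="***")
    cur = conn.cursor()

    # Cap monthly credits and suspend the warehouse when the quota is exhausted.
    cur.execute("""
        CREATE OR REPLACE RESOURCE MONITOR finance_rm
          WITH CREDIT_QUOTA = 500 FREQUENCY = MONTHLY START_TIMESTAMP = IMMEDIATELY
          TRIGGERS ON 90 PERCENT DO NOTIFY
                   ON 100 PERCENT DO SUSPEND
    """)
    cur.execute("""
        ALTER WAREHOUSE ANALYTICS_WH SET
          RESOURCE_MONITOR = finance_rm
          AUTO_SUSPEND = 300        -- seconds of idleness before suspending
          AUTO_RESUME = TRUE
          MIN_CLUSTER_COUNT = 1
          MAX_CLUSTER_COUNT = 3     -- multi-cluster scale-out under load
    """)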

Client: AbbVie, Chicago, IL Sep 2021 to Dec 2023

Senior Snowflake Data Engineer

Functional Role Details:

Designed, developed, and maintained real-time Snowflake data pipelines using Python (NumPy, Pandas, and PySpark), dbt, and Snowflake Snowpark to process high-velocity healthcare and pharmaceutical transactions with low latency and high reliability.

Built and optimized real-time Snowflake data ingestion workflows using Snowpipe, Kafka, and AWS HealthLake, ensuring seamless and efficient processing of patient records, drug trial data, and claims transactions.

Integrated Snowflake with Amazon RDS (PostgreSQL & MySQL), AWS HealthLake, and AWS Glue using SQLAlchemy and PyODBC, ensuring secure and scalable database connectivity for EHR (Electronic Health Records) and clinical trials.

Architected and implemented scalable Snowflake data warehousing solutions, leveraging Time Travel, Zero-Copy Cloning, Materialized Views, and Multi-Cluster Warehouses to enable real-time patient analytics, compliance monitoring, and drug discovery.

Developed and deployed secure, scalable RESTful APIs using FastAPI and AWS API Gateway, ensuring efficient access to Snowflake data assets with role-based security policies for healthcare and pharmaceutical applications.
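
A minimal FastAPI sketch of the kind of read-only endpoint described above, backed by the Snowflake Connector for Python; the route, table, and credential handling are simplified assumptions, and a production service would pool and close connections.

    from fastapi import FastAPI, Depends, HTTPException
    import snowflake.connector

    app = FastAPI(title="Clinical data API (illustrative)")

    def get_conn():
        # Placeholder connection; real deployments would pull credentials from a vault.
        return snowflake.connector.connect(
            account="my_account", user="api_user", password="***",
            warehouse="API_WH", database="CLINICAL_DB", schema="CURATED",
        )

    @app.get("/patients/{patient_id}/labs")
    def patient_labs(patient_id: str, conn=Depends(get_conn)):
        with conn.cursor(snowflake.connector.DictCursor) as cur:
            cur.execute(
                "SELECT lab_code, result_value, collected_at "
                "FROM lab_results WHERE patient_id = %s ORDER BY collected_at DESC",
                (patient_id,),
            )
            rows = cur.fetchall()
        if not rows:
            raise HTTPException(status_code=404, detail="patient not found")
        return rows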

Automated real-time ETL/ELT workflows using AWS Glue, Apache Airflow (MWAA), and dbt, ensuring continuous, fault-tolerant data transformation for medical records, genomic sequencing, and clinical trial data.

Utilized Amazon S3 and Snowflake External Tables for secure, scalable storage of patient data, drug efficacy reports, and real-world evidence (RWE), ensuring HIPAA and FDA 21 CFR Part 11 compliance.

Processed and transformed large-scale healthcare datasets using Snowflake Snowpark, AWS Glue (Apache Spark), and SQL UDFs, enabling AI-driven diagnostics, patient risk stratification, and precision medicine.

Developed Snowflake-native machine learning models using Snowpark for Python and Snowflake UDFs, enabling predictive analytics in drug interactions, patient deterioration monitoring, and anomaly detection in clinical trials.

Designed and delivered interactive dashboards and visualizations using Power BI, Tableau, and Amazon QuickSight, integrating real-time Snowflake queries for patient outcomes, drug efficacy, and hospital performance metrics.

Monitored and optimized real-time Snowflake data pipelines using Snowsight, Snowflake Resource Monitors, AWS CloudWatch, AWS HealthOmics, and Prometheus, ensuring performance, reliability, and anomaly detection for healthcare data pipelines.

Queried and analyzed real-time Snowflake data using Snowflake Query Acceleration Service and AWS HealthOmics, enabling distributed computing and real-time insights into genomic data and patient diagnostics.

Integrated AI-powered automation using AWS HealthScribe, Amazon Comprehend Medical, and Snowflake AI Functions to enhance medical text analysis, entity recognition, and automated claims processing.

Applied Prompt Engineering techniques to optimize Large Language Models (LLMs) with Snowflake Cortex, automating clinical documentation, medical coding, and regulatory reporting.
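
An illustrative sketch of calling a Snowflake Cortex LLM function from Python; the model name, prompt wording, and source table are assumptions for demonstration only.

    import snowflake.connector  # connection setup as in the earlier sketches

    conn = snowflake.connector.connect(
        account="my_account", user="ml_user", password="***",
        warehouse="ML_WH", database="CLINICAL_DB", schema="CURATED",
    )
    cur = conn.cursor()

    # Summarize free-text clinical notes with a Cortex-hosted model.
    cur.execute("""
        SELECT note_id,
               SNOWFLAKE.CORTEX.COMPLETE(
                 'mistral-large',
                 'Summarize the following clinical note in two sentences: ' || note_text
               ) AS note_summary
        FROM clinical_notes
        LIMIT 10
    """)
    for note_id, summary in cur.fetchall():
        print(note_id, summary[:120])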

Implemented secure, scalable CI/CD pipelines for Snowflake deployments using AWS CodePipeline, GitHub Actions, and dbt Cloud, ensuring automated testing and deployment of real-time healthcare analytics solutions.

Developed disaster recovery and high-availability strategies for Snowflake, leveraging Failover Clustering, Cross-Region Replication, and Secure Data Sharing for backup and redundancy in pharmaceutical data systems.

Established Snowflake data governance frameworks by enforcing data lineage tracking, access controls, row-level security (RLS), and encryption using AWS Key Management Service (KMS) and Snowflake Dynamic Data Masking to ensure HIPAA, GxP, and GDPR compliance.

Collaborated with biotech, pharma, and healthcare teams, including engineers, data scientists, and regulatory experts, in Agile and Kanban environments, delivering scalable real-time Snowflake data solutions.

Optimized Snowflake performance and cost-efficiency using Auto-suspend, Auto-resume, Compute Scaling, Snowflake Cost Explorer, and AWS Auto-scaling policies, ensuring efficient cloud operations for pharma research and clinical data.

Client: Walgreens, Chicago, IL Oct 2018 to Aug 2021

Data Engineer

Functional Role Details:

Architected and optimized scalable data pipelines using Apache Spark (PySpark) and dbt to migrate and transform large-scale healthcare and pharmacy datasets from on-prem Oracle databases to Snowflake, ensuring high reliability and efficiency across workflows.

Managed and enhanced relational databases such as Oracle DB, utilizing AWS DMS (Database Migration Service) and AWS RDS for seamless integration and efficient data replication to Snowflake.

Developed secure, high-performance RESTful APIs using Django, enabling real-time data exchange between inventory, pharmacy management systems, and healthcare applications post-migration.

Designed and implemented enterprise data warehousing solutions with Snowflake, supporting real-time decision-making, patient analytics, drug inventory management, and operational optimization in healthcare and pharma sectors.

Built and maintained real-time data streaming systems using Apache Kafka and AWS Kinesis, ensuring low-latency data flow for patient monitoring, drug supply chain tracking, and fraud detection across Snowflake.

Automated and optimized ETL pipelines using Apache Airflow, dbt, and AWS Glue, improving data ingestion, transformation, and distribution across diverse healthcare sources, including EHRs (Electronic Health Records) and pharma supply chain data.
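
A minimal Airflow DAG sketch showing how dbt steps can be chained in the way described above; the DAG id, schedule, and file paths are placeholders.

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Run dbt transformations hourly, then validate them with dbt tests.
    with DAG(
        dag_id="pharmacy_elt",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@hourly",
        catchup=False,
    ) as dag:
        dbt_run = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/dbt/pharmacy --profiles-dir /opt/dbt",
        )
        dbt_test = BashOperator(
            task_id="dbt_test",
            bash_command="dbt test --project-dir /opt/dbt/pharmacy --profiles-dir /opt/dbt",
        )
        dbt_run >> dbt_test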

Leveraged distributed computing frameworks like Apache Spark and AWS EMR to process large-scale healthcare data efficiently, enabling analytics for patient care outcomes, clinical trial analysis, and pharma operations post-migration.

Orchestrated and managed interdependent workflows with Apache Airflow, ensuring smooth execution of real-time patient data updates, drug inventory replenishment, and healthcare service management operations.

Developed interactive dashboards and visualizations using Tableau and Oracle BI, providing real-time business intelligence on patient outcomes, drug effectiveness, and healthcare operational performance.

Monitored and optimized system performance with Prometheus, AWS CloudWatch, and ELK Stack (Elasticsearch, Logstash, Kibana), ensuring system reliability, uptime, and compliance with healthcare data privacy and regulatory standards (HIPAA).

Developed and managed automated CI/CD pipelines using Jenkins and GitHub Actions, streamlining development, testing, and deployment lifecycles for Snowflake dbt workflows in healthcare and pharma systems.

Executed complex analytics queries on Snowflake, extracting high-performance insights to optimize patient care strategies, drug supply chain operations, and clinical trial management.

Implemented and optimized data integration workflows using dbt and AWS Glue, ensuring seamless data flow across healthcare systems, patient data repositories, and pharma databases for better operational decision-making.

Collaborated with cross-functional teams within Agile and Scrum frameworks, delivering scalable and innovative data solutions aligned with healthcare modernization and pharma industry advancements.

Client: First Citizens Bank, Raleigh, NC Aug 2016 to Sep 2018

Data Engineer

Functional Role Details:

Architected and optimized robust data pipelines using PySpark and Talend, ensuring seamless processing, transformation, and enrichment of large-scale banking datasets while maintaining data integrity and security.

Managed and enhanced relational databases using Oracle DB, implementing encryption at rest and in transit, indexing, partitioning, and performance tuning to support secure, high-volume transactions in banking applications.

Developed role-based access controls (RBAC) and fine-grained permissions for secure backend database integration using Talend and SQLAlchemy, ensuring least-privilege access to sensitive banking data.

Built and deployed secure RESTful APIs using Django, implementing OAuth2, JWT authentication, and API rate limiting to prevent unauthorized access and mitigate security risks in banking systems.

Orchestrated and automated secure ETL workflows using Talend and Apache Airflow, enforcing data masking, anonymization, and tokenization to protect sensitive personally identifiable information (PII) in compliance with banking regulations.

Designed and implemented real-time data streaming solutions using Apache Kafka and Talend, ensuring end-to-end encryption and secure message authentication to protect financial transactions and real-time banking data.

Leveraged Oracle DB and Talend for distributed processing of massive banking datasets, optimizing performance while enforcing data security policies and access controls to meet banking compliance standards.

Automated infrastructure provisioning and security configuration using Oracle DB and Terraform, ensuring compliance with PCI-DSS and SOC 2 standards for secure banking operations.

Developed and managed advanced Bash scripts to automate security tasks, such as certificate renewals, firewall rule updates, and log rotation to prevent unauthorized access in banking systems.

Created and delivered interactive dashboards using Oracle BI and Talend, integrating security monitoring data from Oracle Audit Vault and Oracle Cloud Infrastructure (OCI) for proactive threat detection in financial operations.

Built and maintained CI/CD pipelines with Jenkins and Talend Cloud, incorporating SAST/DAST security scans (using tools like SonarQube and OWASP ZAP) and automated compliance checks into the software deployment lifecycle for banking applications.

Scheduled and automated recurring data processing tasks using Oracle DB Jobs and Talend, integrating audit logging and monitoring alerts to detect suspicious activities and unauthorized access in banking systems.

Implemented container security best practices by scanning Docker images for vulnerabilities and securing deployments using Oracle Cloud and Talend on private cloud infrastructure for banking services.

Deployed Oracle WebLogic Server and Oracle WAF to mitigate DDoS attacks and prevent SQL injection and XSS vulnerabilities in public-facing banking applications.

Executed manual deployments of containerized applications using Shell scripts, enforcing access logging and firewall rules to minimize security risks during banking system rollouts.

Conducted vulnerability assessments and penetration testing on cloud and on-premise systems, leveraging tools like Nmap, Nessus, and Metasploit to identify and remediate security gaps in banking environments.

Implemented SIEM (Security Information and Event Management) solutions using Oracle Security Monitoring and Talend Data Integration, centralizing log analysis, anomaly detection, and security alerts in real-time for financial operations.

Collaborated with cross-functional teams in Agile environments to integrate security best practices (DevSecOps) into the development lifecycle, ensuring secure and compliant data workflows in banking systems.

Client: Costco, Issaquah, WA Feb 2013 to July 2016

Data Engineer

Functional Role Details:

Analyzed and optimized complex datasets using Python (Pandas, NumPy) and SQL, implementing query profiling and execution tuning to enhance performance and reduce processing time.

Designed and developed interactive dashboards using Tableau, QlikView and Crystal Reports, leveraging data extracts, indexing and query optimization for improved visualization responsiveness.

Wrote and optimized SQL queries in MySQL and PostgreSQL, using advanced indexing (BTREE, GiST), materialized views and caching mechanisms to accelerate data retrieval.

Developed and streamlined ETL workflows using Talend and Informatica, optimizing data ingestion through parallel processing, bulk inserts and transformation logic enhancements.

Automated query performance monitoring and logging with Python and Bash scripts, integrating real-time alerts using Nagios and Splunk to detect and resolve performance bottlenecks.

Implemented schema optimization strategies such as normalization, denormalization, partitioning and columnar storage (Parquet, ORC) to improve data processing speed and scalability.

Enhanced real-time data streaming efficiency using Apache Kafka and Redis caching, reducing query execution latency for frequently accessed data.

Developed and maintained CI/CD pipelines with Jenkins, incorporating automated SQL testing, static code analysis (SonarQube) and compliance validation.

Monitored system performance using SQL Profiler, Splunk and Apache JMeter, conducting load testing, stress testing and resource allocation tuning.

Optimized stored procedures and complex queries using window functions, CTEs, query parallelization and connection pooling, improving report generation efficiency.
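
A small sketch of the CTE-plus-window-function pattern referenced above, run against PostgreSQL with psycopg2; the table and connection details are hypothetical.

    import psycopg2  # PostgreSQL driver; connection parameters are placeholders

    conn = psycopg2.connect(host="reporting-db", dbname="sales", user="report_user", password="***")
    with conn, conn.cursor() as cur:
        # CTE plus a window function: latest order per customer without a self-join.
        cur.execute("""
            WITH ranked AS (
                SELECT customer_id,
                       order_id,
                       order_total,
                       ROW_NUMBER() OVER (PARTITION BY customer_id
                                          ORDER BY order_date DESC) AS rn
                FROM orders
            )
            SELECT customer_id, order_id, order_total
            FROM ranked
            WHERE rn = 1
        """)
        latest_orders = cur.fetchall()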

Improved workflow orchestration with Apache Airflow, optimizing DAG execution for large-scale data processing while reducing redundant computations.

Secured data processing workflows by enforcing LDAP-based authentication, data encryption with OpenSSL and access control mechanisms, ensuring compliance with data governance standards.

Integrated distributed computing frameworks (Hadoop, Apache Spark and Hive) to handle high-volume transactional data efficiently, optimizing computational workloads.

Led performance tuning initiatives, implementing query execution plan analysis, index restructuring and query caching strategies, resulting in a 40% improvement in overall query efficiency.


