Sai Charan Pesaru
Fremont, CA  *********.********@*****.***  408-***-****  https://www.linkedin.com/in/saicharanpesaru/
Professional Summary
• Accomplished Data Engineer with 5 years of experience specializing in designing, building, and optimizing scalable data pipelines on Azure, AWS, and GCP cloud platforms.
• Proven expertise in Snowflake Cloud Data Warehousing, leveraging SQL and Python for robust data transformation, querying, and analytics.
• Strong command of ETL/ELT tools (dbt, Informatica, Airflow), Big Data technologies (Hadoop, Spark), and modern data warehouses (Redshift, Databricks).
• Solid background in CI/CD (GitHub, Azure DevOps), containerization (Docker, Kubernetes), and IaC (Terraform) for streamlined data solution deployment.
• Experienced in delivering end-to-end data solutions, including data visualization with Tableau and Power BI, and complex data modeling.
Technical Skills
Programming Languages: Python (NumPy, Pandas, Scikit-Learn, Matplotlib), SQL, Scala, R (Statistical Analysis), Java, Bash, Unix/Linux Shell Scripting
Databases: NoSQL (MongoDB, Cassandra), MySQL, MS SQL Server, Oracle, Snowflake, PostgreSQL, Teradata
ETL Tools: Informatica PowerCenter, Talend, Informatica Intelligent Data Management Cloud (IICS/IDMC), Azure Data Factory (ADF), Apache Airflow
Big Data & Cloud Technologies: Azure (ADF, ADLS Gen 2, Databricks, Synapse), AWS, Google Cloud Platform (GCP), Hadoop (HDFS, MapReduce), Apache Spark, Hive, Pig, Kafka, Control-M
Visualization Tools: Tableau, Power BI (Power Query, DAX), Apache Superset, MS Excel
Other Skills: Data Warehousing, Data Modeling, Machine Learning, Data Pipeline Architecture, Data as a Product, RPA, CI/CD Pipelines, GitHub, Jenkins, Bamboo, JMeter, SonarQube, Agile/Scrum, SharePoint, Deployment Automation, Cross-functional Team Leadership, Compliance Data Mapping
Professional Experience
Data Engineer Oct 24 – Present
Charles Schwab, California
• Collaborated with cross-functional teams to build efficient, reusable data pipelines for model serving and monitoring.
• Designed, built, and maintained robust, scalable data pipelines using Snowflake’s cloud data platform.
• Implemented and managed ETL/ELT pipelines to ingest data from diverse sources into Snowflake (illustrative sketch below).
• Designed and developed curated data products using Snowflake.
• Developed and managed both cloud-based and on-premises data platforms.
• Led execution of production deployments and release automation using GitHub and custom Python scripts.
• Collaborated with product owners, project managers, and QA to prioritize backlog, decompose epics, and drive sprint deliverables for data innovation initiatives.
• Designed data models and created artifacts for operational support and deployment, ensuring business continuity and system reliability.
• Built and maintained CI/CD pipelines for deploying data infrastructure and code using GitHub Actions, reducing deployment time by 40%.
• Spearheaded agile development of data platforms and pipelines, increasing data delivery velocity by 30% while mentoring offshore engineers.
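A minimal, illustrative sketch of the kind of Snowflake ingestion described above, using the snowflake-connector-python library; the account, credentials, and table/stage names are placeholders, not details of any production system:

```python
# Illustrative Snowflake bulk load: stage a local extract, then COPY it in.
# All connection details and object names below are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="SVC_PIPELINE",        # placeholder service account
    password="***",             # in practice, fetched from a secrets manager
    account="myorg-myaccount",  # placeholder account locator
    warehouse="LOAD_WH",
    database="RAW",
    schema="SALES",
)
try:
    cur = conn.cursor()
    # Upload the file to the table's internal stage, then bulk-load it.
    cur.execute("PUT file:///tmp/orders.csv @%ORDERS AUTO_COMPRESS=TRUE")
    cur.execute(
        "COPY INTO ORDERS FROM @%ORDERS "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) ON_ERROR = 'ABORT_STATEMENT'"
    )
finally:
    conn.close()
```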
Data Engineer May 23 – Aug 24
Abbott, California, USA
• Developed Snowflake-based data models, improving query response times by 30% and enabling faster regulatory reporting.
• Designed and monitored operational dashboards using Azure Monitor and custom Python scripts.
• Strategized and implemented backup solutions and replication protocols to ensure data integrity.
• Led SQL Server patching initiative, increasing system uptime by 25% and reducing production downtime incidents by 40%.
• Managed structured and unstructured data sources, organized data efficiently, and created reusable data assets using both open-source and proprietary tools.
• Coded in SQL to create curated data products supporting regulatory reporting and operational decision-making.
• Implemented cloud data security protocols and access control mechanisms to safeguard sensitive healthcare and patient data in compliance with HIPAA and industry standards.
• Collaborated with cross-functional technology teams to define deployment strategies, prioritize features, and ensure smooth integration of operational components across systems.
• Integrated data from systems and APIs into a centralized master database, building Power BI dashboards that enabled real-time insights, cutting reporting time by 60%.
Data Analyst Aug 22 – May 23
California State University, San Bernardino, USA
• Built intuitive dashboards and reporting tools using Tableau and Power BI.
• Implemented and maintained CI/CD pipelines, decreasing deployment time for data workflows by 50%.
• Enhanced production stability by 40% by automating testing and validation in CI/CD workflows across academic analytics projects.
• Wrote advanced SQL, including complex queries, stored procedures, and views.
• Worked with relational (PostgreSQL, SQL Server) and NoSQL (Cassandra) databases.
• Built and maintained ETL workflows using Apache Airflow, dbt, and Azure Data Factory.
• Demonstrated strong troubleshooting, debugging, and problem-solving skills.
• Investigated and resolved data quality issues across analytics dashboards, boosting dashboard accuracy by 35%.
• Applied version control best practices and CI/CD pipelines, enhancing deployment efficiency by 40% across student projects.
• Applied Git-based version control best practices, reducing merge conflicts by 35% and improving team collaboration on data pipelines.
• Diagnosed and resolved critical data pipeline failures, reducing incident resolution time by 60% and minimizing data delivery delays.
• Led root cause analysis of recurring data quality issues, implementing long-term fixes that reduced data anomalies by 30%.
Data Engineer Aug 21 – Aug 22
Infosys Ltd, Hyderabad
• Wrote optimized SQL queries and stored procedures for large-scale data manipulation, reporting, and data quality assurance.
• Automated data workflows using Python and integrated with REST APIs, S3, and cloud storage for end-to-end data orchestration (see the sketch below).
• Participated in Agile ceremonies and worked in cross-functional teams to deliver high-impact data products on time.
• Conducted root cause analysis on data issues, proactively fixing pipeline failures and data inconsistencies to maintain trust in reporting systems.
• Collaborated with cross-functional teams to define data transformation logic and improve integration success.
• Designed data pipelines using Apache Airflow and AWS Glue, reducing data latency by up to 50%.
• Implemented monitoring and validation systems using CloudWatch and Azure Monitor, cutting incident response times by 45%.
• Actively contributed to sprint planning and backlog grooming, helping reduce delivery bottlenecks and improving data product release cadence by 30%.
• Collaborated with cross-functional teams to deliver end-to-end data solutions, contributing to a 25% improvement in data accessibility for analytics use cases.
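A minimal sketch of the REST-to-S3 orchestration pattern described in this role, using requests and boto3; the endpoint, bucket, and key layout are hypothetical:

```python
# Pull one day's records from a (hypothetical) REST API and land them in S3.
import json

import boto3
import requests

API_URL = "https://api.example.com/v1/transactions"  # placeholder endpoint
BUCKET = "data-landing-zone"                         # placeholder bucket

def extract_and_land(run_date: str) -> str:
    """Fetch a day's records and write them to S3 as JSON; returns the key."""
    resp = requests.get(API_URL, params={"date": run_date}, timeout=30)
    resp.raise_for_status()

    key = f"raw/transactions/{run_date}.json"
    boto3.client("s3").put_object(
        Bucket=BUCKET,
        Key=key,
        Body=json.dumps(resp.json()).encode("utf-8"),
    )
    return key

if __name__ == "__main__":
    print(extract_and_land("2022-01-31"))
```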
Data Analyst May 20 – July 21
Wipro Technologies, Hyderabad
• Deployed infrastructure as code (IaC) for data environments using Terraform and AWS CloudFormation.
• Designed and maintained scalable data pipelines on cloud platforms such as AWS.
• Designed data models and optimized query performance on Redshift, BigQuery, and Snowflake.
• Created custom Python modules for data validation and logging, reducing data errors across pipelines by 40%.
• Created interactive dashboards and data visualizations using Power BI, Tableau, and Quick Sight.
• Migrated legacy data pipelines to modern platforms, improving performance by 55% and reducing maintenance overhead by 30%.
• Designed data models and optimized queries on Redshift and Snowflake, increasing query execution efficiency by 50%.
Projects
Academic Data Pipeline Automation with CI/CD and Airflow
• Orchestrated end-to-end academic data pipelines using Apache Airflow, enabling scalable, modular workflows for automated reporting and data refresh cycles (see the DAG sketch below).
• Implemented data ingestion and transformation workflows in Azure Data Factory, integrating diverse academic data sources into a centralized reporting environment.
• Developed CI/CD pipelines using GitHub Actions, automating testing, linting, and deployment processes, reducing deployment time by 50% and improving delivery consistency.
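A simplified sketch of the Airflow orchestration pattern used in this project, assuming Airflow 2.x; the DAG, task, and function names are illustrative, not the actual pipeline:

```python
# Two-step daily pipeline: refresh the extract, then publish reports.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def refresh_enrollment_extract(**context):
    # Placeholder for the ingestion step (e.g., an ADF-triggered copy).
    print("refreshing extract for", context["ds"])

def publish_reports(**context):
    # Placeholder for the downstream reporting refresh.
    print("publishing reports for", context["ds"])

with DAG(
    dag_id="academic_reporting_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="refresh_enrollment_extract",
        python_callable=refresh_enrollment_extract,
    )
    publish = PythonOperator(
        task_id="publish_reports",
        python_callable=publish_reports,
    )
    extract >> publish  # reports refresh only after the extract succeeds
```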
Root Cause Analysis on Data Quality Issues in University Systems
• Conducted in-depth root cause analysis using SQL, Python, and historical PostgreSQL logs, identifying key data anomalies and reducing recurring issues by 30%.
• Developed and implemented SQL-based validation pipelines to detect and prevent data integrity issues, improving trust in analytics outputs used by university stakeholders (see the sketch below).
• Leveraged Azure Monitor and Excel for system-level diagnostics and reporting, and used Git for version control to ensure traceable, collaborative fixes to core data issues.
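A minimal sketch of the SQL-based validation approach described above, run against PostgreSQL via psycopg2; the table and column names are hypothetical:

```python
# Run a set of data-quality checks and fail loudly if any rows offend.
import psycopg2

CHECKS = {
    "duplicate_enrollments":
        "SELECT COUNT(*) FROM (SELECT student_id, term FROM enrollments "
        "GROUP BY student_id, term HAVING COUNT(*) > 1) d",
    "null_grades":
        "SELECT COUNT(*) FROM enrollments WHERE grade IS NULL",
}

def run_checks(dsn: str) -> dict:
    """Return the offending-row count for each validation rule."""
    results = {}
    with psycopg2.connect(dsn) as conn:  # transaction-scoped connection
        with conn.cursor() as cur:
            for name, sql in CHECKS.items():
                cur.execute(sql)
                results[name] = cur.fetchone()[0]
    return results

if __name__ == "__main__":
    failures = {k: v for k, v in run_checks("dbname=warehouse").items() if v}
    if failures:
        raise SystemExit(f"data quality checks failed: {failures}")
```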
Education
MS Information Systems & Technology, California State University-San Bernardino Aug 22 – Aug 24
B.Tech Computer Science Engineering, GITAM University July 17 – June 21