SHARATH CHANDRA
+* (***) ***-**** | **********@*****.*** | Maryland Heights, MO | LinkedIn
PROFESSIONAL SUMMARY
Full-Stack Data Engineer: Six years of experience delivering end-to-end data solutions across fintech, edtech, and banking domains. Proven ability to build scalable pipelines, automate workflows, and support advanced analytics across hybrid cloud ecosystems.
Cloud-Native Engineering: Hands-on expertise across GCP (BigQuery, Dataflow), Azure (Data Factory, Databricks, SQL), AWS (Redshift, S3, IAM), and Snowflake. Architected robust ETL/ELT systems handling 500+ GB/day using modern frameworks like PySpark, Airflow, and Terraform.
Modern Tech Stack: Advanced proficiency in Python (Pandas, NumPy), SQL (T-SQL, BigQuery SQL), PySpark, Jenkins, GitHub Actions, and CI/CD workflows. Experience with schema evolution, real-time streaming, RBAC, and data lineage in enterprise environments.
Business Intelligence & ML Enablement: Collaborated cross-functionally to build churn prediction models, ML feature pipelines, and performance dashboards using Power BI, Tableau, and Google Data Studio, directly increasing operational efficiency and executive visibility.
Data Governance & Compliance: Enforced secure data practices using RBAC, IAM, and Terraform-based provisioning across GCP, Azure, and AWS platforms, ensuring SOC 2 and PCI DSS compliance while supporting seamless CI/CD deployment cycles.
Impact-Driven Projects: Reduced dashboard refresh latency, increased campaign targeting efficiency by 15%, and enabled near real-time customer analytics through system optimization and infrastructure-as-code (IaC) strategies.
TECHNICAL SKILLS
Cloud Platforms & Data Warehousing: GCP (BigQuery, GCS, Dataflow), AWS (S3, Redshift, IAM), Azure Databricks, Azure Synapse, Snowflake
Programming & Scripting: Python, SQL, PySpark, Apache Spark
Data Processing & Engineering: Pandas, NumPy, data matching & deduplication, aggregation, schema evolution, ETL/ELT pipeline design, data modeling, data cleaning, transformation
Data Orchestration & Automation: Airflow, Jenkins, Terraform
Visualization & BI Tools: Tableau, Power BI, Google Data Studio, Excel, Google Sheets
Version Control & Collaboration: Git, GitHub, Jira, documentation in Jupyter Notebooks
Data Security & Compliance: PCI DSS and SOC 2 standards, IAM access controls, secure AWS resource configuration
Exploratory Data Analysis: Python (Pandas, NumPy, Matplotlib, Seaborn), ad hoc analysis in Excel
Reporting & Dashboarding: Interactive dashboards, real-time data updates, actionable insights, executive-level visuals
CERTIFICATIONS
• PCEP – Certified Entry-Level Python Programmer, Python Institute
• Power BI Data Analysis & Visualization, Certificate of Completion – Forage
EXPERIENCE
Closeloop Technologies
Data Engineer Sep 2024 – Present Mountain View, CA
Designed and deployed enterprise-grade ETL/ELT pipelines using Azure Data Factory and Azure Databricks, enabling ingestion and transformation of data from Oracle CRM and external APIs into Azure Data Lake Storage Gen2 and Azure SQL Database.
Developed, supported, and optimized advanced data structures and data warehouse objects in GCP BigQuery, designing and implementing scalable pipelines for daily customer data that directly supported BI development.
Developed, implemented, and maintained ETL processes using GCP Dataflow and PySpark for massive datasets, ensuring efficient data flow and transformation while keeping the data environment healthy.
Gathered business requirements through collaboration with analysts and AI teams to operationalize churn prediction models, translating needs into automated data feeds and versioned SQL models.
Created and maintained SQL code (BigQuery SQL) with attention to optimization and future supportability, improving dashboard refresh rates and enabling near real-time customer churn insights.
Monitored data quality and validation metrics by developing comprehensive data quality checks within Dataflow pipelines, proactively identifying data issues, and reducing reporting errors by 25%.
Led data pipeline CI/CD using Airflow DAGs, GitHub Actions, and Terraform, meeting the business's automation needs and supporting agile releases and infrastructure as code.
Developed reporting and data extracts for business customers by building executive dashboards in Google Data Studio and Power BI, contributing to a 15% boost in campaign targeting efficiency.
Authored and maintained documentation for data engineering processes and workflows, including reusable data pipeline templates in GCP, promoting engineering efficiency and future supportability.
Partnered with cross-functional stakeholders, including Data Science and EDW teams, to support churn prediction initiatives, converting high-volume telemetry and CRM data into actionable KPIs.
Integrated Pandas and NumPy for initial data exploration and cleansing of small datasets, followed by scalable processing in PySpark for large-scale analyses.
Optimized BigQuery SQL queries to handle large-scale joins and aggregations, improving dashboard refresh rates and providing near real-time customer churn insights.
Orchestrated robust schema evolution strategies in BigQuery and Dataflow, seamlessly adapting pipelines to new data sources such as 5G usage patterns, ensuring data compatibility and minimizing manual adjustments.
Automated data ingestion and batch transformation processes leveraging Dataflow pipelines and GCS triggers, ensuring real-time data updates in BigQuery for predictive churn modeling and downstream BI.
Employed aggregation techniques in PySpark and BigQuery to compute key metrics like monthly usage trends and customer support calls, enhancing churn prediction accuracy.
Collaborated on building reusable data pipeline templates in GCP for future predictive analytics projects, accelerating onboarding of new data sources by 40% and promoting engineering efficiency.
Enhanced data governance by implementing data lineage tracking, auditing, and RBAC-controlled access.
Collaborated closely with marketing and product teams to translate churn risk data into tailored retention strategies, using insights derived from Power BI visualizations.
Work Environment: Azure Data Factory, Azure Databricks, Azure SQL, GCP BigQuery, GCP Dataflow, Airflow, Power BI, Google Data Studio, PySpark, Pandas, NumPy, GitHub, Terraform
JPMorgan Chase
Data Engineer June 2023 – June 2024 St. Louis, MO
Leveraged Python (Pandas, NumPy) and Apache Spark to process 6 months of customer interaction logs, reducing analysis time from hours to minutes and enhancing decision-making for product features.
Automated daily ETL pipelines using Airflow and Jenkins (CI/CD), integrating data from Snowflake and AWS S3, ensuring near-real-time insights for customer experience improvements and operational efficiency.
Implemented Terraform scripts to provision AWS infrastructure (S3, Redshift, IAM), streamlining deployments, ensuring consistent environments, and maintaining stringent security compliance.
Engineered and orchestrated Python data transformation scripts with Airflow DAGs and Jenkins CI/CD pipelines, maintaining data accuracy and workflow stability across multiple environments through robust dependency management and error handling.
Applied PCI DSS and SOC 2 compliance standards and IAM access controls in the design and implementation of data solutions within Snowflake and AWS services, ensuring sensitive customer data protection.
Managed scalable data storage and analytics by integrating AWS Redshift with Snowflake, enabling secure, high-performance queries for historical customer transactions and supporting advanced trend analysis.
Collaborated cross-functionally using Jira (Agile workflow) for task tracking and GitHub for version control, ensuring transparent collaboration and rapid iteration of data solutions.
Automated the creation of AWS resources (S3, Redshift, IAM) using Terraform, significantly reducing manual errors and increasing deployment consistency.
Designed and deployed scalable SQL models for performance tracking dashboards in Power BI and Tableau.
Utilized Terraform and Jenkins to build secure AWS environments, ensuring compliance with SOC 2 and PCI DSS.
Created and optimized feature extraction pipelines to support downstream analytics and AI model readiness.
Managed metadata and schemas across cross-functional teams to support dynamic analytics use cases.
Developed interactive dashboards in Power BI to highlight customer engagement and drop-off trends, sourcing data through complex SQL queries from Snowflake.
Employed Apache Spark and Snowflake to analyze large volumes of structured and unstructured data, enabling advanced trend analysis and predictive insights for marketing strategies.
Engineered Python-based feature usage analysis to identify underutilized app features, driving product redesigns and personalized marketing campaigns.
Utilized Power BI to create executive-level visualizations of mobile deposit and bill payment trends, transforming SQL-based insights into actionable recommendations.
Conducted feature engineering and visualization using Python (Seaborn, Pandas), with data fetched from Snowflake via SQL, driving product feature enhancements.
Enabled secure, governed analytics workflows by integrating IAM access controls into Terraform-based AWS deployments, ensuring only approved users accessed sensitive data.
Work Environment: Python (Pandas, NumPy), Apache Spark, AWS S3, Redshift, Snowflake, Terraform, Jenkins, Airflow, Tableau, Power BI, GitHub, Jira
Byjus India
Associate Data Engineer July 2020 – March 2022
Developed Python (Pandas, NumPy) scripts to identify and merge duplicate student records, ensuring accurate and reliable analytics on Byjus’s large-scale datasets.
Built automated data aggregation pipelines in SQL and Python, providing key insights such as average quiz scores and video engagement for thousands of students and ensuring data freshness for reporting.
Utilized Airflow to schedule daily data ingestion and transformation jobs, keeping dashboards up to date with the latest student performance data and maintaining pipeline reliability.
Set up and managed cloud infrastructure resources using Terraform, automating the deployment of data warehouses and ensuring consistent development and production environments.
Leveraged Snowflake's schema evolution features to accommodate new data fields as Byjus's interactive videos and other innovative features were introduced, maintaining data model adaptability.
Conducted extensive data cleaning and deduplication using advanced SQL queries and Python scripts, preparing high-quality data for analysis and reporting at scale.
Enhanced the reliability of data pipelines by integrating Jenkins to automatically test and deploy Python scripts and dashboards, supporting CI/CD practices.
Documented data cleaning strategies, Python workflows, and dashboard logic in Jupyter Notebooks, ensuring clarity, reusability, and efficient knowledge transfer within the team.
Created detailed weekly engagement and performance reports using a combination of Excel, SQL, and Tableau, delivering insights to both teachers and management.
Automated data workflows by integrating Airflow with Snowflake and AWS resources, reducing manual work and ensuring timely delivery of accurate data.
Collaborated cross-functionally using Jira to track and manage tasks such as dashboard updates, data cleaning tasks, and new analytics requirements.
Conducted quick ad hoc data analyses in Excel and Python to answer urgent business questions and support management decisions.
Work Environment: Python (Pandas, NumPy), SQL, Airflow, Terraform, Jenkins, Snowflake, Excel, Tableau, AWS
DXC Technology
Data Analyst July 2017 – Nov 2020
Assisted in data cleaning and transformation using Python libraries such as Pandas and NumPy to ensure consistency and accuracy in customer purchase data.
Wrote basic SQL queries to pull data from MySQL and SQL Server databases, supporting deeper analysis of sales trends and customer behavior.
Created simple data visualizations in Excel and Google Sheets to compare monthly sales performance and identify peak sales periods.
Used Google Analytics to gather insights on website traffic sources, helping the team understand which channels brought in the most visitors.
Collaborated with senior team members to design interactive dashboards in Tableau, learning how to present key sales metrics and trends effectively.
Documented data analysis workflows in Jupyter Notebooks, providing step-by-step records of how data was cleaned and analyzed for reports.
Generated automated summary reports in Excel that captured key findings, reducing the time needed for manual data compilation.
Participated in the development of Power BI reports to enable managers to filter and drill down into regional sales performance.
Learned to merge insights from Google Analytics and SQL data to better understand customer acquisition and retention patterns.
Contributed to creating visually appealing charts in Matplotlib and Seaborn for team presentations, translating data insights into clear visuals.
Collaborated with team members to ensure data integrity throughout the project, cross-checking data sources and updating documentation in Jupyter Notebooks.
Gained experience working in a collaborative environment, supporting cross-functional teams in using data-driven insights to enhance marketing strategies.
Work Environment: SQL (MySQL, SQL Server), Python (Pandas, NumPy), Excel, Tableau, Google Sheets, Google Analytics, Jupyter Notebooks
ACHIEVEMENTS
Built automated data pipelines in GCP to process customer data daily, enabling real-time churn insights and boosting marketing success by 15%.
Automated ETL pipelines with Airflow and Jenkins (CI/CD), integrating data from Snowflake and AWS S3, reducing analysis time from hours to minutes, and significantly improving customer experience insights.
Accelerated onboarding of new data sources by 40% by building reusable data pipeline templates in GCP, demonstrating efficiency in expanding data platform capabilities.
Optimized data transformation workflows using PySpark and Dataflow, resulting in a 40% reduction in data processing time for massive datasets.
Implemented Terraform scripts for cloud infrastructure provisioning (AWS), leading to streamlined deployments and improved security compliance.
EDUCATION
• Southeast Missouri State University, Cape Girardeau, Missouri
Master's degree in Computer and Information Systems.
• JNTUH, Hyderabad, India
Bachelor of Technology in Engineering.