Data Analytics Specialist with Cloud & ETL Focus

Location: New York City, NY

Posted: November 13, 2025

Resume:

Chaitya Sanghavi Data Analyst

+1-480-***-**** *****************@*****.*** linkedin.com/in/sanghavi-chaitya/

SUMMARY

Results-driven Data Analyst with 4+ years of experience designing and scaling data platforms, ETL pipelines, and ML-powered analytics workflows. Skilled in Python, SQL, Snowflake, Databricks, Spark, dbt, Airflow, and cloud platforms (AWS, GCP, Azure). Experienced in improving data availability, processing efficiency, and enabling data-driven decision-making and machine learning adoption.

TECHNICAL SKILLS

Programming Languages: Python, SQL, PySpark, Scala, Java, JavaScript

Data Engineering: dbt, Databricks, Apache Airflow, Apache Kafka, Apache Spark

Databases: Snowflake, Amazon Redshift, Google BigQuery, PostgreSQL

Analytics & ML: Tableau, Power BI, Looker Studio, TensorFlow, XGBoost, NLP

Cloud Platforms: AWS (S3, DynamoDB, EKS), Azure (Data Factory, Synapse, Cosmos DB), GCP (BigQuery, Data Fusion, Dataproc)

Certifications: AWS Solutions Architect - Associate

PROFESSIONAL EXPERIENCE

BlueGenAI, USA Jan 2025 - Current

Data Analyst

● Migrated SQL Server to PostgreSQL using Python/dbt pipelines, reducing onboarding time by 60% and improving analytics efficiency.

● Streamlined Databricks-Snowflake ETL workflows, improving data processing speed by 30% and cutting reporting delays by 40%.

● Unified CRM and MRM datasets with Airflow and PySpark, increasing data availability by 70% and accelerating reporting cycles.

● Automated validation, monitoring, and logging workflows in PySpark and Airflow, reducing downstream errors and production failures.

● Optimized SQL queries and schemas, improving dashboard responsiveness by 35% and delivering insights to leadership more quickly.

● Streamlined dbt deployments via CI/CD, shortening release cycles from 7 days to 2 days and standardizing environment configurations.

EdPlus, USA Jun 2023 - Dec 2024

Data Analyst

● Developed advanced SQL queries in BigQuery on Google Analytics data, improving accuracy by 50% and supporting decision-making.

● Built interactive dashboards in Looker Studio, enabling executive leadership to visualize and interpret student engagement trends.

● Automated Optimizely API data ingestion into BigQuery using Terraform, saving 3 hours weekly and streamlining analytics workflows.

● Conducted A/B tests across 10+ campaigns reaching 50K users, guiding marketing investment decisions and improving conversion rates.

● Consolidated multiple datasets into a centralized BigQuery warehouse, reducing redundancy and improving reporting consistency.

DXFactor Solutions, India Dec 2019 - Dec 2022

Data Engineer

● Designed and optimized Python, SQL, and dbt pipelines with Kafka, reducing runtime by 35% across finance and fitness datasets.

● Leveraged Databricks for large-scale Spark processing and integrated outputs with Snowflake, enhancing cross-platform data availability.

● Built Spark and PySpark connectors to transform raw data into structured schemas, reducing data extraction time by 55 minutes.

● Implemented automated validation workflows in Apache Airflow, reducing anomalies and improving overall system reliability.

● Created Flask REST APIs exposing 100+ KPIs in Tableau dashboards, enabling immediate executive analytics visibility.

● Led development of an ML pipeline to process 90 GB/day of video data, applying BlazePose for accurate body measurement estimation.

● Mentored engineers and interns, sharing best practices and improving collaboration across data engineering projects.

EDUCATION

Master of Science, Computer Science Dec 2024

Arizona State University, Tempe, AZ

Relevant Coursework: Cloud Computing, Database Management Systems, Distributed Database Systems, Mobile Computing

Bachelor of Technology, Computer Science May 2020

Ahmedabad University, Ahmedabad, India

Relevant Coursework: Data Structures and Algorithms, Data Analytics and Visualization, Linear Algebra, Big Data Systems, Operating Systems, Software Engineering, Machine Learning, Object-Oriented Programming

ACADEMIC / TECHNICAL PROJECTS

Misinformation Detection with Large Language Models: A Multi-Paradigm Approach Python

● Applied a transformer-based LLM (Llama-2 7B) using zero-shot, few-shot, and supervised learning, leveraging advanced NLP techniques for misinformation detection and text classification.

● Fine-tuned models, achieving an 87% macro F1 score and demonstrating proficiency in NLP, LLM architectures, and applied ML workflows.

Invoice Validation Using OCR Python, MySQL, OpenCV, TensorFlow, CNN, GAN

● Developed an invoice validation pipeline using Tesseract OCR, OpenCV, and CNNs for classification and information extraction.

● Reduced manual effort by 5 hours weekly through automation, improving accuracy and consistency in financial validation workflows.

Rental Autonomous Vehicles Python, AWS, MySQL, MongoDB, CARLA

● Built a web-based application for renting autonomous vehicles, integrating Python, AWS cloud services, and CARLA Simulator.

● Streamlined booking and simulation workflows, enabling real-time testing of autonomous driving scenarios and improving system scalability for multi-user environments.


