Data Analyst - ML and BI Specialist with 3+ Years

Location:

United States

Posted:

February 04, 2026

Resume:

NANDINI KONGANI

Email: ****************@*****.*** Cell: +1-475-***-**** LinkedIn: https://www.linkedin.com/in/nandinikongani/

Summary

Data Analyst with over 3 years of experience in data analysis, machine learning, and data visualization. Proficient in SQL, Python, and Power BI for data processing, analysis, and reporting. Experienced in working with large datasets, identifying trends, and delivering actionable insights. Strong background in building automated reporting systems, collaborating with cross-functional teams, and ensuring data quality and consistency across business initiatives.

Technical Skills

Programming & Analytics: Python, SQL, Spark SQL, PySpark, Power BI, Data Modeling, Semantic Layers, Data Quality (Great Expectations), Git, CI/CD

Cloud Platforms & Data Engineering: Azure Data Factory, Azure Databricks, Azure Synapse, ADLS Gen2, Apache Spark, Delta Lake, ETL/ELT, Airflow, Apache Kafka

Machine Learning & AI: Feature Engineering, ML Pipelines, Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Hugging Face, Model Evaluation

Data Visualization & Tools: Tableau, JMP, matplotlib, seaborn

Big Data: Hadoop ecosystem (Spark, HDFS, Presto, etc.)

Work Experience

Data Engineer at Zoetis

July 2025 – January 2026

Global Manufacturing & Supply Chain – Data Analytics Engineer

• Designed and implemented 15+ parameterized ingestion pipelines using Azure Data Factory and Databricks, optimizing data availability and reducing maintenance by ~30%.

• Built semantic layers and 20+ analytical views in Azure Synapse, powering Power BI dashboards for executive KPI tracking.

• Led migration of 50+ Databricks notebooks, improving metadata standardization and access control.

• Developed Apache Kafka-based real-time ingestion pipelines, processing 5M+ events daily and reducing latency by ~70%.

• Built a retrieval-augmented generation (RAG) prototype using Hugging Face, enhancing data-driven insights.

Data Engineer at Stevens Institute of Technology

September 2024 – May 2025

Customer 360 & Real-Time Analytics – Data Platform & Quality Engineering

• Redesigned the real-time ingestion pipeline, reducing data latency from 4 hours to 15 minutes for Customer 360 analytics.

• Built 12 automated data quality checks using PySpark and Great Expectations, eliminating manual steps and improving data quality by 90%.

• Delivered Power BI dashboards and analytics-ready datasets, supporting decision-making for 10+ stakeholders.

• Defined monitoring strategies that improved pipeline uptime to >99.5% and reduced failure detection time by ~60%.

Data Engineer at Great Learning

May 2022 – May 2023

Scalable Analytics, Machine Learning & Generative AI Enablement

• Built scalable, high-reliability data pipelines processing 10M+ records daily with >99.9% reliability.

• Developed feature-engineering functions in PySpark, reducing refactoring efforts and accelerating ML model deployment.

• Optimized cloud usage, saving $180K/year and improving query performance by ~30%.

• Built a proof-of-concept RAG workflow, reducing time spent locating content by ~65% for internal teams.

Data Engineer Intern at Tuzen Tech Solutions

September 2022 – February 2023

Transaction Analytics, Data Reliability & AI Enablement

• Developed monitoring and validation systems for transaction analytics, ensuring 1-hour SLA compliance.

• Prototyped an ML-based anomaly detection system, reducing manual log review time by ~30%.

Projects

Gen AI Application

• Developed a retrieval-augmented generation (RAG) prototype using Hugging Face and vector search for business question answering.

• Designed and implemented prompt templates to enhance internal AI assistant accuracy.

Customer 360 Executive Initiative

• Redesigned the real-time ingestion pipeline, reducing latency from 4 hours to 15 minutes and enabling near-real-time analytics for customer data.

• Built and deployed automated data quality checks using PySpark and Great Expectations for improved business decisions.

Scalable Analytics for Machine Learning

• Built scalable end-to-end data pipelines processing 10M+ records daily with >99.9% reliability.

• Developed feature-engineering functions in PySpark, reducing ML model deployment time by ~4 weeks.

Real-Time Transaction Analytics

• Created end-to-end monitoring for a transaction analytics pipeline, ensuring high reliability and SLA compliance.

• Developed an ML-based anomaly detection system, streamlining operational processes.

Education

Master of Science in Computer Science

Stevens Institute of Technology, May 2025

Bachelor of Engineering in Computer Science

Institute of Aeronautical Engineering, India, May 2023

Certifications

Databricks Certified Data Engineer Associate

Databricks Certified Data Engineer Professional
