Machine Learning Data Analyst

Location:

Trenton, NJ

Posted:

October 15, 2025


Resume:

NIHAR DUGADE

New Jersey | Open to Relocation

Phone: 551-***-**** Email: *************@*****.***

LinkedIn: https://www.linkedin.com/in/nihar-dugade/ GitHub: https://github.com/NIHARDUGADE

Summary

Results-driven Data Analyst and aspiring Data Engineer with hands-on experience building scalable data pipelines, automating ETL workflows, and deploying machine learning models in cloud environments. Proven success transforming high-volume, complex datasets into actionable insights that improved uptime, reduced latency, and enhanced decision-making across departments. Adept in tools like Apache Spark, Kafka, AWS, and Snowflake, with a track record of reducing reporting time by 40% and unplanned outages by 20%. Known for combining technical depth with stakeholder collaboration to deliver data solutions that drive measurable business value.

Technical Skills

Tableau, AWS QuickSight, Apache Airflow, DBT, Git, Kafka, AWS (S3, Glue, Athena, Lambda, SageMaker), Snowflake, Amazon Redshift, PostgreSQL, MySQL, NoSQL, Python, R, SQL

Projects

Retail Sales & Apple Stock Time Series Forecasting GitHub

●Built SARIMA and ARIMA models (Box-Jenkins method) that improved on baseline AIC/BIC, forecasting 12-month retail sales and 30-day AAPL prices.

●Developed an end-to-end Python pipeline (pandas, statsmodels) to clean 30 years of retail data and 10 years of stock data and automate grid search, cutting prep time by 40%.

●Delivered dashboards and briefs with forecasts validated by ADF and Ljung-Box tests, giving planners and portfolio managers actionable insights.

HR Attrition Analysis GitHub

●Built an end-to-end HR attrition analysis pipeline using Python to identify key drivers of employee churn, achieving a model ROC-AUC of 0.83.

●Engineered performance- and engagement-based features (e.g., tenure buckets, pay equity scores) and conducted EDA to uncover high-attrition segments such as early-tenure and overtime-heavy roles.

●Visualized and communicated actionable insights through correlation heatmaps and stacked bar charts, recommending strategies to reduce attrition from 16% to ~11%.

Future Stock Prediction: Image-Driven Approach GitHub

●Designed and optimized CNN models using GPU-accelerated training on image-based stock data, enhancing S&P 500 prediction accuracy and model efficiency.

●Explored autoencoders and Gramian Angular Field transformations for enhanced feature extraction.

Experience

Data Analytics & Machine Learning Trainee Jul 2025 - Present

ElevateMe Columbus, Ohio

●Completed structured training with 70+ hours dedicated to capstone projects focused on data analytics and machine learning solutions.

●Built and evaluated machine learning models using Python libraries including Pandas, NumPy, Scikit-learn, Matplotlib, and Seaborn within Jupyter Notebooks.

●Applied data preprocessing, feature engineering, and model tuning techniques to enhance model accuracy across regression, classification, and clustering tasks.

●Deployed machine learning models using Microsoft Azure services such as Azure ML and Azure SQL Database for real-time data insights and predictions.

Data Engineer Jan 2025 - May 2025

Cantonica New York, NY

●Developed a scalable event tracking system using Apache Kafka and MongoDB for high-throughput data pipelines.

●Containerized services with Docker, reducing deployment time by 30%.

●Integrated Spark Streaming for real-time data processing, achieving 99.9% service uptime.

●Collaborated with cross-functional teams to optimize data architecture and refine reporting metrics, improving data-driven decision-making.

Data Analyst Jul 2021 - Jul 2022

LRA Packaging Remote, India

●Automated ETL refreshes with Python, SQL, and Apache Spark on AWS EMR, cutting extraction-to-dashboard latency by 40% and enabling near-real-time KPI tracking.

●Designed Snowflake data marts on Amazon S3 and built reusable SQL views/CTEs, enabling self-service analytics in Amazon QuickSight and reducing ad hoc query cycles by 35%.

●Implemented data quality audits via AWS Glue Data Catalog and Lake Formation, tightening validation rules and improving report accuracy by 30%.

●Analyzed high-volume machine telemetry streams through AWS MSK, surfacing downtime trends that lowered unplanned outages by 20%.

●Collaborated with marketing, sales, and operations teams to integrate disparate datasets, enhance dashboard storytelling, and accelerate data-driven decisions by 25%.

Education

Master of Science, Data Science Stevens Institute of Technology, Hoboken, NJ

Bachelor of Science, Physics DG Ruparel College, Mumbai, India

Certifications

●Machine Learning, Coursera
