
Data Analyst and BI Engineer

Location:
Newark, NJ
Posted:
June 20, 2025


Resume:

Shantanu Shrikhande

Project Manager

United States 551-***-**** **********************@*****.*** linkedin.com/in/sss3110 https://github.com/shanunited

Summary

AI/ML Data Enthusiast with 4 years of experience building ML pipelines, managing cross-functional AI initiatives, and delivering business-focused solutions. Adept at solving ambiguous problems in fast-paced, mission-driven environments using Python, SQL, and cloud-based tools. Passionate about building data products that improve user experiences and decision-making in healthcare.

Work Experience

ML Engineer Intern Theoremlabs.io Charlotte, North Carolina, USA Oct 2024 – Present

●Defined a modular architecture for LangGraph-based agents, establishing workflows for field detection, rule mapping, and validation, which accelerated release cycles by 40%.

●Automated the identification and mapping of UI behavior to business rule (BR) attributes by parsing JS files and aligning extracted logic with XML metadata and BR configuration schemas.

●Designed a feedback loop to generate detailed JSON reports for human validation, incorporating domain expert input to refine rule accuracy and feed into a learning knowledge base.

●Implemented modular LangGraph nodes for field detection, BR inference, report generation, and human-in-the-loop validation, enabling continuous rule refinement and knowledge enrichment.

●Streamlined business rule migration by translating legacy JavaScript logic into standardized BR definitions, enhancing traceability and regulatory compliance.

Lead Data Analyst New Jersey Institute of Technology Newark, New Jersey, USA Apr 2023 – May 2024

●Led end-to-end development of 5+ Power BI dashboards to monitor student performance and automate placement decisions, streamlining operations for cross-functional academic teams.

●Engineered a structured Excel-based data pipeline supported by Python for cleaning and transforming Mobius scorecard data, cutting manual processing time by 15%.

●Mentored a 7-member analyst team on dashboard building and performance tuning, improving turnaround time on data requests and delivering consistent reporting deliverables.

●Handled 100+ student data-related queries, ensuring policy clarity and smooth onboarding through resolution mechanisms.

Data Scientist Bayer Pharmaceuticals (Capstone Project) Newark, New Jersey, USA Sep 2023 – Dec 2023

●Designed scalable ETL pipelines in Databricks (PySpark) to integrate multi-source healthcare sales data from Snowflake, enriching provider insights for strategic decision-making.

●Performed exploratory data analysis using Python and Spark to uncover purchase patterns, segment customers, and identify top-selling product combinations.

●Applied unsupervised machine learning (K-Means) to cluster customer behavior and define 10 actionable buyer personas.

●Delivered stakeholder presentations on patient trends and KPIs, enabling data-driven product recommendations in a healthcare setting.

●Developed Power BI dashboards to visualize KPIs and presented findings to Bayer leadership to drive strategic decisions.

Application Development Associate Accenture Mumbai, Maharashtra, India Apr 2021 – Aug 2022

●Spearheaded cross-functional coordination between product, engineering, and client stakeholders to implement a profit-maximization engine for ad bidding, improving campaign margin targets (15–45%).

●Used DBSCAN clustering to segment heterogeneous bidding behaviors, achieving a 0.75 silhouette score and enabling targeted bid strategies that stabilized performance at a 20% win-rate threshold.

●Managed concurrent model development and deployment workflows using AWS SageMaker and Lambda, proactively resolving blockers to maintain project timelines and improve inference response time to sub-200ms.

●Built a profit-maximization engine that balanced publisher payout constraints with client-defined profit targets (15–45% range), incorporating feedback loops from win-rate trends to iteratively fine-tune bid logic.

●Integrated model performance monitoring with CloudWatch, reducing anomaly detection lag by 65% and improving ML observability in production.

●Led the end-to-end rollout of A/B testing infrastructure for bidding logic with product teams, improving CTR by 18% and validating uplift via statistical rigor.

●Integrated monitoring and alerting using Prometheus and Grafana to track model performance drift in production, triggering automated retraining via Airflow DAGs.

Technical Skills

Programming languages: Python, SQL, PL/SQL, R, Scala, PySpark

Software Tools: Power BI, Tableau, MySQL, MS SQL Server, PostgreSQL, Snowflake, MS Excel, Power Automate/PowerApps

Data Engineering & Warehousing: Airflow (ETL), NumPy, Pandas, Data Modeling, Data Quality, Feature Engineering

Big Data and Cloud Platform: AWS (S3, Redshift, SageMaker, EC2, Bedrock, Lambda, QuickSight), Google Cloud Platform (BigQuery, Dataflow), Microsoft Azure (Azure ML, Azure Data Factory), Databricks, Hadoop, Apache Spark, Snowflake

GenAI & LLM Stack: Hugging Face Transformers, LoRA/PEFT, LangChain, LlamaIndex, Prompt Engineering, FAISS, Pinecone

Machine Learning & Statistical Analysis: Predictive Modeling, A/B Testing, Statistical Modeling, Reinforcement Learning, Natural Language Processing (NLP), Large Language Models (LLMs), Neural Networks, Time Series Analysis, XGBoost, Linear Regression, TensorFlow, PyTorch, scikit-learn, Keras, CNNs, RNNs, LSTM, GANs, Transfer Learning, Ensemble Models

Education

New Jersey Institute of Technology Aug 2022 – May 2024

MS in Data Science, GPA: 3.6 Newark, New Jersey, USA

University of Mumbai Aug 2016 – Oct 2020

BS in Electronics and Telecommunication Engineering Navi Mumbai, Maharashtra, India

Academic Projects

Stock Movement Forecasting Using LSTM Phi-data, Groq, VS Code

●Developed LSTM model on 100M+ NASDAQ auction records to forecast stock-index movement, improving prediction accuracy by 18% via time-aware split, median imputation, feature scaling, and model tuning.

Portfolio Optimization Python, Selenium, VS Code

●Developed a portfolio optimization model by calculating technical indicators, fitting K-Means clustering on S&P 500 stocks, and applying the Efficient Frontier to maximize the Sharpe ratio, achieving enhanced portfolio returns.

Stock Price Prediction Financial Modeling, DCF Analysis, Ratio Analysis, Excel

●Analyzed 3 quarters of Apple financial data using Excel and DCF modeling, calculating WACC, terminal value, and forecasting stock price at $201.28 vs. actual $198.66, up from $191 at project start.

Future Sales Prediction Python

●Developed a machine learning forecasting model using Random Forest and XGBoost, achieving a $2 deviation from actuals and an RMSE of 1.42, driven by insights from exploratory sales data analysis.

Face Recognition using Deep Learning Python, Keras, TensorFlow

●Engineered a 9-layer CNN with 120M parameters using TensorFlow and OpenCV to automate attendance, achieving 75% accuracy by matching real-time webcam frames against a pre-trained facial image dataset.

Zillow Data Analytics Python, Airflow, S3, EC2, Lambda, Redshift, QuickSight

●Constructed a Python ETL pipeline on AWS using Airflow (EC2) for orchestration, Lambda & S3 for processing, Redshift for storage, and QuickSight for visualization, enabling automated real estate data analysis and reporting.

U.S. Flight Delay Insights Python, Power BI

●Analyzed 170K+ U.S. flight records using Power BI and Python, leveraging DAX and Power Query to identify delay trends, root causes, and airline performance patterns through interactive dashboards and visual insights.

AI-Powered RAG System LangGraph, Groq, VS Code

●Developed an AI-powered document retrieval system by implementing a LangChain-based RAG pipeline using Groq Llama 3, Chroma DB, and Hugging Face embeddings.

Financial Document Analyzer LangGraph, Groq, VS Code

●Constructed an AI-powered PDF analysis system by implementing a Streamlit-based financial document chatbot using Gemini AI, FAISS, LangChain, and OCR-based text retrieval.

JS to Python Code Converter using Agentic AI LangGraph, Groq, VS Code

●Orchestrated an agentic AI-driven JavaScript-to-Python code conversion system by constructing a LangGraph workflow that uses the Groq Llama 3 API, producing accurate LLM-based translations.


