Data Analyst, Machine Learning Engineer

Location:
Detroit, MI, 48226
Posted:
October 02, 2025

Resume:

SOURAV AGRAWAL

LinkedIn 248-***-**** *****************@*****.*** GitHub

EDUCATION

MS in Data Science, State University of New York at Buffalo (GPA: 3.7/4) Dec 2024

Bachelor of Computer Applications, St Xavier’s College, India Apr 2021

SKILLS

Programming Languages and Databases: Python, R, SQL, MySQL, Oracle

Machine Learning: Supervised/Unsupervised Learning, NLP, Time-Series Analysis, Neural Networks, CNN, SVM, LightGBM

Cloud & Tools: Jupyter Notebooks, RStudio, GitHub, Overleaf, Google Colab, MS Office, Power BI, AWS

EXPERIENCE

Data Scientist at BNMC, Buffalo, NY Feb 2025 – Current

• Built predictive models to reduce machine downtime by 20% and operational costs by 5%. Implemented real-time sensor data retrieval and a processing pipeline using Socket.io and Node.js.

• Designed and fine-tuned Artificial Neural Networks and 1D CNNs, achieving over 90% accuracy.

• Worked in a highly cross-functional and agile environment, collaborating with vibration experts & engineers for incremental delivery. Led daily scrum calls to track progress & address blockers.

Research & Data Analyst at S.I.T.T.Y, Sacramento, CA May 2025 – July 2025

• Led a citywide transportation equity analysis using Python, GeoPandas, and GIS tools across multiple census tracts.

• Built predictive models for EV adoption and transit usage, achieving 90% accuracy against survey ground truth.

• Built a semi-automated Python pipeline to fetch and integrate data via API calls (U.S. Census, NREL EV charging, GTFS transit) and standardize all spatial data to a common CRS.

• Developed composite scoring models and performed cost-benefit analysis to identify high-priority neighborhoods for clean mobility investments, optimizing resource allocation for maximum impact.

• Created interactive dashboards enabling policymakers to target underserved communities effectively.

• Collaborated with cross-functional teams to design and prepare for community survey validation and hypothesis testing protocols.

Data Analyst at Interactive Manpower Solutions, India Mar 2022 – Aug 2023

• Analyzed complex healthcare data for external clients and integrated data from multiple sources to generate financial reports, helping them track revenue, outstanding payments, and cash flow trends, resulting in a 20% reduction in accounts receivable aging.

• Developed financial dashboards using Power BI, improving the visibility of key financial metrics.

• Wrote complex SQL queries to extract, manipulate, and analyze large datasets, and optimized them for performance, reducing data retrieval times by 25%.

• Wrote complex CTEs, window functions, and stored procedures to support advanced financial and operational analytics.

• Utilized pivot tables, VLOOKUP, PowerQuery, and other advanced Excel features to automate financial reporting, saving 8+ hours per week.

• Partnered with cross-functional teams to align data reporting with business objectives.

PROJECTS

AI-Powered Resume Ranking and Screening System EasyRecruit

A full-stack web application designed to automate initial recruitment stages by intelligently scoring and ranking candidate resumes against job descriptions, significantly reducing manual screening time.

Tech Stack: Python, Flask, React.js, Google Gemini API, NLP, SQLAlchemy, PostgreSQL, Render, Netlify, Git

• Engineered an advanced NLP pipeline using Google's Gemini Pro API for context-aware entity extraction, improving keyword identification by an estimated 40% over traditional matching and increasing the accuracy of candidate-to-job description scoring.

• Developed a robust Flask backend and REST API to process and score resumes, implementing a weighted algorithm that provided data-driven candidate rankings to reduce manual screening time by an estimated 75%.

• Architected and deployed the end-to-end application on a cloud platform, utilizing Render for the containerized backend and Netlify for the frontend, establishing a CI/CD pipeline that ensured high availability for concurrent users.

Home Energy Consumption Forecasting & Optimization

Developed a predictive model to forecast and optimize energy consumption in smart homes using time-series analysis and advanced machine learning models to identify potential cost savings.

Tech Stack: Python, NumPy, Pandas, Scikit-learn, LightGBM, Matplotlib, Seaborn

• Performed in-depth Exploratory Data Analysis (EDA) on time-series data to identify appliance-specific energy patterns and the impact of weather, engineering key features to improve model performance.

• Trained and evaluated multiple forecasting models, achieving superior accuracy with LightGBM (97.6% for high-load appliances), demonstrating the potential to reduce household energy costs by up to 15% through optimized usage schedules.

Breast Cancer Tumor Classification

Developed and compared multiple ML models to classify breast cancer tumors, with a focus on implementing a Support Vector Machine (SVM) from scratch to demonstrate a foundational understanding of optimization algorithms.

Tech Stack: Python, NumPy, Pandas, Scikit-learn, CVXPY

• Implemented and benchmarked Decision Tree, Naïve Bayes, and SVM to classify tumors, systematically evaluating performance to identify the most effective model for the dataset.

• Engineered a custom Support Vector Machine (SVM) from the ground up using Python and the CVXPY optimization library, achieving a final classification accuracy of 95.61% and showcasing a deep understanding of core machine learning principles.
