Virti Jain Email: ************@*****.***
https://virtirjain09.github.io/ https://www.linkedin.com/in/virti-jain-550020265/ Mobile: +1-934-***-****
EDUCATION
State University of New York at Stony Brook Stony Brook, NY
Master of Science in Data Science; GPA: 3.89 Expected Graduation – May 2026
Relevant Courses – Data Analysis, ML in QF, Statistical Learning, Case Study in ML and Finance, NLP
Rajiv Gandhi Proudyogiki Vishwavidyalaya Bhopal, India
Bachelor of Technology in Computer Science Engineering (Data-Science); GPA: 8.8/10 Dec 2020 – Jun 2024
Relevant Courses – DBMS, Linear Algebra, Probability & Statistics, ML, DL, Data Analytics and Visualization
EXPERIENCE
Stony Brook University Stony Brook, New York
Research Project Assistant for Dr. Sharon Nachman’s Lab Feb 2025 – Present
Investigated treatment outcomes for 127 patient cases, examining how various factors impacted treatment success. The analysis found that specific blood markers were strongly linked to the need for additional treatment, while NSAID use and joint inflammation had minimal impact.
Used Python for statistical analysis and hypothesis testing, and created data visualizations that communicated these findings, contributing to more informed treatment decisions.
Algoventor Solutions Pvt Ltd. Indore, India
Data Science Intern (Energy Consumption Forecasting for Smart Grids) Jun 2024 – Aug 2024
Developed energy forecasting models using ARIMA and machine learning algorithms, enhancing prediction accuracy by 15% and demonstrating expertise in time-series analysis.
Conducted exploratory data analysis (EDA) on a dataset with 15+ features, identifying key factors influencing grid performance. Streamlined data preprocessing by handling 1000+ missing values and eliminating outliers.
Created visual insights using Seaborn and Matplotlib, facilitating data-driven decision-making and showcasing ability to communicate complex data insights effectively.
PROJECTS
Quantitative Financial Forecasting Using Hybrid Machine Learning Models
Engineered a real-time stock market analysis tool in Python, integrating yfinance API for live data retrieval and enabling the analysis of various financial instruments.
Developed interactive visualizations with Plotly, served through a Streamlit interface, improving user engagement by 20%.
Optimized predictive performance by incorporating XGBoost and a hybrid CNN-LSTM-Random Forest model, achieving an 18% increase in forecast accuracy.
Loan Default Risk Prediction
Constructed a machine learning pipeline to assess loan default probability, applying advanced feature engineering techniques to reduce the feature space by 35%.
Trained logistic regression, random forest, and gradient boosting models to predict loan default risk, optimizing AUC-ROC scores and evaluating model performance using various metrics.
Executed statistical hypothesis tests and EDA to pinpoint high-risk factors, identifying key predictors that raised default likelihood by 40% and providing actionable insights for financial risk management.
Statistical Modeling & Financial Data Analysis
Prepared time-series forecasting models (ARIMA, LSTMs), increasing asset price prediction precision by 18% and demonstrating expertise in financial modeling.
Conducted statistical analyses on historical market data, including volatility analysis and correlation analysis, and implemented Monte Carlo simulations to evaluate risk scenarios such as tail risk and drawdowns.
Enhanced risk assessment and reduced unexpected losses by 12% through statistical modeling and data analysis techniques.
SKILLS
Programming: Python (Pandas, NumPy, Matplotlib, TensorFlow, scikit-learn), SQL, R, MATLAB
Quantitative Modeling: Statistical analysis, time-series forecasting, backtesting, Monte Carlo simulations, Machine Learning (Regression, SVM, kNN, K-Means, Random Forest), Deep Learning (CNN, RNN)
Data Tools: Excel, Streamlit, Power BI, Tableau, Looker, MongoDB, Jupyter Notebook, MySQL