Minghai (Timothy) Li, Ph.D. in Physics
Boston, MA 02215 • 617-***-**** • *******.***********@*****.*** • US citizen
LinkedIn: https://www.linkedin.com/in/minghai-tim-li-8618221a/
Summary:
Principal Data Scientist with 10 years of experience delivering end-to-end machine learning solutions, from data acquisition and feature engineering through model design, validation, optimization, deployment, and drift monitoring. Key projects include leading a pricing optimization system at Citizens Bank, yielding $500K in annual savings, and creating ML-driven equity execution systems at Fidelity Investments, saving $30M/year in trading costs. Proven track record delivering production-grade ML systems in finance, including pricing optimization, automated equity execution engines, anomaly detection, fraud detection, propensity modeling, pricing elasticity modeling, CECL modeling, risk modeling, Market Mix Modeling, and portfolio performance analytics.
Relevant Skills:
10 years’ experience across finance, consulting, and tech (plus 8 years in academic research and 2 years in leadership), covering machine learning, deep learning, reinforcement learning, NLP, statistical inference and modeling, Bayesian optimization, data mining, data analysis, data engineering, signal processing, and time-series forecasting.
Proficient in object-oriented programming and scientific computing with Python (NumPy, Pandas, Polars, Scikit-learn, Matplotlib), R, SQL, SAS, TensorFlow/Keras/PyTorch, H2O.ai, DevOps tools (Git), MLOps, and major cloud platforms (AWS SageMaker, Google Cloud, and Azure).
Professional Experiences:
Principal Data Scientist (VP), Citizens Bank, Boston, MA, 05/2023-12/2025
Led the development of a new loan pricing optimization (LPO) system for the Education Refinance Loan (ERL) portfolio, replacing the legacy FICO LPO platform and eliminating $500K in annual subscription costs. Partnered closely with the pricing team and finance stakeholders to ensure the solution aligned with business strategy and operational needs.
This LPO implements a highly complex optimization engine that combines a pricing elasticity model, volume forecasting, and a loan Profit & Loss model, and integrates regulatory, risk, profitability, and competitive constraints into a unified framework. The system optimizes high-dimensional, interdependent parameters to generate stable, scenario-specific pricing solutions, supports diverse business strategies, and improves the return–volume tradeoff while running efficiently on desktop-level compute (~10 minutes). Its modular architecture allows easy extension to additional loan products.
Developed end-to-end ML solutions for loan-pricing elasticity, credit-card delinquency, and ERL retention (Logistic Regression, XGBoost), covering data extraction, feature engineering, model training, validation, deployment, and drift monitoring; these improved marketing strategies and strengthened risk management.
Manager, Data Science, Publicis Sapient AI Labs (Consultant), Boston, MA, 06/2021-02/2023
Built CatBoost models to infer demographic attributes from noisy, unstructured inputs (handwriting, scanned images, PDFs), directly supporting a US government agency’s data quality initiative for consumer product safety reports. Collaborated closely with policy, compliance, and technical stakeholders to align model outputs with operational needs.
Developed CatBoost time-series models for customer cash-flow forecasting for a major U.S. bank, partnering with product, risk, and finance teams to translate modeling insights into actionable improvements for customer financial planning and risk management.
Built Market Mix Models to evaluate and optimize customer incentive programs for a national retailer; communicated findings to marketing and executive teams, enabling data-driven pricing decisions and improved promotional effectiveness.
Sr. Data Scientist, Fidelity Investments, Boston, MA, 06/2019-03/2021
Built core models for an automated equity execution system under noisy market conditions. Because each trade provides only one realized execution path while the optimizer must evaluate 1,000+ counterfactual paths, designed a two-model framework: a rich pre-trade model and a generalizable post-trade model to estimate costs across all hypothetical strategies. Integrated the models into a cloud-based optimizer that delivered ~$30M/year in savings (~1 bps) over the legacy system.
Ensured model performance under evolving market conditions by monitoring drift and recalibrating models, working closely with data engineering and business teams to deliver reliable insights that support customer-focused decision-making.
Sr. Data Scientist (Scrum Master), NetBrain Tech. Inc., Burlington, MA, 04/2018-06/2019
Led a team of five data scientists and collaborated closely with cross-functional stakeholders to design, build, validate, and deploy full-stack ML models for network anomaly detection. Delivered significant improvements in detection accuracy using LSTM, CNN, ARIMA, Holt-Winters, PCA, and statistical modeling, while mentoring junior staff and establishing best practices for model monitoring and drift management.
Implemented word embeddings with CNN and RNN models to analyze raw network text logs, enhancing anomaly detection and increasing troubleshooting efficiency.
Constructed correlation-analysis frameworks that facilitated faster root cause identification and improved network troubleshooting capabilities.
Developed a multi-modal, cross-device anomaly detection system using Isolation Forest and statistical methods, which increased AUC by ~20% and broadened detection coverage in a simulated network environment.
Sr. Data Scientist, Fidelity National Information Services (FIS), Burlington, MA, 06/2016-03/2018
Developed and delivered full-stack fraud detection systems for ACH payments using logistic regression and AdaBoost, which enhanced fraud detection accuracy and reduced false positives.
Applied advanced signal processing techniques (including Kalman filtering, wavelet transforms, Fourier spectral analysis, ARIMA modeling, and time-series decomposition) to extract robust fraud-detection signals from noisy customer data, enabling higher-precision ML classification with Platt-scaled probabilities.
Invented “influence diagrams” to visualize real AdaBoost model behavior, diagnose overfitting dimensions, and guide training improvements, leading to a ~100% increase in fraud dollars detected. Work accepted for presentation at ODSC East 2018.
Chief Data Scientist (Part-time), AInvest - AI Quant investment start-up, Quincy, MA, 04/2017-06/2019
Led a team of data scientists building portfolio return prediction models using LSTM, LightGBM, Random Forest, XGBoost, CatBoost, and Bayesian Optimization.
Deployed models live on AWS; achieved ~27% annual alpha return with a Sharpe ratio of ~2.8 on the China A-share market over two years.
Academic Experiences:
Research Associate, Physics Dept, Northeastern University, Boston, MA. 04/2014-03/2016
Worked on computer modeling and MD simulations of multi-scale complex biological systems; probed the kinetic properties of a complex system using statistical analysis, modeling, and physics theory.
Research Associate, School of Chemistry, Clark University, Worcester, MA. 07/2010-12/2013
Worked on computer modeling and MD simulations of protein folding.
Developed an algorithm for finding optimal reaction pathways of protein and RNA folding using the Max-Flux algorithm, graph theory, Dijkstra's algorithm, and statistical analysis.
Proposed a novel dimension reduction algorithm, an energy-based-kernel diffusion map, to improve embedding accuracy for complex systems.
Research Associate, Division of MSE, Boston University, Boston, MA. 04/2008-04/2010
Studied electronic and mechanical properties of conducting polymers using first-principles methods such as DFT, Hartree-Fock, and SSH, and discovered a novel actuator mechanism of conducting polymers.
Proposed a novel global optimization algorithm, the Self-learning Metabasin Escape Algorithm, for studying slow dynamics in glass systems.
Education:
Ph.D. in physics (polymer physics: computational & experimental), Boston University 05/2008
M.S. in physics (semiconductors, nanomaterials), Nanjing University 07/2001
B.S. in physics (semiconductor), Lanzhou University 07/1998
*For a list of 38 peer-reviewed journal publications, see https://www.researchgate.net/profile/Minghai-Li-2