Damion Lawson Irving
Brooklyn, NY 347-***-**** ******.******@*****.*** LinkedIn: linkedin.com/in/damionirving GitHub: github.com/damionirving
PROFESSIONAL SUMMARY
Innovative Data Scientist with 10+ years of experience applying statistical modeling, machine learning, and data engineering to complex business challenges. Extensive expertise in Computer Vision, Large Language Models (LLMs), Generative AI, and AI agents. Proven ability to develop and deploy advanced NLP solutions, scalable AI systems, and end-to-end machine learning models, and to optimize data pipelines. Skilled in Python, SQL, ML, DL, MLOps, and advanced analytics, with a track record of driving data-driven decision-making and collaborating across teams to deliver impactful results across diverse industries. Developed expertise in RAGAs for multi-document summarization; compared RAG pipelines with Model Context Protocol (MCP)-driven context integration for scalable, protocol-based access.
TECHNICAL SKILLS
Programming Languages and Libraries
Python: NumPy, SciPy, Pandas, Matplotlib, Scikit-learn, PyTorch, TensorFlow, Keras
C++: STL, Boost, MPI
C
SQL
NoSQL: MongoDB, DynamoDB
JavaScript
MATLAB
Streamlit and Plotly
Machine Learning and AI
LLMs: GPT-4o, Claude, Llama, XLM-RoBERTa, BERT, RoBERTa
NLP, RAG, Prompt Engineering, Semantic Search
Deep Learning: CNNs, RNNs, LSTM
Reinforcement Learning: OpenAI Gym, RLHF
Model Calibration: Platt Scaling, Isotonic Regression
Recommendation Systems
Hyperparameter Optimization
Probabilistic Modeling, Statistics, Gaussian Processes
DevOps and MLOps
CI/CD
Docker
Kubernetes
Airflow
MLflow
ETL/ELT Pipelines
REST API and Flask
Celery
Jupyter
Git
Linux Bash
Cloud and Data Engineering
AWS: EC2, Lambda, SageMaker, Bedrock, Glue, EMR, S3, API Gateway, CloudFormation, DynamoDB
Azure: OpenAI, AI Foundry
Snowflake
Spark
Hadoop
Tools and Frameworks
LangChain
LangGraph
LangFlow
Neo4j
Elasticsearch
PostgreSQL
OpenVINO
Mathematics and Statistical Modeling
Regression: Linear, Logistic, Nonlinear, GBM, XGBoost
Classification: SVM, Random Forest, Decision Trees, KNN, Neural Networks
Clustering: K-Means, Hierarchical, Mixture Models
Time Series: AR, ARMA, GARCH, SVR
Optimization: Genetic Algorithms, Simulated Annealing, Particle Swarm
MLE, Monte Carlo, Bootstrapping
Bayesian Experimental Design
Advanced Technical Skills
Probabilistic Graphical Models, Bayesian Inference: MCMC, PyMC, MAP, Hidden Markov Models
Hyperparameter tuning, Ensemble Learning, Feature Selection with Metaheuristics
Out-of-core ML, Dynamic Bayesian Networks
WORK EXPERIENCE
CHOREOGRAPH – Director of Data Science (Contract) New York, NY USA Apr 2025 – Jun 2025
Spearheaded the delivery of an ML Federated Learning solution for collaborative learning in a short-term contract engagement
Applied Bayesian marketing mix modeling (MMM) to estimate channel ROI, reporting credible intervals instead of single point estimates
Led discussion and design of an enterprise-level, cloud-based Federated Learning solution supporting (1) multiple ML models and (2) client data privacy requirements (HIPAA, GDPR, etc.)
Routed and orchestrated model training and aggregation across multiple data asset silos, moving only aggregated model weights rather than the underlying data
Built a prototype incorporating all of the above and supervised the production pipeline build on Snowflake
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, XGBoost, boto3
VERIZON – Senior Machine Learning Engineer New York, NY, USA Apr 2023 – Aug 2024 Concurrent role with Guardian Life Insurance (Contract)
Engineered and deployed Large Language Models (LLMs) and specialized NLP solutions across cybersecurity operations, enhancing threat detection and response
From late 2023, used DeepEval to evaluate candidate LLMs for relevance, factual accuracy, and robustness, and integrated the evaluations into CI/CD workflows
Designed and implemented pre-emptive threat analysis systems using graph databases and Retrieval-Augmented Generation (RAG) for comprehensive, real-time threat intelligence
Fine-tuned models with RLHF and LoRA for a full performance comparison
Leveraged embeddings and vector databases for accurate retrieval of semantically similar data
Developed and integrated AI Agents to automate the generation of actionable threat intelligence
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Flask RESTful API, Docker, Swarm, PostgreSQL, boto3, Neo4j, LangChain, LangGraph, OpenAI, Transformers, PyTorch, nltk, AWS Lambda, SageMaker, API Gateway, EC2, Bedrock, Elasticsearch
LLM Models: GPT-4o, GPT-4, XLM-RoBERTa, Llama 3, Anthropic Claude 3
Techniques: Prompt Engineering/Templating, Optimized Cypher Queries, RAG, AI Agents
Data Integration: Aggregated public and proprietary client data sources, including NVD, MITRE, OSINT, and IOC, into a unified graph database
GUARDIAN LIFE INSURANCE – Senior Data Scientist (Contract) New York, NY, USA May 2023 – Aug 2023 Concurrent role with Verizon
Short-term contract engagement to deliver a GenAI chatbot-driven customer journey with documentation
Created innovative health industry claims POCs using Generative AI and NLP models
Planned and implemented Large Language Model (LLM) solutions across the organization
Used prompt engineering and Retrieval-Augmented Generation (RAG) for semantic search
Collaborated with the Automation team on Data Science implementation in an existing product pipeline
Cloud: AWS (Lambda, API Gateway, Bedrock, DynamoDB); Azure (OpenAI, AI Foundry)
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, boto3, Azure, OpenAI, Transformers, PyTorch, nltk
DIGITALWARE – Director of Data Science New York, NY USA Jan 2019 – Jun 2022
Led strategy and cross-functional teams (30 percent) and developed AI/ML solutions (70 percent)
Developed patented Reinforcement Learning technology to predict cybersecurity attacks
Leveraged advanced Neo4j Cypher queries and visualizations to uncover underlying insights
Developed NLP solutions using GenAI and ML models to classify cybersecurity vulnerability data
Deployed production models to AWS, on-prem, and Intel Core and Neural Compute Stick architectures
Developed with the Intel OpenVINO deep learning toolkit for edge and IoT deployments
Built computer vision analytics for slip-and-fall detection, weapons detection, and real-time ad delivery
Built a facial recognition platform for customer management, mitigating $100M in annual fraud losses
Gained experience in MLOps development, such as building CI/CD pipelines for model deployment
Programmatically leveraged GPT-3 via the OpenAI API and RoBERTa via Hugging Face (2022)
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Flask, Docker, PostgreSQL, boto3, Neo4j, OpenAI Gym, Transformers, PyTorch, TensorFlow
FRAUD.NET – Lead Data Scientist (Contract) New York, NY USA Jul 2018 – Oct 2018
Short-term contract engagement to provide a client-facing Fraud Detection solution
Developed deep learning fraud detection models
Developed cost-sensitive fraud detection models that leverage transaction-level data
Worked on feature learning methodologies
Introduced and developed high-cardinality data representations using neural network embeddings
Built end-to-end Data Science platform supporting rapid development of predictive fraud models
Deployed these models to AWS SageMaker, Lambda, and Kubernetes via Docker
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, boto3, TensorFlow
J P MORGAN CHASE – Machine Learning Engineer (Contract) Jersey City, NJ USA Jan 2018 – Jun 2018
Short-term contract engagement to investigate and implement ML to streamline Retail Banking
Worked with a team of Data Scientists and Data Engineers toward creating next-level solutions
Evaluated ML and infrastructure solutions for a portfolio of business problems in the banking space
Designed and developed scalable machine learning solutions across different LOBs
Built personalized models using micro-targeting and model stacking of classical ML algorithms
Built and deployed NLP solutions using Word2vec + LSTM, BERT, RoBERTa via TensorFlow
Led detailed product assessment for AWS SageMaker, while working with AWS SMEs
Engineered Machine Learning pipelines to solve business problems using AWS SageMaker
Led Data Protection, ML modelling, and Public Cloud implementation best practices
Worked on Hadoop query optimization moving from Hive SQL to Spark SQL
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, PostgreSQL, MongoDB, boto3, nltk, TensorFlow
ARGO GROUP – Senior Data Scientist (Contract) New York, NY USA May 2017 – Nov 2017
Developed Natural Language Understanding Algorithms for Insurtech applications
Used Statistical Modeling, NLP, and Machine Learning algorithms in end-to-end applications
Developed AWS Big Data Pipeline for ETL and NLP algorithms using Serverless design
Leveraged AWS: Lambda, EC2, ECS, Docker Containers, Elastic Beanstalk, Elastic Load Balancer, RDS, Elasticsearch, SNS, SQS, CloudWatch, Spark via EMR
Built a near real-time pipeline using Lambda and Step Functions to process financial documents for NLP and NLU: sentiment, classification, and topic extraction
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, PostgreSQL, boto3, nltk, TensorFlow
SCIENAPTIC SYSTEMS – Lead Data Scientist (Contract) New York, NY USA Mar 2017 – Apr 2017
Short-term contract engagement to deliver ML-based Fraud Detection
Developed a compliance-ready credit card kiting predictive model using consumer credit data
Worked on ETL and pipelining into preprocessing and analytics using Apache Spark
Used ensemble methods for feature selection on a large feature set
Used appropriate sampling methods, based on domain knowledge
Built and validated predictive models (Linear Regression, Random Forest, GBM, Deep Learning)
Achieved low capture-rate via post-model optimization: FPR 0.18 percent, recall (RCL) 94 percent, accuracy (ACC) 99.8 percent
Contributed to large scale anomaly detection for banking client Barclays
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker
MESSINA QUANTITATIVE RESEARCH – Data Scientist New York, NY USA Nov 2014 – Jun 2016
Developed statistical and ML political forecasting models that outperformed traditional polls
Delivered predictive models that provided actionable insights into market patterns and consumer preferences
Advocated microservices architecture and standardized DevOps practices
Built a scalable IPython Cluster for CPU-intensive tasks on heterogeneous VMs
Built infrastructure for automation of routing, preprocessing, model execution, and DB
Built MLOps-ready ETL/ELT and CI/CD flows for fast model training and deployment cycles
Contributed to survey design and integrated sampling in a decision-theoretic framework
Leveraged R, Python, and C++ to implement statistical tools, ML models, and scalable systems
Communicated technical concepts to non-technical teams, bridging technical and business goals
Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker
PROFESSIONAL DEVELOPMENT & TRAINING
AI RESEARCH & DEVELOPMENT – Generative AI Self-Directed Study New York, NY USA Sep 2024 – Mar 2025
Dedicated sabbatical to mastering GenAI, LLM, and RAG solutions, with a focus on applied research and end-to-end prototype development
Built production-grade applications leveraging GPT-4o, Claude, and Llama for NLP and AI agent use cases
Designed and implemented RAG pipelines using LangChain, vector embeddings, and advanced prompt engineering techniques
Integrated AWS Bedrock, SageMaker, and Lambda to deploy scalable, cloud-based AI solutions across various domains
AI & DATA SCIENCE PROFESSIONAL DEVELOPMENT – Self-Directed Study New York, NY USA Jul 2022 – Mar 2023
Completed intensive upskilling in Generative AI and LLM fine-tuning
Enhanced skills in AWS SageMaker and cloud-based deployment strategies
Developed prototype NLP applications using GPT-3, Claude, and Llama for semantic search and summarization
Gained introductory exposure to diffusion models for computer vision
PUBLICATIONS
Y. Liu, D. Irving, W. Qiao, D. Ge, R. Levicky, Kinetic Mechanisms in Morpholino-DNA Surface Hybridization, Journal of the American Chemical Society 2011, 133, 115**-*****
D. Irving, P. Gong, R. Levicky, DNA Surface Hybridization: Comparison of Theory and Experiment, Journal of Physical Chemistry B 2010, 114, 7631-7640
PATENTS
Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Graphics processing unit optimization. U.S. Patent No. US20230032249A1
Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Systems and methods for providing data privacy using federated learning. U.S. Patent No. US20230056706A1
Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Systems and methods for applying reinforcement learning to cybersecurity graphs. U.S. Patent No. US20230034303A1
PRESENTATIONS
D. Irving, R. Levicky, DNA surface hybridization: An overview of theory, experiment, and modeling, Clorox Research and Development Center, Pleasanton, CA, October 5 2012
EDUCATION
Ph.D. Chemical Engineering, NYU Tandon School of Engineering, Brooklyn, NY (Jun 2012) All But Dissertation
Dissertation Title: Surface-Based DNA Hybridization Kinetics: Experiment and Modeling
Developed and co-authored biophysical electrostatic equilibrium model for surface hybridization
Developed kinetic model for surface hybridization incorporating electrostatics and probe interactions
Estimated kinetic parameters through nonlinear regression of ODE and DAE models using C++ and MATLAB
B.S. Chemical Engineering, Polytechnic University, Brooklyn, NY (Jan 2006)
LANGUAGE
English — Native / Fluent (Spoken & Written)