Machine Learning Data Scientist

Location:
Brooklyn, NY
Posted:
September 17, 2025

Resume:

Damion Lawson Irving

Brooklyn, NY 347-***-**** ******.******@*****.*** LinkedIn: linkedin.com/in/damionirving GitHub: github.com/damionirving

PROFESSIONAL SUMMARY

Innovative Data Scientist with 10+ years of experience leveraging statistical modeling, machine learning, and data engineering to solve complex business challenges. Extensive expertise in Computer Vision, Large Language Models (LLMs), Generative AI, and AI agents. Proven ability to develop and deploy advanced NLP solutions, scalable AI systems, optimized data pipelines, and end-to-end machine learning models. Skilled in Python, SQL, ML, DL, MLOps, and advanced analytics, with a track record of driving data-driven decision-making and collaborating across teams to deliver impactful results across diverse industries. Developed expertise in RAGAs for multi-doc summarization; compared RAG pipelines with Model Context Protocol (MCP)-driven context integration for scalable protocol-based access.

TECHNICAL SKILLS

Programming Languages and Libraries

Python, NumPy, SciPy, Pandas, Matplotlib, Scikit-learn, PyTorch, TensorFlow, Keras

C++: STL, Boost, MPI

C

SQL

NoSQL: MongoDB, DynamoDB

JavaScript

MATLAB

Streamlit and Plotly

Machine Learning and AI

LLMs: GPT-4o, Claude, Llama, XLM-RoBERTa, BERT, RoBERTa

NLP, RAG, Prompt Engineering, Semantic Search

Deep Learning: CNNs, RNNs, LSTM

Reinforcement Learning: OpenAI Gym, RLHF

Model Calibration: Platt Scaling, Isotonic Regression

Recommendation Systems

Hyperparameter Optimization

Probabilistic Modeling, Statistics, Gaussian Processes

DevOps and MLOps

CI/CD

Docker

Kubernetes

Airflow

MLflow

ETL/ELT Pipelines

REST API and Flask

Celery

Jupyter

Git

Linux Bash

Cloud and Data Engineering

AWS: EC2, Lambda, SageMaker, Bedrock, Glue, EMR, S3, API Gateway, CloudFormation, DynamoDB

Azure: OpenAI, AI Foundry

Snowflake

Spark

Hadoop

Tools and Frameworks

LangChain

LangGraph

LangFlow

Neo4j

Elasticsearch

PostgreSQL

OpenVINO

Mathematics and Statistical Modeling

Regression: Linear, Logistic, Nonlinear, GBM, XGBoost

Classification: SVM, Random Forest, Decision Trees, KNN, Neural Networks

Clustering: K-Means, Hierarchical, Mixture Models

Time Series: AR, ARMA, GARCH, SVR

Optimization: Genetic Algorithms, Simulated Annealing, Particle Swarm

MLE, Monte Carlo, Bootstrapping

Bayesian Experimental Design

Advanced Technical Skills

Probabilistic Graph Models, Bayesian Inference: MCMC, PyMC, MAP, Hidden Markov Models

Hyperparameter tuning, Ensemble Learning, Feature Selection with Metaheuristics

Out-of-core ML, Dynamic Bayesian Networks

WORK EXPERIENCE

CHOREOGRAPH – Director of Data Science (Contract) New York, NY USA Apr 2025 – Jun 2025

Spearheaded the delivery of an ML Federated Learning solution for collaborative learning in a short-term contract engagement

Used Bayesian MMM for channel ROI, adding credible intervals vs. single point estimates

Led discussion and design of an enterprise-level, cloud-based Federated Learning solution that supports (1) multiple ML models and (2) client data privacy requirements such as HIPAA and GDPR

This included routing and orchestrating model training and aggregation across multiple data asset silos by moving aggregated model weights rather than transferring the underlying data

Built a prototype employing all of the above and supervised the production pipeline build on Snowflake

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, XGBoost, boto3
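
As an illustration of the weight-aggregation idea described in this role, the following is a minimal sketch of sample-size-weighted averaging in the style of FedAvg. It assumes FedAvg-style aggregation, which the resume does not name; the function `federated_average`, the silo dictionaries, and the sample counts are all hypothetical and not taken from the actual project.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average per-client model weights, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Only weights leave each silo; the raw training data never moves.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical weights trained locally in two data silos.
silo_a = {"coef": [1.0, 2.0], "bias": [0.0]}
silo_b = {"coef": [3.0, 4.0], "bias": [1.0]}
global_weights = federated_average([silo_a, silo_b], [100, 300])
```

Here the larger silo contributes three times the weight of the smaller one, so the aggregated model leans toward the silo with more samples.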

VERIZON – Senior Machine Learning Engineer New York, NY, USA Apr 2023 – Aug 2024 Concurrent role with Guardian Life Insurance (Contract)

Engineered and deployed Large Language Models (LLMs) and specialized NLP solutions across cybersecurity operations, enhancing threat detection and response

From late 2023, used DeepEval to evaluate candidate LLMs for relevance, factual accuracy, and robustness, and integrated the evaluations into CI/CD workflows

Designed and implemented pre-emptive threat analysis systems using graph databases and Retrieval-Augmented Generation (RAG) for comprehensive, real-time threat intelligence

Fine-tuned models with RLHF and LoRA for a full performance comparison

Leveraged embeddings and vector databases for accurate retrieval of semantically similar data

Developed and integrated AI Agents to automate the generation of actionable threat intelligence

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Flask RESTful API, Docker, Swarm, PostgreSQL, boto3, Neo4j, LangChain, LangGraph, OpenAI, Transformers, PyTorch, nltk, AWS Lambda, SageMaker, API Gateway, EC2, Bedrock, Elasticsearch

LLM Models: GPT-4o, GPT-4, XLM-RoBERTa, Llama 3, Anthropic Claude 3

Techniques: Prompt Engineering/Templating, Optimized Cypher Queries, RAG, AI Agents

Data Integration: Aggregated public and proprietary client data sources, including NVD, MITRE, OSINT, and IOC, into a unified graph database
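
The embedding-based retrieval mentioned in this role can be sketched as a cosine-similarity ranking over stored vectors. This is a toy sketch only: the three-dimensional vectors, the document ids, and the helper names `cosine_similarity` and `top_k` are hypothetical, and a production system would use a real embedding model and a vector database rather than an in-memory dictionary.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k document ids whose embeddings are most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical embeddings standing in for stored threat-intelligence documents.
index = {
    "advisory-1": [1.0, 0.0, 0.0],
    "advisory-2": [0.9, 0.1, 0.0],
    "advisory-3": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

In a RAG setup, the retrieved documents would then be passed to the LLM as context for the answer.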

GUARDIAN LIFE INSURANCE – Senior Data Scientist (Contract) New York, NY, USA May 2023 – Aug 2023 Concurrent role with Verizon

Short-term contract engagement to deliver GenAI ChatBot-driven customer journey with documentation

Created innovative health industry claims POCs using Generative AI and NLP models.

Planned and implemented Large Language Model (LLM) solutions across the organization

Used prompt engineering and Retrieval Augmented Generation (RAG) for semantic search

Collaborated with Automation team on Data Science implementation in existing product pipeline

Cloud: AWS - Lambda, API Gateway, Bedrock, DynamoDB, Azure - OpenAI, AI Foundry

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, boto3, Azure, OpenAI, Transformers, PyTorch, nltk

DIGITALWARE – Director of Data Science New York, NY USA Jan 2019 – Jun 2022

Led strategy and cross-functional teams (30 percent) and developed AI/ML solutions (70 percent)

Developed patented Reinforcement Learning technology to predict cybersecurity attacks

Leveraged advanced Neo4j Cypher queries and visualizations to uncover underlying insights

Developed NLP, GenAI, and ML models to classify cybersecurity vulnerability data

Deployed production models to AWS, on-prem, and Intel Core and Neural Stick architectures

Developed using the Intel OpenVINO deep learning toolset for edge and IoT deployments

Built computer vision analytics for slip and fall, weapons detection, and real-time Ad delivery

Built facial recognition platform for customer management mitigating $100M annual fraud losses

Gained experience in MLOps development, such as building CI/CD pipelines for model deployment

Programmatically leveraged GenAI GPT-3 via OpenAI API and HuggingFace RoBERTa (2022)

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Flask, Docker, PostgreSQL, boto3, Neo4j, OpenAI Gym, Transformers, PyTorch, TensorFlow

FRAUD.NET – Lead Data Scientist (Contract) New York, NY USA Jul 2018 – Oct 2018

Short-term contract engagement to provide client facing Fraud Detection solution

Developed Deep learning fraud detection models

Developed Cost sensitive fraud detection models that leverage transaction level data

Worked on feature learning methodologies

Introduced and developed high cardinality data representation, using NN embeddings

Built end-to-end Data Science platform supporting rapid development of predictive fraud models

Deployed these models to AWS SageMaker, Lambda, and Kubernetes via Docker

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, boto3, TensorFlow

J P MORGAN CHASE – Machine Learning Engineer (Contract) Jersey City, NJ USA Jan 2018 – Jun 2018

Short-term contract engagement to investigate & implement ML to streamline Retail Banking

Worked with a team of Data Scientists and Data Engineers toward creating next-level solutions

Evaluated ML and Infrastructure solutions for a portfolio of business problems in banking space

Designed and developed scalable machine learning solutions across different LOBs

Built personalized models using micro-targeting and model stacking of classical ML algorithms

Built and deployed NLP solutions using Word2vec + LSTM, BERT, RoBERTa via TensorFlow

Led detailed product assessment for AWS SageMaker, while working with AWS SMEs

Engineered Machine Learning pipelines to solve business problems using AWS SageMaker

Led Data Protection, ML modelling, and Public Cloud implementation best practices

Worked on Hadoop query optimization moving from Hive SQL to Spark SQL

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, PostgreSQL, MongoDB, boto3, nltk, TensorFlow

ARGO GROUP – Senior Data Scientist (Contract) New York, NY USA May 2017 – Nov 2017

Developed Natural Language Understanding Algorithms for Insurtech applications

Used Statistical Modeling, NLP, and Machine Learning algorithms in end-to-end applications

Developed AWS Big Data Pipeline for ETL and NLP algorithms using Serverless design

Leveraged AWS: Lambda, EC2, ECS, Docker Containers, Elastic Beanstalk, Elastic Load balancer, RDS, Elastic Search, SNS, SQS, CloudWatch, Spark via EMR

Built near real-time pipeline using Lambda and Step Functions for processing financial documents for NLP and NLU: sentiment, classification, topic extraction

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker, PostgreSQL, boto3, nltk, TensorFlow.

SCIENAPTIC SYSTEMS – Lead Data Scientist (Contract) New York, NY USA Mar 2017 – Apr 2017

Short-term contract engagement to deliver ML-based Fraud Detection

Developed compliance-ready credit card kiting predictive model using consumer credit data

Worked on ETL and pipelining into pre-processing, and analytics using Apache Spark

Used ensemble methods for feature selection on large feature set

Used appropriate sampling methods, based on domain knowledge

Built and validated predictive models (Linear Regression, Random Forest, GBM, Deep Learning)

Achieved low capture-rate via post-model optimization: FPR 0.18 percent, recall (RCL) 94 percent, accuracy (ACC) 99.8 percent

Contributed to large scale anomaly detection for banking client Barclays

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker
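
For readers unfamiliar with the metrics quoted in this role, the sketch below shows how false positive rate, recall, and accuracy follow from confusion-matrix counts. The counts are hypothetical, chosen only to land near the reported percentages; they are not the project's data, and `classification_rates` is an illustrative helper name.

```python
from typing import Dict


def classification_rates(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute FPR, recall, and accuracy from confusion-matrix counts."""
    negatives = fp + tn
    positives = tp + fn
    total = positives + negatives
    return {
        "fpr": fp / negatives if negatives else 0.0,      # share of legitimate cases flagged
        "recall": tp / positives if positives else 0.0,   # share of fraud cases caught
        "accuracy": (tp + tn) / total if total else 0.0,  # share of all cases classified correctly
    }


# Hypothetical counts: 100 fraud cases and 10,000 legitimate cases.
metrics = classification_rates(tp=94, fp=18, tn=9982, fn=6)
```

With heavily imbalanced classes like these, accuracy stays high almost regardless of the model, which is why FPR and recall are reported alongside it.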

MESSINA QUANTITATIVE RESEARCH – Data Scientist New York, NY USA Nov 2014 – Jun 2016

Developed statistical and ML political forecasting models that outperformed traditional polls

Predictive models provided actionable insights into market patterns and consumer preferences

Advocated microservices architecture and standardized DevOps practices

Built a scalable IPython Cluster for CPU-intensive tasks on heterogeneous VMs

Built infrastructure for automation of routing, preprocessing, model execution, and DB

Built MLOps-ready ETL/ELT and CI/CD flows for fast model training and deployment cycles

Contributed to survey design and integrated sampling in a decision-theoretic framework

Leveraged R, Python, and C++ to implement statistical tools, ML models, and scalable systems

Communicated technical concepts to non-technical teams, bridging technical and business goals

Tech Stack: Python 3, Scikit-learn, NumPy, Pandas, Docker

PROFESSIONAL DEVELOPMENT & TRAINING

AI RESEARCH & DEVELOPMENT – Generative AI Self-Directed Study New York, NY USA Sep 2024 – Mar 2025

Dedicated sabbatical to mastering GenAI, LLM, and RAG solutions, with a focus on applied research and end-to-end prototype development

Built production-grade applications leveraging GPT-4o, Claude, and Llama for NLP and AI agent use cases

Designed and implemented RAG pipelines using LangChain, vector embeddings, and advanced prompt engineering techniques

Integrated AWS Bedrock, SageMaker, and Lambda to deploy scalable, cloud-based AI solutions across various domains

AI & DATA SCIENCE PROFESSIONAL DEVELOPMENT – Self-directed Study New York, NY Jul 2022 – Mar 2023

Completed intensive upskilling in Generative AI and LLM fine-tuning

Enhanced skills in AWS SageMaker and cloud-based deployment strategies

Developed prototype NLP applications using GPT-3, Claude, and Llama for semantic search and summarization

Introduced to Computer Vision Diffusion models

PUBLICATIONS

Y. Liu, D. Irving, W. Qiao, D. Ge, R. Levicky, Kinetic Mechanisms in Morpholino-DNA Surface Hybridization, Journal of the American Chemical Society 2011, 133, 115**-*****

D. Irving, P. Gong, R. Levicky, DNA Surface Hybridization: Comparison of Theory and Experiment, Journal of Physical Chemistry B 2010, 114, 7631-7640

PATENTS

Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Graphics processing unit optimization. U.S. Patent No. US20230032249A1

Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Systems and methods for providing data privacy using federated learning. U.S. Patent No. US20230056706A1

Damion Irving, James Korge, Jeffrey L. Thomas, and Robert Bathurst. Systems and methods for applying reinforcement learning to cybersecurity graphs. U.S. Patent No. US20230034303A1

PRESENTATIONS

D. Irving, R. Levicky, DNA surface hybridization: An overview of theory, experiment, and modeling, Clorox Research and Development Center, Pleasanton, CA, October 5 2012

EDUCATION

Ph.D. Chemical Engineering, NYU Tandon School of Engineering, Brooklyn, NY (Jun 2012) All But Dissertation

Dissertation Title: Surface-Based DNA Hybridization Kinetics: Experiment and Modeling

Developed and co-authored biophysical electrostatic equilibrium model for surface hybridization

Developed kinetic model for surface hybridization incorporating electrostatics and probe interactions

Estimated kinetic parameters through non-linear regression of ODE and DAE using C++ and MATLAB

B. S. Chemical Engineering, Polytechnic University, Brooklyn, NY (Jan 2006)

LANGUAGE

English — Native / Fluent (Spoken & Written)