
AI/ML Engineer with Cloud & NLP Expertise

Location:
United States
Salary:
80000
Posted:
February 12, 2026


Resume:

Sai Srikar Dabbukottu

*********.****@*****.*** 602-***-**** USA LinkedIn

Summary

AI/ML Engineer with 4 years of experience designing, developing, and deploying machine learning and NLP solutions. Skilled in building scalable ETL pipelines, fine-tuning deep learning and transformer models, and implementing cloud-based, production-ready systems using AWS, Azure, and containerized workflows. Adept at optimizing models, ensuring compliance, and delivering reliable, high-impact AI and data-driven solutions.

Technical Skills

• Programming & Scripting: Python (pandas, BeautifulSoup, spaCy, scikit-learn), SQL, PySpark, R, Julia, Bash, Shell scripting, C++

• Machine Learning & Deep Learning: TensorFlow, PyTorch, Keras, XGBoost, LightGBM, CatBoost, Autoencoders, Logistic Regression, CNN, RNN, LSTM, GANs, GNNs, Transformers, Transfer Learning, Bayesian Optimization, k-Fold Cross-Validation, Hyperopt

• NLP & LLM Technologies: BERT, DeBERTa, finance-specific LLMs, Hugging Face (Transformers, Accelerate), PEFT (LoRA), DeepSpeed, NLTK, spaCy, TF-IDF, NER, Sentiment Analysis, Text Classification, Keyphrase Extraction, Clinical & Financial NLP

• Data Engineering & Big Data: Apache Airflow, Azure Data Factory, AWS Glue, PySpark, Hadoop, Dask, Kafka, Real-time ETL, SLA-driven pipelines

• Cloud Platforms & DevOps: AWS (SageMaker, Lambda, EC2, API Gateway, EKS), Azure (ML Studio, AKS), Docker, Kubernetes, Terraform, gRPC, Containerization, Azure Databricks, AWS Elastic Beanstalk, CI/CD Pipelines, Service Mesh, API Orchestration

• MLOps, Monitoring & Model Management: MLflow, CI/CD (GitLab, GitHub), Jenkins, Prometheus, Grafana, Drift Detection, Model Registry, Automated Retraining Pipelines

• Databases, Storage & APIs: PostgreSQL, MongoDB, SQL/NoSQL, REST APIs, Microservices Architecture, Redis

• Security, Compliance & Explainability: SOC 2, PCI DSS, Data Privacy Standards, SHAP, LIME, Model Interpretability, Regulatory Audit Support

Professional Experience

AI/ML Engineer, Plaid Inc. 07/2024 – Present Remote, USA

• Developed an AI-powered transaction analysis and financial insights platform, improving API-driven account aggregation, predictive spending alerts, and fraud detection precision by 18%, collaborating with engineering, product, and compliance teams to optimize fintech-specific user experience.

• Built scalable ETL pipelines using Airflow, Python, and AWS Glue to process high-volume banking, card, and transaction data, ensuring data integrity, freshness, and regulatory compliance, reducing ETL processing errors by 25% across Plaid’s services.

• Engineered user and transaction embeddings with pandas, scikit-learn, and spaCy to model spending behavior, merchant risk, and cross-account transaction patterns, reducing anomaly detection false positives by 32% and enhancing personalized recommendations by 22%.

• Fine-tuned transformer-based models (BERT, DeBERTa, finance-specific LLMs) on SageMaker using enriched transaction metadata, API logs, and customer data, boosting predictive insight relevance (NDCG@10 improved from 0.54 to 0.80) and recommendation accuracy by 20%.

• Applied PEFT methods like LoRA via Hugging Face Accelerate and DeepSpeed to adapt LLMs for Plaid use cases, including automated transaction categorization and dynamic expense summarization, reducing manual review time by 35% while maintaining SOC 2 and PCI DSS compliance.

• Partnered with DevOps to containerize AI services using Docker, deploying on AWS EKS with Plaid’s API gateway, creating SLA dashboards and audit documentation, improving system uptime by 15% and deployment efficiency by 28%.

AI/ML Engineer, UnitedHealth Group 01/2021 – 07/2023 Bengaluru, India

• Developed an Intelligent Healthcare Risk Prediction Platform leveraging machine learning to detect high-risk patients and claims anomalies, reducing false alerts by 20% and enabling early interventions, while ensuring compliance with clinical and regulatory guidelines.

• Designed and implemented robust ETL pipelines using Azure Data Factory, PySpark, and SQL to process large-scale patient records, claims data, and interaction logs, enabling automated workflows and supporting predictive analytics for population health and care management.

• Built and optimized ML models, including XGBoost, autoencoders, and scikit-learn models, for patient readmission prediction, claim fraud detection, and anomaly identification, achieving 90% F1-score and 94% recall for high-risk healthcare events.

• Fine-tuned XGBoost and BERT models using Grid Search on Azure ML for clinical risk scoring and patient sentiment analysis, incorporating SHAP explainability to provide transparent, interpretable insights for care managers, auditors, and regulatory stakeholders.

• Applied NLP techniques, including BERT and TF-IDF with logistic regression, to patient surveys, call center transcripts, and support tickets, automatically identifying critical feedback and improving case resolution prioritization by 20%, reducing manual triage workload by 23%.

• Containerized models with Docker and deployed via Azure Kubernetes Service (AKS), integrating with care management systems for scalable real-time predictions, automating retraining pipelines with Airflow, and monitoring model performance using Prometheus and Grafana dashboards.

Education

Arizona State University, Tempe, AZ, USA

Master of Science in Data Science and Analytics 08/2023 – 05/2025

Dayananda Sagar College of Engineering, Bengaluru, KA, India

Bachelor of Engineering in Electronics and Communications 08/2019 – 05/2023

Projects

Portfolio Predictor LLM with MPT and Sentiment Analysis

• Developed an ML-driven portfolio assistant recommending investments based on user risk tolerance. Converted financial goals into a 0–1 risk score, allocated assets via Modern Portfolio Theory, and integrated news sentiment analysis with a RAG-based chatbot.

Road Obstacle Identification and Tracking with Computer Vision

• Built a real-time ML model detecting up to 80 traffic objects using OpenCV and TensorFlow. Created a tracking algorithm assigning unique IDs to objects in CSV files, enabling classification. Awarded first place by Elektrobit’s expert panel.


