Senior AI/ML Engineer, Sequence Modeling & MLOps

Location:

India

Salary:

85000

Posted:

February 10, 2026


Resume:

Suhuruth Veeramalla

AI/ML Developer

San Francisco, CA | Open to relocation: Seattle, WA; New York, NY; Chicago, IL; Dallas, TX | *********@*****.*** | +1-626-***-**** | LinkedIn

SUMMARY

AI/ML Engineer with 5+ years of experience designing and deploying large-scale machine learning, recommendation, and personalization systems across e-commerce, retail, and enterprise platforms. Skilled in deep learning, MLOps, and multi-cloud deployment using PyTorch, TensorFlow, AWS, Azure, and GCP, with strong expertise in sequence learning, vector search, and generative AI. Adept at building data-driven solutions that improve model performance, operational efficiency, and customer engagement through advanced analytics and automation.

PROFESSIONAL EXPERIENCE

Meta – AI/ML Developer (Sequence Learning & Andromeda) April 2024 – Present

AI/ML Engineer San Francisco, CA

Contributed to the development of sequence-learning models that replaced legacy DLRM pipelines, improving ad retrieval recall by 6% and ad quality by 8% for personalized ads in e-commerce and retail segments.

Worked on a dense-embedding retrieval system processing tens of millions of ads within 200 ms latency, supporting a 22% increase in ROAS for Advantage+ automated campaigns.

Assisted in integrating creative automation and catalog-based ad workflows with merchant and product teams, resulting in a 40% reduction in ad production time for e-commerce campaigns.

Implemented transformer-based sequence models using PyTorch 2.0 and JAX, applying attention mechanisms for large-scale user-event modeling and ranking optimization.

Developed vector retrieval modules utilizing Faiss/HNSW indexing, hierarchical candidate ranking, and GPU acceleration on NVIDIA Grace Hopper and Meta MTIA hardware.

Built real-time ML data pipelines with Apache Kafka, Spark Structured Streaming, Delta Lake, and Feast feature stores to support continuous model training and deployment.

Deployed and maintained ML models on Kubernetes (AWS EKS) with MLflow 2.0, automated CI/CD pipelines, and performance monitoring using Prometheus and Grafana.

Applied MLOps practices such as A/B testing, model-drift detection, and continuous training workflows to ensure system reliability and high model availability (99.9% uptime).

Collaborated with product, infrastructure, and data-science teams to communicate technical results, align on KPIs, and address performance and scalability challenges.

Explored generative AI, multimodal embeddings, and on-device ML optimizations (quantization/INT8) to enhance future retail personalization and stay aligned with emerging AI/ML trends of 2025.

Accenture April 2020 - July 2023

AI/ML Developer India

Designed and deployed multilingual conversational AI platforms using Python, PyTorch, Riva ASR/TTS, TensorRT, and CUDA, reducing inference latency by 45% and enabling scalable deployments across cloud, edge, and on-premise environments.

Fine-tuned ASR and TTS models with supervised and sequence-to-sequence architectures on domain-specific datasets, improving recognition accuracy and contextual speech synthesis quality by 30% across enterprise use cases.

Architected NLP-driven conversational frameworks integrating embeddings, transformers, and dialogue orchestration, improving query resolution by 25% and customer satisfaction across global support channels.

Built multimodal AI pipelines with NVIDIA NeMo, Apache Spark, and SQL to process voice, text, and semantic data for highly regulated sectors including healthcare (HIPAA) and finance (PCI-DSS).

Optimized inference workloads through TensorRT, CUDA kernels, and MLflow integration, achieving a 40% latency reduction and 35% improvement in GPU throughput across large-scale deployments.

Streamlined deployment and lifecycle management via Docker, Kubernetes, and AWS SageMaker, establishing a unified MLOps matrix for CI/CD and cross-environment orchestration.

Developed automated monitoring pipelines for model drift, data quality, and inference health, reducing drift-related incidents by 25% and ensuring robust governance across production models.

Led benchmarking, load testing, and GPU utilization analysis to achieve 60% higher compute efficiency while maintaining real-time SLAs for conversational and multimodal workloads.

Collaborated with enterprise architects and AI engineers to standardize domain-specific ML workflows, embedding safety, compliance, and audit-ready documentation into every release.

Partnered with cross-functional delivery teams to scale AI solutions globally, accelerating release cycles by 50% and maintaining 99.9% system uptime across multi-cloud environments.

TECHNICAL SKILLS

Programming & Frameworks: Python (advanced), PyTorch 2.0, TensorFlow, JAX, Scikit-learn, NumPy, Pandas, FastAPI, Flask, REST APIs, SQL, NoSQL, GraphQL, Shell Scripting, Git, GitHub, Docker

Machine Learning & Deep Learning: Supervised / Unsupervised Learning, Transformer Models, Sequence Learning, Recommendation Systems, Ranking Algorithms, Embedding Models, Attention Mechanisms, Feature Engineering, Hyperparameter Optimization, Transfer Learning, Time-Series Forecasting, Computer Vision (CV), NLP (LLMs, BERT, GPT), Generative AI, Multimodal Learning, On-device ML (Quantization, INT8), Responsible AI (Explainability, Fairness, Bias Mitigation)

MLOps & Model Deployment: MLflow 2.0, Kubeflow, Airflow, CI/CD Pipelines, Model Versioning, Model Drift Detection, A/B Testing, Experiment Tracking, Continuous Training, Data Validation, Monitoring (Prometheus, Grafana), Model Serving, API Deployment, Containerization (Docker, Kubernetes, AWS EKS)

Big Data & Data Engineering: Apache Kafka, Spark Structured Streaming, Delta Lake, Hive, Hadoop, Feature Stores (Feast), Data Lakehouse Architecture, ETL Pipelines, Data Preprocessing, Real-time Stream Processing, Batch Processing, Apache Airflow, dbt (Data Build Tool)

Cloud & Infrastructure: AWS (SageMaker, EC2, S3, Lambda, EKS), Azure ML, Google Cloud Vertex AI, GCP BigQuery, Cloud Functions, Multi-Cloud Deployments, Serverless Computing, GPU/TPU Optimization (NVIDIA Grace Hopper, Meta MTIA), Terraform (IaC), Monitoring & Logging

Search & Retrieval Systems: Faiss, HNSW, Vector Databases (Pinecone, Weaviate, Qdrant), Semantic Search, Dense Retrieval, Approximate Nearest Neighbors (ANN), Embedding Indexing, Ranking & Personalization Pipelines

Software Development & Tools: Agile / Scrum, Version Control (Git, GitHub, Bitbucket), Jira, Confluence, Unit & Integration Testing (PyTest), API Documentation, Software Design Patterns, Microservices, Performance Optimization

Soft Skills: Cross-functional Collaboration, Analytical Thinking, Problem-solving, Communication & Presentation Skills, Stakeholder Management, Adaptability, Continuous Learning, Attention to Detail, Teamwork

EDUCATION

Concordia University, USA

Master’s in Computer Science
