
Senior Python Backend Engineer (LLM & MLOps)

Location:
San Francisco, CA
Posted:
February 26, 2026

Resume:

ABRAR SHAIK

Python Developer

San Francisco, CA (open for relocation to Seattle, WA; Plano, TX; Chicago, IL; New York, NY)

630-***-**** *****.********@*****.*** https://www.linkedin.com/in/abrar-s-354b2836a/

SUMMARY

Python Developer with 5+ years of experience delivering scalable APIs, LLM inference systems, and cloud-native architectures across Perplexity AI, Meta, and Accenture. Proven expertise in Python, FastAPI, Flask, gRPC, MLOps, and multi-cloud (AWS/GCP/Azure), with hands-on work in LLM integration, observability, and automation. Adept at cross-functional collaboration, mentoring, and driving high-performance, fault-tolerant systems that improve user engagement, reliability, and business impact.

PROFESSIONAL EXPERIENCE

Perplexity AI – pplx-API & Comet / Model Integration Project January 2024 – Present

Backend Software Engineer (Python) San Francisco, CA

Developed pplx-API backend powering low-latency LLM inference, reducing response times 70% while seamlessly scaling to millions of Pro/Max production queries.

Integrated LLMs including GPT-4.1, Claude, LLaMA, and Mistral supporting JSON/Regex structured outputs, doubling developer adoption and delivering enterprise-grade reliability across diverse environments.

Contributed to launch of Comet AI browser functionalities and Perplexity Max platform features, driving improved engagement, stronger user retention, and measurable customer growth.

Built scalable inference-ready Python APIs leveraging FastAPI, Flask, and gRPC while optimizing model performance using TensorRT-LLM across AWS GPU-based clusters.

Designed, containerized, and deployed highly-available microservice ecosystems using Docker, Kubernetes, and auto-scaling strategies ensuring reliable fault-tolerant cloud-native backend infrastructures.

Implemented automated MLOps pipelines using MLflow, Weights & Biases, and proprietary model stores, streamlining rollout/rollback processes and monitoring critical LLM model deployments.

Developed structured response generation features powering RAG, summarization, and automation workflows, enabling advanced client-side application integrations at enterprise production scale.

Enhanced observability stack by integrating Prometheus, Grafana, and ELK/Loki, delivering advanced diagnostics, system monitoring, and SLA-compliant real-time metrics visibility.

Automated deployment lifecycles with CI/CD pipelines (GitHub Actions and Jenkins), reinforcing quality with PyTest-driven tests and optimizing throughput via caching and dynamic batching.

Collaborated within Agile/Scrum cross-functional teams, mentoring junior developers, ensuring continuous code quality improvements, and demonstrating strong technical communication and problem-solving capabilities.

Meta February 2023 – December 2023

Software Engineer San Francisco, CA

Developed and fine-tuned LLaMA 2/3 large language models using PyTorch for generative ad copy, improving Meta Ads click-through rates by 11% globally.

Built and deployed vision diffusion models including Emu for image background generation and video animation, increasing ad engagement 13% while accelerating creative delivery processes.

Implemented automated model retraining workflows seamlessly integrated into CI/CD pipelines via GitHub Actions and Jenkins, ensuring rapid iteration, reliability, and reproducibility in deployment cycles.

Engineered real-time ad recommendation models (GEM) leveraging deep neural networks and reinforcement learning, generating a measurable 7.6% lift in conversions across a 4M+ advertiser base.

Designed scalable inference services with Docker, Kubernetes, and Meta MTIA accelerators, reducing latency by 40% while sustaining 10M+ daily requests.

Optimized multi-modal AI pipelines for Reels and Feed personalization using transformers across text-image models, boosting user engagement and time spent by up to 8%.

Deployed optimized generative models with ONNX, TorchScript, and quantization, maintaining sub-100ms inference latency consistently across globally distributed Meta data centers.

Built scalable database integration pipelines with Spark SQL and Presto, enabling fast, distributed ad platform data retrieval for ML-powered personalization services.

Orchestrated A/B testing and canary rollout of generative models via FBLearner Flow ML Ops, verifying over 5% incremental production ad performance lift.

Partnered with Product, Data Science, and Responsible AI teams to embed invisible watermarks, safety filters, and brand compliance across generative AI-driven creatives.

Designed Spark, Presto, and Scuba pipelines processing petabyte-scale ad interaction logs, enabling automated retraining cycles and actionable business intelligence for marketing optimization.

Integrated text-to-image and text-to-video generation into Ads Manager platform, accelerating creative workflows by 60% and scaling AI augmentation access to 1M+ advertisers.

Accenture May 2019 – June 2022

Python Application Developer India

Developed and deployed Python-based automation workflows that streamlined enterprise reporting and analytics, reducing manual processing time by 50% and improving client decision-making speed.

Built scalable ETL pipelines to integrate multi-source business data (ERP, CRM, finance systems), enhancing data accuracy and supporting real-time dashboards for global clients.

Designed and maintained backend services in Python (Flask, FastAPI) with strong integration to SQL/NoSQL databases, implementing data validation, monitoring, and fault tolerance.

Leveraged AWS (EC2, S3, Lambda, RDS) and Azure services to deliver cloud-native solutions, while implementing CI/CD (Jenkins, GitHub Actions) and test automation (PyTest) for reliability.

Collaborated in Agile/Scrum teams, applying problem-solving, communication, and stakeholder management skills, while mentoring junior developers and adopting best practices in MLOps, observability (Prometheus, Grafana), and containerization (Docker, Kubernetes).

TECHNICAL SKILLS

Programming & Scripting: Python (FastAPI, Flask, Django, gRPC), SQL, Bash, JavaScript/TypeScript (basic integration)

Backend & Distributed Systems: RESTful APIs, Microservices Architecture, Distributed Systems, API Design, Async Processing, Cloud-Native Backend Development

AI/ML & Data Engineering: Large Language Models (GPT-4.1, Claude, LLaMA, Mistral), PyTorch, Hugging Face Transformers, TensorRT-LLM, Retrieval-Augmented Generation (RAG), Semantic Search, Recommendation Systems, NLP (Text Classification, Sentiment Analysis), ETL & Data Ingestion Pipelines, Real-Time Data Processing, Vector Databases (FAISS, Pinecone, Weaviate, Milvus)

MLOps & Model Lifecycle: MLflow, Weights & Biases (W&B), Airflow, Model Deployment & Monitoring, Feature Stores, Amazon SageMaker, Kubeflow, Automated Rollout & Rollback

Cloud Platforms & Infrastructure: AWS (EC2, S3, Lambda, RDS, DynamoDB, SageMaker, GPU Clusters), Google Cloud Platform (BigQuery, Pub/Sub), Microsoft Azure (App Services, Functions, Storage), Multi-Cloud Architecture, Auto-Scaling & High Availability

Event Streaming & Messaging: Apache Kafka, Google Pub/Sub, AWS Kinesis, Event-Driven Architecture

DevOps & CI/CD: Docker, Kubernetes (EKS, GKE, AKS), Terraform (Infrastructure as Code), Helm, GitHub Actions, Jenkins

Observability & Performance Engineering: Prometheus, Grafana, ELK Stack, Loki, OpenTelemetry, System Monitoring, Performance Optimization (Caching, Batching, Latency Reduction), Load & Stress Testing

Databases & Storage: PostgreSQL, MySQL, DynamoDB, Redis, Elasticsearch, Time-Series Databases (InfluxDB, Prometheus TSDB)

Data & Distributed Processing: Apache Spark, Spark SQL, Presto, Batch & Stream Processing

Testing & Quality Engineering: PyTest, Unit & Integration Testing, Automated Testing, A/B Testing

Version Control & Collaboration: Git, GitHub, Git-Based Workflows (Branching, Pull Requests, Code Reviews)

Professional Skills: Agile/Scrum, Cross-functional Collaboration, Mentoring, Technical Communication, Stakeholder Management, Problem Solving, Ownership & Delivery

EDUCATION

Master of Science in Information System

Trine University, USA


