Phanendhar Reddy
AI/ML Engineer
+1-732-***-**** *****************@*****.*** Harrison, NJ
SUMMARY
AI/ML Engineer with 5+ years of experience delivering scalable enterprise AI and machine learning solutions across consulting and technology environments. Designed and deployed end-to-end ML pipelines using PyTorch, Scikit-learn, and Apache Spark, improving model performance by 15–25% across fraud detection, customer churn, and demand forecasting use cases on datasets exceeding 20M records. Accelerated model deployment timelines by 50% by implementing robust MLOps frameworks with MLflow, Docker, and CI/CD pipelines, enabling real-time inference systems that process 5K+ predictions per minute. Developed streaming and batch inference architectures using Kafka and Spark Streaming to generate near-real-time insights from millions of daily events. Proven ability to translate complex business challenges into production-grade AI solutions deployed across AWS and Azure, driving measurable impact in risk analytics, customer intelligence, and enterprise forecasting.
EXPERIENCE
AI/ML Engineer, Deloitte CA (Enterprise AI Solutions for Risk, Forecasting & Customer Intelligence) 11/2024 – Present
Design and deploy machine learning pipelines in Python (PyTorch, Scikit-learn, Pandas) to support enterprise use cases including fraud detection, customer churn prediction, and demand forecasting across datasets with 20M+ records.
Build scalable data processing workflows using Apache Spark and SQL to transform and prepare 1–2 TB of financial and behavioral data, reducing data preparation time by 30% for downstream model training.
Develop and tune models including XGBoost, Random Forest, and neural networks, improving model performance metrics (AUC/accuracy) by 15–20% across classification and regression problems.
Deploy production models as REST APIs using FastAPI and Docker, supporting real-time inference services handling 5K+ prediction requests per minute across client applications.
Implement model lifecycle management using MLflow and CI/CD pipelines, reducing model deployment timelines from 10 days to under 5 days and improving experiment traceability.
Build streaming and batch inference pipelines using Kafka and Spark Streaming, processing 2–3M daily events to generate near real-time predictions for risk scoring and customer analytics.
Monitor model performance using cloud-based monitoring tools and custom dashboards, tracking latency, throughput, and prediction drift to maintain production model stability.
Collaborate with data engineers, business stakeholders, and consulting teams to translate business requirements into scalable AI solutions deployed across AWS and Azure environments.
Machine Learning Engineer, IBM India (Enterprise AI Solutions & Predictive Modeling Platform) 04/2020 – 06/2023
Designed and deployed end-to-end machine learning solutions for enterprise clients across banking and retail sectors, impacting $200M+ in business operations through predictive analytics and optimization models.
Built supervised and unsupervised models (XGBoost, Random Forest, Neural Networks) on datasets exceeding 10M records, improving prediction accuracy by 20–25% across use cases.
Developed scalable data pipelines using Python and SQL for feature engineering and preprocessing, reducing data preparation time by 35% and improving model reliability.
Implemented NLP-based solutions for document classification and sentiment analysis, increasing processing efficiency by 40% compared to manual review workflows.
Led hyperparameter tuning and cross-validation experiments, optimizing model performance metrics (AUC, F1-score) and reducing overfitting risks.
Deployed ML models into production using Docker and Kubernetes, reducing model release cycles by 30% and ensuring high-availability inference endpoints.
Built automated monitoring and drift detection systems, reducing model performance degradation incidents by 38% in live environments.
Collaborated with cross-functional stakeholders to translate business requirements into measurable ML objectives, improving solution adoption by 25%.
EDUCATION
Master's in Information System Technologies
Wilmington University May 2025
TECHNICAL SKILLS
Large Language Models & Generative AI: LLM Fine-Tuning (7B–13B Models), Transformer Architectures, Retrieval-Augmented Generation (RAG), Prompt Optimization, Human Feedback Evaluation
Distributed Training & GPU Optimization: PyTorch (DDP/FSDP), Parameter-Efficient Fine-Tuning (LoRA/PEFT), Multi-GPU Training, Mixed Precision (FP16), Training Cost Optimization
Inference Engineering & Model Serving: TorchServe, Kubernetes, Auto-Scaling, Low-Latency Inference, Batching & Quantization (INT8), High-Availability ML Services (99.9%+)
MLOps & Experimentation: MLflow, Airflow, A/B Testing, Model Versioning, Reproducible Pipelines, Continuous Model Evaluation
Data Engineering for ML: Apache Spark, SQL, Kafka, Spark Streaming, Large-Scale Data Processing (TB-scale datasets), Feature Engineering Pipelines
Model Evaluation & Monitoring: BLEU, ROUGE, Perplexity, AUC Optimization, Prediction Drift Monitoring, Feature Stability Tracking, Performance Benchmarking
Enterprise Risk & Analytics Applications: Fraud Detection, Credit Risk Modeling, Customer Churn Prediction, Predictive Analytics, Financial Data Modeling
Cloud & Secure Deployment: AWS (S3, EMR), Azure, Secure VPC Deployments, Responsible AI Governance, Data Privacy & PII Masking Controls
Scalable ML System Design: Distributed Systems Fundamentals, Horizontal Scaling, High-Throughput Architectures, Concurrency Optimization, Production-Grade API Design
Vector Search & Knowledge Systems: FAISS, Embedding Pipelines, Semantic Search, Context Retrieval Optimization, Knowledge Base Integration
Performance Profiling & Optimization: GPU Utilization Analysis, Memory Optimization, Throughput Benchmarking, Latency Reduction Strategies, Bottleneck Identification
Cross-Functional AI Delivery: Research-to-Production Collaboration, Stakeholder Communication, Model Risk Discussion, Translating Business Needs into AI Solutions