SATHVIK REDDY MUSKU
Machine Learning Engineer
CO, USA (551) 261 - 3201 **************@*****.*** LinkedIn
SUMMARY
• Machine Learning Engineer with 3+ years of experience in designing, developing, and deploying ML models and AI-driven solutions across domains such as finance, healthcare, and e-commerce.
• Proficient in Python, R, and C++, with hands-on experience using ML frameworks like TensorFlow, PyTorch, Keras, and Scikit-learn for model development and performance optimization.
• Skilled in supervised and unsupervised learning, deep learning, NLP, time series forecasting, reinforcement learning, and generative models such as GANs.
• Experienced in building end-to-end ML pipelines, including data preprocessing, model training, hyperparameter tuning, deployment using Docker/Flask, and cloud integration (AWS/GCP).
• Hands-on experience with AI Agents and Generative AI, including Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), LangChain, vector databases (Pinecone, FAISS), AI-powered clinical assistant agents, and prompt engineering for domain-specific applications.
EXPERIENCE
ML Engineer Accenture, CO May 2024 – Current
• Engineered a robust ML pipeline using Python, Pandas, and NumPy to preprocess and transform 500K+ patient EHR records, incorporating structured data (vitals, labs, diagnosis codes) and unstructured notes via TF-IDF and NLTK-based NLP for better feature representation.
• Developed, trained, and compared multiple supervised learning models including Random Forest and Logistic Regression using Scikit-learn, optimizing performance through Grid Search, Bayesian Optimization, and Cross-Validation, achieving 87% ROC-AUC and 20% F1 improvement.
• Addressed class imbalance with SMOTE and monitored model drift using Evidently AI, while integrating Bayesian Hyperparameter Tuning to further reduce overfitting in deployment environments.
• Deployed the production-ready model using Flask APIs, containerized via Docker, orchestrated through Kubernetes, and hosted on AWS Lambda, ensuring scalability and fault tolerance.
• Integrated insights with Tableau to provide dynamic dashboards for clinicians, displaying risk scores, predictive factors, and discharge planning triggers.
• Integrated clinical and patient data pipelines using MongoDB to manage semi-structured records (e.g., discharge notes, prescriptions), supporting dynamic model inputs and flexible schema management.
• Designed and implemented an AI-powered Clinical Assistant Agent leveraging RAG (Retrieval Augmented Generation) with LLMs, enabling clinicians to query EHR data and evidence-based medical guidelines in real time, improving clinical decision support and reducing manual chart review by 30%.
• Built custom RAG pipelines with LangChain & vector databases (Pinecone, FAISS) to retrieve patient history & medical literature, enhancing prediction explainability & ensuring clinician trust in AI-driven recommendations.
• Applied prompt engineering & LLM fine-tuning on de-identified clinical text under HIPAA, generating context-aware summaries & discharge instructions, reducing clinician documentation time.
• Delivered business impact: 20% reduction in 30-day readmissions, improved CMS compliance scores, and $2.5M in annual savings, aligned with the client’s digital health transformation goals under Agile Scrum delivery.
ML Engineer HCL Tech, India Jan 2021 – May 2023
Project 1
• Ingested and preprocessed multi-year historical data from NSE/BSE, RBI, mutual fund NAVs, and macroeconomic indicators using Python, SQL, and Pandas, handling missing timestamps, non-trading days, and outliers with custom imputation rules. Stored and managed time-indexed datasets in PostgreSQL for structured and semi-structured market feeds.
• Engineered time series features including moving averages, momentum indicators (MACD, RSI), rolling beta, and volatility bands using NumPy, SciPy, and domain-specific financial libraries, enabling deep signal extraction for daily and weekly trends.
• Developed and compared multiple forecasting models: LSTM (RNN) in Keras and TensorFlow to model sequential dependencies and nonlinear price trends, XGBoost for short-horizon return regression, and benchmark models (ARIMA, Exponential Smoothing) using Scikit-learn and statsmodels for performance comparison.
• Performed Bayesian Optimization and Randomized Search CV for hyperparameter tuning of LSTM layers (neurons, dropout, learning rate), improving forecast accuracy (MAPE) by 25% over statistical baselines.
• Deployed trained models as containerized microservices using Flask and TensorFlow Serving, integrated with the client’s in-house investment dashboard through secure REST APIs on AWS EC2, ensuring real-time prediction access.
• Set up CI/CD pipelines using MLflow for experiment tracking, automated retraining workflows, and Git for version control. Incorporated model monitoring with drift detection based on changes in asset return distributions.
• Designed dynamic visualization dashboards in Power BI and Advanced Excel for portfolio managers to track forecasted vs actual returns, model confidence intervals, and risk-adjusted performance (Sharpe ratio, Sortino ratio).
• Delivered the solution under Agile Scrum, coordinating with quant analysts, product owners, and DevOps teams over 3-month sprints. The platform led to a 40% reduction in manual model adjustment time, improved allocation precision, and enhanced responsiveness to market volatility.
• Integrated a custom LLM-based natural language layer using the OpenAI GPT API to auto-generate human-readable summaries of portfolio forecasts, risk insights, and model performance trends, improving communication with business teams and reducing manual reporting time by 60%.
Project 2
• Created a custom video transcription and summarization system using Whisper (for voice-to-text) and a Transformer-based extractive summarizer, allowing real-time generation of lecture summaries and auto-indexing of 100K+ hours of video content.
• Developed a plagiarism detection engine using Generative Adversarial Networks (GANs) to identify paraphrased re-uploads of course material; achieved a 38% improvement in recall over traditional NLP-based comparison methods.
• Built a secure Java Spring Boot backend and Node.js/Express middleware to serve APIs for transcript storage, similarity scoring, and summary retrieval, with a responsive React UI used by content reviewers and instructors.
• Stored processed transcripts, summaries, and embeddings in PostgreSQL, and visualized detection reports using ggplot2 and Advanced Excel dashboards; enabled continuous updates through Bash-based cron jobs and a lightweight C++ data parser for older video metadata ingestion.
SKILLS
Methodologies: Agile (Scrum, Kanban), Waterfall
Languages: Python, R, C++, SQL, Bash scripting
ML Frameworks: TensorFlow, Keras, PyTorch, Scikit-learn, XGBoost, NumPy, Pandas, Matplotlib, SciPy, ggplot2, Seaborn
Data Visualization Tools: Tableau, Power BI, Advanced Excel, Statistics
Functional Expertise: Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Transformers, GANs
Database: MySQL, MongoDB, PostgreSQL, SQL Server
Model Development: Supervised/Unsupervised learning, Time series forecasting, Anomaly detection, Reinforcement Learning
Cloud Platform: AWS (SageMaker, Lambda, EC2), GCP, Microsoft Azure (ML Studio, Azure Databricks)
Optimization Techniques: Hyperparameter tuning, Grid search, Randomized search, Bayesian Optimization, Genetic Algorithms
Deployment & CI/CD: Docker, Kubernetes, MLflow, TensorFlow Serving, Flask/Django for model APIs
Big Data Tools: Apache Spark, Hadoop, Dask, Kafka, Hive
AI Agents & Generative AI: Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), LangChain, Vector Databases (Pinecone, FAISS), AI-powered Clinical Assistant Agents, Prompt Engineering
EDUCATION
Master of Science in Computer Science and Engineering, University of Maryland, Baltimore County, USA
Bachelor of Technology in Information Technology (IT), Vasavi College of Engineering, Hyderabad, Telangana, India