Dhruval Patel
Delivered production-grade Machine Learning, Data Design Pipelines and Research
Xyton Semiconductors: Saved $15K by designing and delivering multi-stage Machine Learning pipelines at scale on AWS (Model Ownership)
Indian Space Research Organisation: Researched and delivered ML models that replaced slow traditional mathematical models for geospatial data synthesis
Richardson, TX 75080 +1-469******* *******.*****.**@*****.*** LinkedIn Profile GitHub Profile
PUBLICATIONS
“Prediction of Heart Failure using Machine Learning Algorithms”, Indian Journal of Natural Sciences, Tamil Nadu Scientific Research Organisation, Web of Science Scopus Journal (WoS), vol. 14, issue 82 (Feb 2024).
SKILLS
• ML/AI: TensorFlow, PyTorch, timm, Spark ML, scikit-learn, XGBoost, LightGBM, Gen AI, Ollama, LangChain, Fine-tuning, RLHF, LLM, RAG
• Data Engineering: Git, Python, SQL, Databricks, Apache Spark, Kafka, AWS (Glue, Redshift, Kinesis, DynamoDB, Events, S3), Airflow, Hadoop
• Analysis and Statistics: Matplotlib, seaborn, AWS QuickSight, Power BI, Tableau, R, NumPy, pandas
• Scripting: Bash, Scala, boto3, Selenium, crontab
PROFESSIONAL EXPERIENCE
Machine Learning Engineer Intern
Xyton Semiconductors, Dallas, TX (Jan – May 2026)
• Built a multi-stage pipeline using YOLO (object detection) and Vision Transformers (ViT), achieving ~90% detection accuracy on a real-world technical dataset. Improved overall system recall to ~99% by designing a second-stage ViT reclassification model that corrects YOLO misclassifications, trained on an inference-generated dataset.
• Engineered data-centric improvements by identifying label noise, class imbalance, and unseen data distributions, and augmenting datasets to enhance model robustness.
• Architected AWS-based large-scale data processing pipelines, enabling efficient LLM pre-training and fine-tuning across multimillion-record datasets.
Artificial Intelligence Engineer Intern
IQuadra LLC, Marietta, GA (Jun - Aug 2025)
• Engineered a containerized mortgage loan processing MVP using React, Tailwind, Python, LangChain, Ollama, ChromaDB.
• Orchestrated document extraction with OCR (Tesseract) and in-house AI-based logic, reducing underwriters’ manual effort by 80%.
• Deployed a scalable application on AWS (EC2, S3, Aurora DB) using Docker to manage user data and generate structured financial summaries.
Machine Learning Research Intern
Indian Space Research Organisation, Space Applications Centre, IN (Jan 2023 - Jun 2024)
• Built an on-premise data lakehouse pipeline to transfer and preprocess large-scale geospatial data from HPC servers to a local Linux workstation.
• Processed and transformed raw NetCDF data into clean, train-ready CSV datasets, preserving seasonal and spatial variability for model training.
• Researched and evaluated ML models on geospatial data with cross-validation, correlation checks, and visualization workflows, reducing synthesis time by 75%; published the findings in a peer-reviewed research paper with scientists at ISRO.
ACADEMIC PROJECTS
Cardiovascular Disease Prediction System
• Led development of ML models for heart failure prediction, achieving 92% accuracy using ensemble methods such as boosting regression.
• Published research findings in a Web of Science-indexed journal and presented at the BVM university conference, where the project was ranked among the top 5.
• Developed an end-to-end Machine Learning pipeline to process over 100,000 medical records, integrating Spark for parallel processing and reducing overall processing time by 20%.
HiPriority: Intelligent Email Prioritization
• Engineered a Spark Streaming pipeline with the Google Email API to process 1,000+ emails.
• Designed an AWS data lake using Glue & S3 for ELT and structured JSON storage (sender, recipient, timestamps).
• Deployed the solution on AWS and applied the Llama 3.2 model for parsing, summarization, and classification, with AI-driven prioritization to notify users only of relevant emails.
Real-Time Financial Data Engineering Pipeline
• Configured and implemented an automated ETL pipeline for real-time financial stock data using Kafka, Spark Streaming, and Logstash.
• Orchestrated a scalable and distributed architecture that transforms and enriches live data for visualization with Elasticsearch and Kibana.
• Delivered a robust, production-ready pipeline for actionable insights and data-driven decisions.
EDUCATION
Master’s in Computer Science (Aug 2024 - May 2026)
The University of Texas at Dallas, Richardson, TX (GPA: 3.76/4)
Bachelor of Technology in Computer Science (Aug 2020 – Apr 2024)
Indus University, Ahmedabad, IN (GPA: 3.9/4)
PERSONAL ACHIEVEMENTS
• Contributed to the global Cancer Concern mass awareness campaign through donation collection for cancer/AIDS; volunteered for the Indian Student Association.
• Erik Jonsson School Ambassador
CERTIFICATIONS
• AWS Certified Cloud Practitioner: https://bit.ly/aws-certification-dhruval (Sept 2025 - Sept 2028)
• Diploma in Multi Lingual Computer Programming, Centre for Development of Advanced Computing (Oct 2020 - Oct 2021)
• IBM Data Science Certification, Databricks, Spark (Oct 2023)