Sarthak Nikhal
*************@*****.*** linkedin.com/in/sarthaknikhal/ github.com/SarthakNikhal
Education
Clemson University August 2023 – May 2025
Master of Science in Computer Science 4.0/4.0
• Coursework: Applied Data Science, Deep Learning, Software Development Methodologies, Human Centered Computing
Pune University August 2019 – May 2023
Bachelor of Engineering in Computer Engineering 8.7/10.0
• Coursework: Big Data Analytics, Business Intelligence, Computer Networks, Software Design, Microprocessors
Experience
Machine Learning Intern August 2024 – December 2024
Symbolic Mind, LLC San Diego, CA
• Built a benchmarking, data preprocessing and analytics pipeline that made LLM testing 70-80% more efficient.
• Studied 10+ research papers on LLM benchmarking, including work on Llama and GPT.
• Worked on customization, hyperparameter tuning and generative model reverse engineering for Vicuna and Alpaca using PyTorch. Tested models on the HELM benchmarking platform.
Data Scientist August 2024 – December 2024
Clemson University International Center for Automotive Research Greenville, SC
• Assembled a data processing and analysis pipeline for data-driven optimal control of manufacturing lines for Bosch.
• Applied 10+ ML and statistical algorithms to mechanical parts data to recommend optimal input settings.
• Created a statistical parameter selection algorithm projected to reduce production of defective parts by 30%.
Machine Learning Researcher May 2022 – November 2022
Indian Institute of Technology Kharagpur, WB, India
• Experimented with Natural Language Processing (NLP) pipelines along with extensive research in 15+ LLMs and healthcare data.
• Fine-tuned all models to reach an accuracy of 84% or above on HL7 and SNOMED data.
• Built zero-shot learning, few-shot learning and Transformer models for text using Python, PyTorch, Hugging Face and Matplotlib.
Projects
Article Researcher LLM | LangChain, Python, RAG, Streamlit February 2025 – April 2025
• Built an AI-powered article analysis tool using NLP and semantic search to extract insights from PDFs and URLs, helping users make faster financial and research decisions.
• Integrated natural language Q&A and vector-based search using FAISS, enabling users to ask questions and get precise, source-linked answers.
• Designed an interactive Streamlit interface for easy file uploads, article browsing, and question answering.
Landslide Susceptibility Mapping | Python, NumPy, Keras, TensorFlow October 2022 – April 2023
• Implemented a U-Net architecture in Python using TensorFlow/Keras to perform semantic segmentation on multi-channel satellite imagery for land cover classification.
• Studied 30+ research papers on CNNs, image processing and masking.
• Preprocessed and visualized large geospatial datasets stored in HDF5 format, leveraging libraries like h5py and matplotlib for efficient data handling and exploratory analysis.
Technical Skills
Languages: C++, Java, Python, R, HTML, CSS, JavaScript, SQL, MATLAB, Kotlin
Frameworks: React, Node.js, Flask, Material-UI, Spark, REST API, GraphQL, FastAPI, Shoelace, Streamlit
Developer Tools: Git, Bitbucket, Docker, Google Cloud Platform, AWS, Databricks, VS Code, Visual Studio, Apache Airflow, MLflow, Power BI, Microsoft Office
Libraries: pandas, NumPy, Matplotlib, LangChain, scikit-learn, Spark, multiprocessing, PyTorch, ExpressJS