Hetav Patel
******@***.*** • 585-***-**** • Rochester, NY • GitHub
EDUCATION
ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY
Master of Science in Data Science (GPA: 4.0/4.0) Expected Dec 2025
Relevant Coursework: Fundamentals of Data Science, Explainable AI, Non-relational Database, Applied Statistics.
THE LNM INSTITUTE OF INFORMATION TECHNOLOGY Jaipur, India
Bachelor of Technology in Computer Science and Engineering (GPA: 3.2/4.0) May 2023
Relevant Coursework: Relational Database Management Systems, Data Mining, Artificial Intelligence.
SKILLS
Languages: Python, Java, SQL, Bash (Shell), C++, XML, MongoDB
Technical Frameworks: scikit-learn, pandas, NumPy, Seaborn, PyTorch, TensorFlow, Data Science pipeline (cleaning, analytics, engineering, visualization, predictive modeling, interpretation), Git, Docker
Technical Skills: Machine Learning, Deep Learning, Explainable AI, Large Language Models, NLP, Hypothesis Testing
WORK EXPERIENCE
ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY
Graduate Research Assistant Aug 2024 - Present
• Developing a novel Explainable AI (XAI) model to address transparency and robustness issues in deep learning-based cybersecurity systems.
• Evaluating 10 XAI models on fidelity, stability, and robustness to enhance explanation usability for security practitioners.
• Increasing interpretability by 30% using large language models (LLMs) for human-readable explanations.
ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY
Graduate Teaching Assistant Aug 2024 - Present
• Mentoring 40+ graduate students in data science, focusing on machine learning, SQL, and Python, improving project outcomes by 20%.
• Designing industry-relevant assignments and projects in collaboration with the faculty.
THE LNM INSTITUTE OF INFORMATION TECHNOLOGY Jaipur, India
Machine Learning Research Intern Sep 2022 - Mar 2023
• Collaborated with distinguished professors and students who proposed a unique animal behavior detection model and conducted tests, achieving an 81% success rate in identifying specific behaviors.
• Utilized DeepPoseKit to annotate 2000+ images of pets and generate skeletal data for pose estimation training.
• Employed a CNN model using PyTorch to predict animal poses for behavior analysis, with a top accuracy of 84%.
PROJECTS
GoodWorks: TEXT-BASED PREDICTION OF BOOK REVIEW POPULARITY
Natural Language Processing (NLP), Feature Engineering, Sentiment Analysis
• Analyzed 17 million Goodreads book reviews to gauge review popularity among books using multiple data pipelines.
• Employed NLTK for text processing techniques like sentiment analysis, tokenization, and lemmatization.
• Achieved a top accuracy of 82% in predicting review popularity with a tuned XGBoost algorithm and Logistic Regression.
SOFTWARE REFACTORING PREDICTION MODEL
Software Refactoring, Data Engineering, Feature Selection
• Achieved a top accuracy of 97.3% in predicting software refactoring opportunities with a custom Decision Tree, across 420 models created using Grid Search and only 1.2% of the original dataset.
• Performed cross-domain validation to assess model performance across 3 diverse datasets forked from Apache and GitHub.
• Conducted feature importance analysis with Random Forests, ranking the 52 key metrics driving refactoring predictions.
Sentinel: INTRUSION DETECTION SYSTEM USING KDD99
Cybersecurity, Principal Component Analysis, Data Augmentation
• Addressed the class imbalance in the 500,000 entries of the KDD99 dataset using SMOTE and Random Oversampling.
• Reduced model training time by 25% using techniques like PCA and K-Best Feature Selection.
• Attained a top accuracy of 98.6% on unseen data with LinearSVM and 97.5% using a Decision Tree Classifier.