Data Scientist

Location:
Rochester, NY
Posted:
October 03, 2024

Resume:

Hetav Patel

******@***.*** • 585-***-**** • Rochester, NY • GitHub

EDUCATION

ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY

Master of Science in Data Science (GPA: 4.0/4.0) Expected Dec 2025

Relevant Coursework: Fundamentals of Data Science, Explainable AI, Non-relational Database, Applied Statistics.

THE LNM INSTITUTE OF INFORMATION TECHNOLOGY Jaipur, India

Bachelor of Technology in Computer Science and Engineering (GPA: 3.2/4.0) May 2023

Relevant Coursework: Relational Database Management Systems, Data Mining, Artificial Intelligence.

SKILLS

Languages: Python, Java, SQL, Bash (Shell), C++, XML, MongoDB

Technical Frameworks: scikit-learn, pandas, NumPy, Seaborn, PyTorch, TensorFlow, Data Science pipeline (cleaning, analytics, engineering, visualization, predictive modeling, interpretation), Git, Docker

Technical Skills: Machine Learning, Deep Learning, Explainable AI, Large Language Models, NLP, Hypothesis Testing

WORK EXPERIENCE

ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY

Graduate Research Assistant Aug 2024 - Present

• Developing a novel Explainable AI (XAI) model to address transparency and robustness issues in deep learning-based cybersecurity systems.

• Evaluating 10 XAI models on fidelity, stability, and robustness to enhance explanation usability for security practitioners.

• Increasing interpretability by 30% using large language models (LLMs) for human-readable explanations.

ROCHESTER INSTITUTE OF TECHNOLOGY Rochester, NY

Graduate Teaching Assistant Aug 2024 - Present

• Mentoring 40+ graduate students in data science, focusing on machine learning, SQL, and Python, improving project outcomes by 20%.

• Designing industry-relevant assignments and projects in collaboration with the faculty.

THE LNM INSTITUTE OF INFORMATION TECHNOLOGY Jaipur, India

Machine Learning Research Intern Sep 2022 - Mar 2023

• Collaborated with professors and students who proposed a unique animal behavior detection model, and conducted tests that achieved an 81% success rate in identifying specific behaviors.

• Utilized DeepPoseKit to annotate 2000+ images of pets and generate skeletal data for pose estimation training.

• Employed a CNN model built with PyTorch to predict animal poses for behavior analysis, with a top accuracy of 84%.

PROJECTS

GoodWorks: TEXT-BASED PREDICTION OF BOOK REVIEW POPULARITY

Natural Language Processing (NLP), Feature Engineering, Sentiment Analysis

• Analyzed 17 million Goodreads book reviews to gauge review popularity among books using multiple data pipelines.

• Employed NLTK for text processing techniques like sentiment analysis, tokenization, and lemmatization.

• Achieved a highest accuracy of 82% in predicting review popularity with a tuned XGBoost algorithm and Logistic Regression.

SOFTWARE REFACTORING PREDICTION MODEL

Software Refactoring, Data Engineering, Feature Selection

• Achieved a top accuracy of 97.3% in predicting software refactoring opportunities with a custom Decision Tree, across 420 models created using Grid Search and only 1.2% of the original dataset.

• Performed cross-domain validation to assess model performance across 3 diverse datasets forked from Apache and GitHub.

• Conducted feature importance analysis with Random Forests, ranking the 52 key metrics driving refactoring predictions.

Sentinel: INTRUSION DETECTION SYSTEM USING KDD99

Cybersecurity, Principal Component Analysis, Data Augmentation

• Addressed the class imbalance in the 500,000 entries of the KDD99 dataset using SMOTE and Random Oversampling.

• Reduced model training time by 25% using techniques such as PCA and K-Best Feature Selection.

• Attained a top accuracy of 98.6% on unseen data with LinearSVM and 97.5% using a Decision Tree Classifier.


