HARSH JINESH PARIKH
213-***-**** Los Angeles, CA ********@***.***
github.com/HarshJParikh linkedin.com/in/hparikh2000
EDUCATION
University of Southern California, Los Angeles, CA January 2023 – December 2024
Master of Science in Applied Data Science GPA: 3.8/4.0
Relevant Coursework: Data Management; Machine Learning; Data Mining; Database Systems; Data Science for Business & Economics
University of Mumbai, India August 2018 – May 2022
Bachelor of Engineering in Electronics Engineering GPA: 9.10/10.0
IBM – Data Analytics Specialization January 2020 – May 2022
Relevant Coursework: Data Warehousing; Big Data Analytics; Applied Statistics (ANOVA, A/B Test, Chi-squared test); Predictive Analytics
SKILLS
Programming Languages: Python, SQL, Java, R, Scala
Frameworks & Libraries: PyTorch, TensorFlow, Keras, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Django, NLTK, OpenCV
Tools: AWS, PySpark, Hadoop, Docker, MongoDB, Qlik Sense, Tableau, Power BI, Firebase, Airflow, ETL Pipeline, LangChain, HuggingFace
Software Practices & Operations: Git, Linux/Unix, REST APIs, CI/CD Tools, Agile Methodologies, Jira, Unit Testing
EXPERIENCE
AI Engineer Intern – Delet, Los Angeles, CA May 2024 – July 2024
• Designed an NLP-driven chatbot using MongoDB to answer routine customer queries in real time, minimizing the need for human intervention on frequently asked questions and reducing support staff workload by 37%.
• Collaborated with developers to integrate AI-based automation into a real estate platform using Streamlit, optimizing property management workflows, decreasing response times, and enriching user experience, resulting in a 15% increase in user engagement.
Data Science Intern – Hope Digital, Mumbai, India June 2022 – November 2022
• Fine-tuned DistilBERT using PyTorch for sentiment analysis on customer feedback, improving accuracy by 20% through better handling of industry-specific language and yielding more precise insights.
• Developed a recommendation engine using a bag-of-words model and TF-IDF vectors, enhancing personalization of product suggestions and improving recommendation relevance by 25%, which significantly increased customer engagement.
Software Developer Engineer Intern – Phemesoft, Mumbai, India June 2021 – December 2021
• Developed a Django-based job portal with OAuth authentication and AJAX integration, modernizing job posting and application management, which reduced hiring time by 30% through faster candidate processing and systematized application workflows.
• Implemented Test-Driven Development and CI/CD pipelines, automating testing and deployment, which improved project efficiency by 16%, and enabled real-time dashboards for faster, more reliable deployments.
• Presented AI-powered features built with IBM Watson and Docker, raising recommendation accuracy by 34% and demonstrating the portal's scalability to IBM experts, accelerating talent matching for businesses.
PROJECTS
TABot: Smart Teaching Assistant Fine-tuning Llama-3, GPT-4, RAG, Multimodal Agent, AWS, Reinforcement Learning, Tokenizer
• Developed TABot, a multimodal virtual teaching assistant agent, by fine-tuning Llama-3 and GPT-4 with Retrieval-Augmented Generation (RAG), training on 150+ lecture transcripts and 200+ forum discussions, improving relevance of query-specific responses by 30%.
• Incorporated Reinforcement Learning from Human Feedback (RLHF) with AWS Vector DB, OpenAI embeddings, and tokenizing techniques to refine chatbot policies for more reliable and accurate responses.
Predictive Analytics for Healthcare Operations Python, SQL, Airflow, ETL, Apache Spark, XGBoost, Power BI
• Processed over 3 million patient records using Python, Apache Spark, and SQL, and used Airflow ETL pipelines to orchestrate and optimize workflows through task parallelization and efficient scheduling, reducing processing time by 20%.
• Experimented with XGBoost to develop a predictive model, achieving 86% prediction accuracy on patient data, contributing to more accurate insights and improved healthcare outcomes.
• Built real-time dashboards with Power BI, integrating data via REST APIs to automate reporting and provide up-to-date insights, enabling faster decision-making and boosting resource allocation efficiency by 30% through optimal use of personnel and financial resources.
Architectural Style and Landmark Classification ARIMA, Transfer Learning, Data Augmentation, VGG, ResNet
• Engineered a dual-classification model for architectural style and landmark recognition using transfer learning and data augmentation with VGG16 and ResNet50 architectures, achieving ~96% accuracy and improving automated content curation for travel and cultural heritage platforms.
• Assisted in creating a data pipeline for over 420 images, improving dataset quality, and applied ARIMA for time-series analysis of visual data trends, crucial for virtual tourism and digital archiving.
LEADERSHIP & IMPACT
• Directed efforts as a Graduate Teaching Assistant, managing projects for 300 students in Foundations in Data Management and Introduction to Computational Thinking and Data Science, adopting tools like Hadoop, SQL, PySpark, and AWS, which standardized workflows and enhanced students' project performance and overall course satisfaction.
• Mentored K-12 students in Python, Java, and drone courses as a Summer Teaching Assistant at CS@SC Coding Camps, managing lessons and fostering student engagement.