Research Intern Personal Assistant

Location:

Cumming, GA

Posted:

May 13, 2023


Resume:

Rishikesh Fulari

LinkedIn @rishikeshfulari GitHub @rishikeshF Hugging Face @rishikesh Medium @rishikeshfulari

+1-470-***-**** ***************@*****.***

WORK EXPERIENCE

● Deep Learning Researcher, Purdue University, Fort Wayne (Part-time) Jan 2023 - Present

- Researching animal locomotion and devising efficient methods for markerless pose estimation based on transfer learning with deep neural networks, using the DeepLabCut framework.

● Machine Learning Engineer, Get Up For Change Private Limited, Bangalore July 2021 - July 2022

- Achieved a 98% precision score for an address-matching machine learning model by implementing innovative feature engineering techniques, which led to a 30% reduction in manual work for data analysts.

- Pretrained and prepared large language models from scratch on a dataset of more than 6 million records, resulting in a named entity recognition system that doubled the operations team’s efficiency.

- Collected, preprocessed, cleaned and analyzed millions of address records to prepare datasets for machine learning and deep learning models. Wrote a script to generate several million artificial addresses from existing datasets.

- Conducted research on text similarity matching for Indian addresses and devised innovative approaches that formed the foundation for the ongoing research work there.

● Data Scientist, Celebal Technologies, Jaipur Apr 2021 - Jul 2021

- Fine-tuned Detectron2 for an image recognition project on a dataset of around 100,000 documents.

- Extracted and organized invoice details from PDF documents using computer vision.

● NLP Research intern, Indian Institute of Information Technology, Allahabad Jan 2021 - Mar 2021

- Researched multi-domain dialogue state tracking alongside PhD candidates under the guidance of Prof. U. S. Tiwary to build a corpus of conversations in native Hindi.

- Examined, organized, translated and maintained conversational dialogues in text format from English to Hindi using Python, for building a personal assistant system in the native language.

● Digital Marketing Consultant, Hashtag Technologies, Coimbatore July 2020 - Dec 2020

- Devised a content creation strategy and managed a team of content creators for several clients to improve their organic reach through search engine optimisation.

- Conducted content audits and technical SEO audits of client websites.

● Operations Executive, Infosys, Nagpur July 2019 - July 2020

- Successfully executed network engineering tasks for US-based telecommunications clients.

- Designed logical circuits, configured switches and routers, and resolved client-side technical issues.

SKILLS

● Languages & Frameworks: Python, C, C++, Java, SQL, R, Pandas, TensorFlow, PyTorch, NumPy

● Tools & Technologies: Docker, GCP, AWS, Git, GitHub, scikit-learn, Streamlit, Gradio, spaCy

EDUCATION

● Purdue University, Fort Wayne Aug 2022 - May 2024

Master of Science in Computer Science (MSCS) CGPA: 4.0/4.0

● University of Hyderabad, Hyderabad (Distance Learning - Online) Feb 2021 - Mar 2022

Diploma in Machine Learning and Artificial Intelligence (PGDMLAI) CGPA: 4.0/4.0

● Pune University, Pune May 2016 - May 2019

Bachelor of Science in Computer Science (BS) CGPA: 3.84/4.0

OPEN SOURCE CONTRIBUTION

● Unify-ivy is an open source framework that unifies the major ML frameworks, namely PyTorch, TensorFlow, NumPy and JAX. For example, model code written in PyTorch can be run in a pipeline written in TensorFlow. It also lets developers use any backend of their choice, regardless of the framework the model was developed in, and supports all hardware (such as CPU, GPU and TPU). My pull request for the ‘standard normal’ function was successfully merged into their codebase.

PROJECTS

● Predicting the readmission rate of diabetic patients using machine learning for better healthcare. [blog][code]

- Implemented an end-to-end machine learning pipeline, from data cleaning to model deployment on an AWS EC2 instance, demonstrating full stack machine learning skills.

- Hospitals in the USA are penalized by the government if a patient is readmitted within 30 days, yet they have no means of predicting which patients will be readmitted. This project addresses the problem by using machine learning to predict which patients are likely to be readmitted within 30 days.

● Predicting the likelihood of conversion of a free-tier user to a paid one for an Ed-tech company. [blog][code][demo]

- Implemented an end-to-end machine learning model to predict whether a free-tier user would buy a subscription to the e-learning platform ‘365 data science’, using real-world platform analytics data.

- Data was provided by the Ed-tech platform, and the final model was deployed on Hugging Face Spaces using Streamlit as the front-end framework.

● Developed and deployed an application to generate captions for visually challenged people. [demo]

- Implemented a deep learning model using pretrained models from Hugging Face to generate captions for images; the captions are then fed to a text-to-speech API that reads them aloud. This project was made to help differently abled people browse image content on the internet.

● Identifying duplicate questions using machine learning [demo]

- Implemented an end-to-end machine learning model to predict whether two given questions have the same semantic meaning.

- Used the Quora question pair similarity dataset and embeddings from a pretrained model to detect semantic similarity. Forums and Q&A sections are often filled with duplicate entries; this project aimed to find those duplicate questions using recent advances in natural language processing, such as embeddings from pretrained models.

● Predicting user engagement for celebrity tweets. [demo]

- Developed and deployed a machine learning model that predicts user engagement - the number of retweets a particular tweet would get, based on the semantic meaning and timestamp of the tweet.

- Scraped Twitter data from Justin Bieber’s Twitter account and used it as training data to predict the number of retweets his tweets would get.

● Devised a named entity recognition system to convert unstructured Indian addresses to structured ones. [product]

- Indian addresses do not follow any particular structure and therefore suffer from plenty of formatting and spelling mistakes. This project aimed at implementing a named entity recognition model to convert those unstructured addresses to structured ones.

- Collected address data from various publicly available sources like OpenStreetMap, government websites, Indian postal department, etc. Created a synthetic dataset of 90 million addresses and trained a language model using masked language modeling.

- Further, fine-tuned this model for the downstream NLP task of named entity recognition to identify and label the different components of an address. This is an industry project and is currently being used by my past employer.

CERTIFICATIONS AND COURSES

● Machine Learning by Stanford University (through Coursera)

● TensorFlow Developer Professional Certificate by DeepLearning.AI (through Coursera)

● Natural Language Processing Nanodegree (through Udacity)
