
Research Intern Data Scientist

Location:
Wilmette, IL
Posted:
November 06, 2022

Resume:

Gaurav Sharma

adtdqu@r.postjobfree.com GitHub: gaushh LinkedIn: gauravsharmajpr Mobile No.: +1-773-***-****

EDUCATION

Northwestern University MS, Artificial Intelligence 2022-2023

SRM Institute of Science and Technology B.Tech, Electronics and Communication Engineering 2015-2019

WORK EXPERIENCE

Data Scientist Extramarks Education India Pvt. Ltd., India Jul’20 - Aug’22

Built a slang detection module that works on Hindi-English code-switched messages. Curated a custom dataset for the problem and fine-tuned a BERT-based model to detect offensive messages. Achieved an accuracy of 96% on the test set.

Created a pipeline for an MCQ generation engine that automatically generates candidate MCQs from a given text. A T5 Transformer model was fine-tuned on the SQuAD dataset to generate a question given a keyphrase and its context. In the generation phase, keyphrases were extracted from the given learning content using EmbedRank. The distractors for each generated MCQ were then retrieved from ConceptNet, an open knowledge graph.

Developed a conversation assistant that understands users’ queries, determines their intent, extracts entities, and responds appropriately by answering FAQs and guiding users to relevant learning content when prompted.

Constructed an occlusion detection module for Extramarks’ proctoring engine to detect whether a user’s face is occluded while taking a test on the study app. Modified the architecture of the ResNet-50 model to fit the multi-label classification task, then trained it to predict the occlusion of the user’s eyes, nose, mouth, and chin.

Created a novel one-shot-learning-based document alignment system. The training dataset was synthetically generated by projecting a well-aligned template image onto random backgrounds. A ResNet-18 model was then trained to detect the four corners of the projected document, which are used to align the document.

RESEARCH EXPERIENCE

Research Intern IIT Guwahati, India Nov’19 - Apr’20

Developed a search engine to provide users with intelligent suggestions for ayurvedic medicines. Parsed PDFs containing ayurvedic medicine data, cleaned them and stored them in JSON format.

Trained a spell checker using character-level sequence-to-sequence GRUs. Synthetically generated training data by performing semi-random operations like adding, deleting, substituting, and transposing characters on the Wikipedia dump dataset.

Research Intern IIT Guwahati, India Jul’19 - Nov’19

Analyzed and compared the performance of evolutionary algorithms on the single-unit production planning problem.

Formulated a novel production planning approach by modifying the formulation to account for the impact of byproducts on revenue.

PROJECTS

Polynomial Simplifier Aug’22

Trained a seq2seq model with an attention mechanism to create a Polynomial Simplification module. The model takes a factorized single-variable polynomial as input and predicts its expanded sequence. Achieved Exact-Match performance of 90% while keeping the number of model parameters below 5M.

COVID Assistant Apr’21

Developed a conversation assistant to prevent disinformation during the second wave of the pandemic. Trained a DIET Transformer to detect the user’s intent and entities and reply appropriately to messages. The assistant could analyze users’ queries and provide helpline numbers and location-specific hospital bed availability data scraped from official portals.

Chicago Crime Data Debiasing Oct’22

Debiased the Chicago crime dataset by filtering out race-related information and adding alternate features. To add the features, US census data was combined with the crime data to retrieve socioeconomic information about the affected neighborhoods.

ACHIEVEMENTS

Co-authored paper ‘K-12BERT: BERT for K-12 education’ accepted at Artificial Intelligence in Education Conference, 2022.

Presented research paper titled ‘A MILP based GUI tool for the Single Unit Production Planning Problem’ in the International Conference on Soft Computing and Optimising Techniques, 2019.

Secured 1st position in the paper presentation held during Reflux’19, IIT Guwahati, for the performance analysis of sanitized Teaching-Learning Based Optimization, Particle Swarm Optimization, and Genetic Algorithm on the single-unit production planning problem.

Published research paper ‘Development of Hand Exoskeleton for Recuperation’ in the International Journal of Recent Technology and Engineering, 2019 as the first author.

Won 2nd prize for the working model of Scorpion - An All-Terrain Rescue Robot in Project Competition featuring 300+ teams.

SKILLS

Python, PyTorch, Machine Learning, Deep Learning, SageMaker, NLP, NumPy, Pandas, Computer Vision, Data Modelling, Docker, SQL


