Cloud Data Scientist

Location: Brookline, MA
Posted: April 08, 2024

Resume:

Likith Venkatesh Gowda Prathima

ad4u9c@r.postjobfree.com 857-***-**** Brookline, MA LinkedIn

EXPERIENCE

Khoury Graduate Teaching Assistant, Khoury College of Computer Sciences (CS 4120) Jan 2024 – Present

• Achieved a 20% increase in student interest by restructuring the course around Large Language Models for real-world Natural Language Processing applications.

• Implemented engaging activities that led to a 30% rise in student engagement and participation in the NLP course.

Data Scientist Intern, Lifesight Jan 2022 – Jul 2022

• Developed data-driven attribution models and performed predictive and behavioral analysis on mobility, audience, and location data using PySpark and cloud data platforms (GCP BigQuery), resulting in a 24% increase in revenue.

• Led the development and management of CI/CD pipelines using cloud data platforms and tools such as Apache Airflow.

• Reduced Proof of Concept (POC) delivery time by over 60% by establishing seamless collaboration between the Sales/CSM teams and the Engineering team.

• Facilitated rapid provisioning of ad-hoc data, implemented data enrichments, and delivered analytical solutions through efficient CI/CD pipelines.

• Worked closely with the Solutions team, using Python, SQL, Jenkins, Hadoop, Spark, and cloud data platforms to design complex data modeling scenarios, automate processes, and perform transformations, reducing overall lead time by 40%.

ML Engineer Intern, ML Labs Sep 2021 – Nov 2021

• Contributed to a deep tech product using Python, TensorFlow, and OpenCV for real-time performance assessment of city services, enhancing service quality through AI-driven automation.

• Implemented a machine learning model for a US-based inventory management company, using image classification and semantic segmentation to minimize accidents and enhance workplace safety on the factory floor.

EDUCATION

Northeastern University Sep 2022 – Apr 2024

Master of Science in Computer Science GPA: 3.91

Related Courses: Algorithms, Natural Language Processing, Machine Learning, Program Design Paradigm, Foundations of AI, AI in Human-Computer Interaction, Database Management Systems, and Mobile Application Development.

Visvesvaraya Technological University (VTU) Aug 2018 – Jul 2022

Bachelor of Engineering in Information Science and Engineering GPA: 3.66

Related Courses: Design and Analysis of Algorithms, Cloud Computing, Object Oriented Programming, Machine Learning & Artificial Intelligence, Operating Systems, Complex Analysis using Probability & Statistics, UNIX.

SKILLS

Programming Languages: Python, R (Programming Language), Java, PostgreSQL, MySQL, UNIX.

Tools: GitHub, Kubernetes, Docker, GitBash, Terraform, AWS S3, Apache Airflow, Apache Spark, Postman.

Certifications: Foundations of Data Science (PadhAI), Essentials of Data Science using R software (NPTEL), Business Analytics for Decision Making (NPTEL), Data Science for Engineers (NPTEL).

Leadership: Education Support Mentor (Make A Difference), Student Body Placement Head (CMRIT), Mentor (NEU).

PROJECTS

Automated Medical Recommendation System using ML & NLP for Cancer Treatment (Publication)

Python, DenseNet121, Random Forest, Long Short-Term Memory, BERT, Shapley Additive Explanations (SHAP)

• Engineered a medical support system that analyzes medical reports and radiological images to predict the occurrence of abnormalities in the body with an accuracy of around 78%, complemented by an interactive chatbot.

Toxicity Tune: Dual-Model Approach for Generating and Scoring Toxic Comments

NLTK, TensorFlow, PyTorch (Git)

• Built a dual-model approach using BERT and miniGPT to identify toxic comments, achieving 85.44% accuracy. Tuned hyperparameters, including optimization algorithms, batch sizes, and vocabulary sizes, to reduce model loss by 55%, training on A100 GPUs.

• Implemented k-fold cross-validation (k=14) to evaluate 14 distinct models, achieving consistent loss values between 0.27 and 0.35 and giving a statistically reliable assessment of model performance.

Sleep Cycle Estimation in Wearable Devices

LightGBM, CatBoost, XGBoost, Random Forest, KFold CV (Git)

• Implemented a robust feature engineering pipeline for accelerometer data, incorporating key metrics such as ENMO, anglez, and their variations over specified periods, optimizing for efficient sleep onset and wakeup event detection.

• Conducted a comprehensive analysis and comparison of LightGBM, CatBoost, XGBoost, and Random Forest models for sleep cycle-based time series logs through rigorous fine-tuning and feature engineering, evaluating their individual strengths and performance characteristics in the context of wrist-worn accelerometer data.

StoryGen

PyTorch, DPR, DPO, QLoRA, RAG-Token

• Directed a team in achieving a 30% increase in human evaluators' ratings for narrative coherence and engagement compared to baseline models. Spearheaded the implementation of novel techniques resulting in a 25% improvement in user engagement metrics, including reader retention and social media shares.

• Implemented an innovative approach leading to a 40% reduction in coherence errors and a 20% increase in thematic consistency, as measured by quantitative metrics.


