Machine Learning Data Scientist

Location:
Boston, MA
Posted:
November 13, 2024

Resume:

PRATIK P SANNAKKI

857-***-**** ****************@*****.*** LinkedIn GitHub My Portfolio Kaggle

EDUCATION

Northeastern University Boston, Massachusetts

Master of Science in Computer Engineering Expected May 2024

• Supervised ML, Unsupervised ML, Knowledge Graphs with LLMs, Machine learning in Fintech, Data Management and Database Design, Algorithms

Bangalore Institute of Technology Karnataka, India

Bachelor of Engineering in Computer Science Engineering August 2020

SKILLS

Programming Languages: Python, Cypher, C++, Apache NiFi Expression Language, Java, Node.js, HTML, JavaScript, C, SQL

Tools: Neo4j, AWS Bedrock, AWS SageMaker, S3, Snowflake, Azure, Docker, Power BI, Jupyter

Libraries: LangChain, RAGAS, LangSmith, AutoGluon, Keras, Pandas, Matplotlib, Scikit-learn, PyCaret

Database Technologies: MySQL, InfluxDB, MongoDB, Hadoop

WORK EXPERIENCE

ABBVIE Inc. California, USA

Data Scientist July 2024 – Present

• Designed and implemented a comprehensive evaluation framework for a RAG-based Gen AI bot, quantifying the performance of 7 specialized agents on tasks including text-to-SQL generation, document lookup, summarization, and data visualization.

• Assessed each agent’s output individually on retrieval and generation accuracy, ensuring each component’s alignment with user intent and query requirements. Employed frameworks such as RAGAS and techniques such as LLM-as-a-Judge, improving the bot’s performance by 30%.

• Developed metrics and end-to-end validation strategies to evaluate overall bot response quality, combining individual agent outputs and final responses. Utilized customized scoring for SQL accuracy, summarization relevance, and holistic response coherence, leading to improved agent synergy and enhanced user satisfaction in complex information retrieval tasks.

• Implemented a user input-to-canned question matching feature in a Gen AI bot, using vector stores to map inputs to predefined question-answer pairs. Explored encoding, matching, and chunking techniques to increase accuracy by 20% and reduce latency by 10%, resulting in faster, more relevant responses and an improved user experience.

• Monitored and analyzed LLM performance metrics using LangSmith, leveraging trace logging and detailed analysis to identify patterns, bottlenecks, and optimization opportunities.

• Applied Markov Chain analysis with multitouch attribution to map and analyze customer journeys, enhancing insights into conversion paths and optimizing marketing channel performance. Successfully identified key touchpoints, improving data-driven decision-making for campaign targeting and budget allocation.

Data Science Intern July 2023 - December 2023

• Proposed a solution to help identify and retain potential consumers leaving a subscription-based service

• Collaborated with stakeholders to understand business requirements and designed a CRISP-DM-based data science application that uses ensemble techniques to identify consumers churning from the subscription-based service within 24 hours of their first usage, increasing retention by 300 bps

• Devised an alternative graph ML approach that treats “word of mouth” as an influential factor in consumer churn, using KNN and degree centrality to establish relationships between consumers; the approach improved the churn model's performance by 10%

• Derived insights from existing data to identify potential treatments (discounts, offers, etc.) to help retain consumers

• Utilized uplift modeling to segment consumers into groups and identify potentially persuadable consumers for marketing campaigns, increasing overall revenue and consumer retention

• Represented Allergan Data Labs at AbbVie's Data Science Conference by presenting a poster on Consumer Identification and Retention Models

NTT PVT LTD. Karnataka, India

Associate Software Development Engineer September 2020 - August 2022

• Developed, with a team of 9, an all-in-one network asset monitoring website, resulting in over 10,000 physical and virtual networking assets being managed and controlled from a single point

• Used Apache NiFi to model, debug, and troubleshoot data pipelines for ETL on RESTful APIs, raise tickets in ServiceNow’s ITSM, and load data into databases such as InfluxDB and MongoDB

• Onboarded 3 clients to the network asset monitoring platform, increasing company revenue by 11.35% in Q4 of financial year 2021

• Mentored and guided 6 new joiners, presenting knowledge training sessions on core tools and technologies

DEFENCE RESEARCH DEVELOPMENT ORGANIZATION Karnataka, India

Research Assistant & Machine Learning Intern July 2019 - August 2020

• Performed time-series anomaly detection using LSTMs on simulated aircraft propulsion system data. The proposed solution reduced the risk of engine failure by 20% relative to an RNN, addressing its shortcomings

• Built a data analysis application that generates aircraft flight descriptive statistics reports using PyQt5 and the pandas library

PROJECTS

RESUME-ANZ January 2024 - Present

• Engineered a sophisticated chatbot to analyze candidate profiles comprehensively, surpassing surface-level resume querying.

• Used a context-based PaLM LLM to build knowledge graphs from unstructured data (resume PDFs), and fine-tuned an additional LLM for optimized English-to-Cypher query generation. Served the fine-tuned LLM with the knowledge graph using Retrieval-Augmented Generation (RAG) to build a user-friendly question-and-answer bot with Gradio.

• Evaluated the bot's performance on a sample of 100 resumes and 50 Q&A responses, achieving a success rate of 75%


