Machine Learning Data Scientist

Location:
Hayward, CA
Posted:
February 17, 2025

Resume:

Tarun PASUMARTHI

ML Engineer and Data Scientist

linkedin.com/in/TarunPasumarthi github.com/TarunPasumarthi

510-***-**** ********@*****.***

EDUCATION

Dec 2020 Georgia Institute of Technology

Master of Science: Computer Science
Specialization: Machine Learning
GPA: 3.9
Relevant Coursework: Deep Learning, Machine Learning for Trading, Data and Visual Analytics, Computer Vision, Natural Language Processing, Web Search and Text Mining, Graduate Algorithms

Jun 2019 University of California San Diego (UCSD)

Bachelor of Science: Computer Engineering
Minor: Cognitive Science
Major GPA: 3.7
Relevant Coursework: Machine Learning, Computer Architecture, Systems Programming, Object-Oriented Programming, Data Structures, Algorithms, Front-end Development, Software Engineering
Teaching Assistant: Frontend Development (CSE 170)

TECHNICAL SKILLS

Languages: Python, Java, C/C++, JavaScript, SQL

ML Libraries: NumPy, Pandas, SciPy, Scikit-learn, PyTorch, NLTK, OpenCV, TensorFlow, Keras

ML / Data Tools: Apache Spark, Hadoop, Docker, Kubernetes, AWS S3, AWS Lambda, SageMaker, ChromaDB, MLflow

Web/Mobile Dev: Android Studio, React Native, Angular, Node

PROFESSIONAL EXPERIENCE

September 2024 to January 2025 Applied AI Scientist, Datadog, NEW YORK, NY

Enhanced a GenAI-powered coding agent, leveraging chain-of-thought reasoning and tool use to generate root cause analyses and bug fixes, minimizing resolution time for customer production errors

Built a Django web application to track evaluation progress, analyze agent conversation logs, and enable agent replays, allowing better performance visibility and facilitating prompt optimizations

Constructed an LLM-driven pipeline to create and run custom evaluation datasets identifying well-scoped bug fixes and unit tests from Datadog code repositories, enabling in-distribution evaluation and improving model benchmarking accuracy

Python Go LLM Agents Generative AI Prompt Engineering Django Streamlit PostgreSQL

March 2021 to March 2024 ML Engineer, eBay, SAN JOSE, CA

Developed and deployed a tax category predictor for e-commerce listings using a BERT-based architecture for multi-modal text and image classification, responsible for collecting over $2.5 billion quarterly in taxes across all major eBay sites

Implemented microservices for model inference and integrated them into the back-end pipeline

Created a BERT-based Harmonized System code classifier for collecting tariffs and duties, using a LoRA fine-tuned LLaMA model and RAG for domain-specific auto-labeling on training data, addressing the cold-start problem, our biggest pain point

Won 2nd place in the eBay Annual ML Hackathon 2022: Payment method ranker POC

Python BERT PyTorch Hadoop Spark Java LLMs LangChain PEFT RAG

June 2020 to August 2020 Data Engineering Intern, Ultimo Software Solutions, SAN JOSE, CA

Performed data ETL and analysis on Delta Dental’s grievances and appeals and customer service touch points, gathering data from various systems to provide a 360-degree view of customer interaction

Crafted dashboards to easily visualize customer interaction data

Tableau Python Data Analysis AWS S3

Feb 2018 to June 2019 Lead Android Developer, Felicity, SAN DIEGO, CA

Co-founded a UCSD-incubated startup based on Cognitive Behavioral Therapy (CBT), a self-help intervention for depression and anxiety

Led a team of 5 developers in designing, developing, and publishing an Android application that provides easy access to CBT while also implementing user authentication and ensuring encrypted data storage

Java Android Studio Firebase RSA Encryption

FEBRUARY 14, 2025 TARUN PASUMARTHI - RESUME 1

INDEPENDENT PROJECTS

VETERAN SUICIDE ANALYSIS 2018

Performed exploratory data analysis to examine how socio-economic factors affect veteran suicide rates

Won 1st place in the Halicioglu Data Science Institute Inauguration Competition

Exploratory Data Science Python NumPy Pandas Matplotlib Plotly

REDDIT AUTO-MODERATION 2020

Mined 14 political subreddits and created visualizations to depict their polarity via sentiment analysis

Implemented an auto-moderation pipeline based on contextual sentence completion for polarity detection, utilizing (at the time) state-of-the-art language models, ULMFiT and GPT-2

Deep Learning NLP Large Language Models Data Mining NLTK TensorFlow PRAW (Python Reddit API Wrapper)

COMPARING THE PERFORMANCE OF GENERATIVE MODELS 2020

Designed an evaluation pipeline to quantitatively measure the performance of variational auto-encoders and generative adversarial networks on image generation, using a pretrained convolutional neural network to yield accuracy metrics

Machine Learning Deep Learning Variational Auto Encoders (VAE) Generative Adversarial Networks (GAN) PyTorch

VARIATIONAL AUTO-ENCODER CLASSIFIER 2020

Implemented a novel classification approach utilizing the latent space of a pretrained VAE as input for various deep classifiers, yielding significant improvement in training time while sustaining accuracy

Exploratory Deep Learning PyTorch Keras

FLIGHT TRACKER 2021

Created a web-based, open-architecture application that provides granular, interactive visualizations of the most current air traffic and airport flight volume congestion statistics across user-selected time periods

Data Visualization Web Development D3.js Plotly Tableau OpenSky API Google Maps API

FANTASY FOOTBALL DATA ANALYSIS 2021

Developed a player scoring consistency metric using the Sortino ratio and performed correlation analysis with playoff occurrence percentage

Implemented hierarchical clustering based on the consistency metric to derive insights on player similarity in order to determine value picks in fantasy football drafts

Exploratory Data Science Data Visualization Pandas Matplotlib Plotly Scikit-learn

ACADEMIC PROJECT PAPERS

[1] Baid, A., Bhardwaj, A., Pasumarthi, T. (April 2020). REDDIT AUTO-MODERATION BY EVALUATING COMMUNITY OPINION. College of Computing, Georgia Institute of Technology.

[2] Jayaseelan, A., Pasumarthi, T. (December 2020). EXAMINATION OF VAE LATENT VECTOR BASED CLASSIFICATION. College of Computing, Georgia Institute of Technology.
