Tarun PASUMARTHI
ML Engineer and Data Scientist
linkedin.com/in/TarunPasumarthi github.com/TarunPasumarthi
510-***-**** ********@*****.***
EDUCATION
Dec 2020 Georgia Institute of Technology
Master of Science: Computer Science
Specialization: Machine Learning
GPA: 3.9
Relevant Coursework: Deep Learning, Machine Learning for Trading, Data and Visual Analytics, Computer Vision, Natural Language Processing, Web Search and Text Mining, Graduate Algorithms
Jun 2019 University of California San Diego (UCSD)
Bachelor of Science: Computer Engineering
Minor: Cognitive Science
Major GPA: 3.7
Relevant Coursework: Machine Learning, Computer Architecture, Systems Programming, Object-Oriented Programming, Data Structures, Algorithms, Front-end Development, Software Engineering
Teaching Assistant: Frontend Development (CSE 170)
TECHNICAL SKILLS
Languages Python, Java, C/C++, JavaScript, SQL
ML Libraries NumPy, Pandas, SciPy, Scikit-learn, PyTorch, NLTK, OpenCV, TensorFlow, Keras
ML / Data Tools Apache Spark, Hadoop, Docker, Kubernetes, AWS S3, AWS Lambda, SageMaker, ChromaDB, MLflow
Web/Mobile Dev Android Studio, React Native, Angular, Node
PROFESSIONAL EXPERIENCE
September 2024 - January 2025 Applied AI Scientist Datadog, NEW YORK, NY
Enhanced a GenAI-powered coding agent, leveraging chain-of-thought reasoning and tool use to generate root cause analyses and bug fixes, minimizing resolution time for customer production errors
Built a Django web application to track evaluation progress, analyze agent conversation logs, and enable agent replays, allowing better performance visibility and facilitating prompt optimizations
Constructed an LLM-driven pipeline to create and run custom evaluation datasets by identifying well-scoped bug fixes and unit tests in Datadog code repositories, enabling in-distribution evaluation and improving model benchmarking accuracy
Python Go LLM Agents Generative AI Prompt Engineering Django Streamlit PostgreSQL
March 2021 - March 2024 ML Engineer eBay, SAN JOSE, CA
Developed and deployed a tax category predictor for e-commerce listings using a BERT-based architecture for multi-modal text and image classification; the predictor is responsible for collecting over $2.5 billion in taxes quarterly across all major eBay sites
Implemented microservices for model inference and integrated them into the back-end pipeline
Created a BERT-based Harmonized System code classifier for collecting tariffs and duties, using a LoRA fine-tuned LLaMA model and RAG for domain-specific auto-labeling of training data, addressing the cold start problem, our biggest pain point
Won 2nd place in the eBay Annual ML Hackathon 2022: Payment method ranker POC
Python BERT PyTorch Hadoop Spark Java LLMs LangChain PEFT RAG
June 2020 - August 2020 Data Engineering Intern Ultimo Software Solutions, SAN JOSE, CA
Performed data ETL and analysis on Delta Dental’s grievances, appeals, and customer service touch points, gathering data from various systems to provide a 360-degree view of customer interactions
Crafted dashboards to easily visualize customer interaction data
Tableau Python Data Analysis AWS S3
Feb 2018 - June 2019 Lead Android Developer Felicity, SAN DIEGO, CA
Co-founded a UCSD-incubated startup based on Cognitive Behavioral Therapy (CBT), a self-help intervention for depression and anxiety
Led a team of 5 developers in designing, developing, and publishing an Android application that provides easy access to CBT therapy while also implementing user authentication and ensuring encrypted data storage
Java Android Studio Firebase RSA Encryption
FEBRUARY 14, 2025 TARUN PASUMARTHI - RESUME 1
INDEPENDENT PROJECTS
VETERAN SUICIDE ANALYSIS 2018
Performed exploratory data analysis to examine how socio-economic factors affect veteran suicide rates
Won 1st place in the Halicioglu Data Science Institute Inauguration Competition
Exploratory Data Science Python NumPy Pandas Matplotlib Plotly
REDDIT AUTO-MODERATION 2020
Mined 14 political subreddits and created visualizations to depict their polarity via sentiment analysis
Implemented an auto-moderation pipeline based on contextual sentence completion for polarity detection, utilizing the then state-of-the-art language models ULMFiT and GPT-2
Deep Learning NLP Large Language Models Data Mining NLTK TensorFlow PRAW (Python Reddit API Wrapper)
COMPARING THE PERFORMANCE OF GENERATIVE MODELS 2020
Designed an evaluation pipeline to quantitatively measure the performance of variational auto-encoders and generative adversarial networks on image generation, using a pretrained convolutional neural network to yield accuracy metrics
Machine Learning Deep Learning Variational Auto Encoders (VAE) Generative Adversarial Networks (GAN) PyTorch
VARIATIONAL AUTO-ENCODER CLASSIFIER 2020
Implemented a novel classification approach utilizing the latent space of a pretrained VAE as input for various deep classifiers, yielding significant improvement in training time while sustaining accuracy
Exploratory Deep Learning PyTorch Keras
FLIGHT TRACKER 2021
Created a web-based, open-architecture application that provides granular, interactive visualizations of the most current air traffic and airport flight volume congestion statistics across user-selected time periods
Data Visualization Web Development D3.js Plotly Tableau OpenSky API Google Maps API
FANTASY FOOTBALL DATA ANALYSIS 2021
Developed a player scoring consistency metric using the Sortino ratio and performed correlation analysis with playoff occurrence percentage
Implemented hierarchical clustering based on the consistency metric to derive insights on player similarity in order to determine value picks in fantasy football drafts
Exploratory Data Science Data Visualization Pandas Matplotlib Plotly Scikit-learn
ACADEMIC PROJECT PAPERS
[1] Baid, A., Bhardwaj, A., Pasumarthi, T. (April 2020). REDDIT AUTO-MODERATION BY EVALUATING COMMUNITY OPINION. College of Computing, Georgia Institute of Technology.
[2] Jayaseelan, A., Pasumarthi, T. (December 2020). EXAMINATION OF VAE LATENT VECTOR BASED CLASSIFICATION. College of Computing, Georgia Institute of Technology.