Kumar Kishalaya
San Francisco, CA 564-***-**** ***************@*****.*** LinkedIn GitHub Medium AWS
SUMMARY
Experienced Data Scientist with 6+ years in a fast-growing E-recruitment Start-up (SaaS). Master’s degree in Business Analytics from UC Davis. Specialize in A/B testing, Marketing & Product Analytics, ML algorithms, Recommendation systems, NLP, GenAI & RAG-based solutions. Strong communication & collaboration skills with a proven record of building and leading a data team.
WORK EXPERIENCE
Data Scientist (Internship), UC Davis Health, Sacramento, CA Sep 2023 – Jun 2024
• Developed RAG-based customer support chatbot to enhance query response accuracy and speed.
• Developed a classification model to predict late-arriving and no-show patients, achieving ~80% F1 score. Devised intervention strategies based on the model's predictions, leading to a 10% increase in provider-patient face time.
• Designed & built 5+ Tableau dashboards to monitor clinic KPIs, saving 8-10 hours of planning & analysis time weekly.
Senior Data Scientist (Team Lead), Internshala, Gurugram, India Aug 2021 – Apr 2023
• Collaborated with senior management in key decision-making, helping shape product roadmap and develop yearly business projections & goals by quantitative analysis of historical data from 2019 to 2022, achieving 95% projection accuracy.
• Built and maintained 15+ KPI dashboards enabling the business team to track vital metrics such as ARPU, Churn, and CLTV-CAC and to identify improvements to the subscription-based model, contributing to a ~20% increase in revenue.
• Designed and conducted 10+ A/B tests & root cause analysis on key metrics fixing major product leaks and improving CTR, user acquisition/retention, resulting in an increase in engagement by ~30% over a 2-year period.
• Built a recommendation engine to enhance personalization on the ATS. Leveraged a pre-trained word2vec model to extract textual features and combined them with numerical features, resulting in an NDCG score of 0.85 and a 10% decrease in hiring time.
• Mentored & managed 5 direct reports on 15+ analytics projects over 2 years; facilitated cross-industry meetups for knowledge sharing & talent attraction.
Data Scientist, Machine Learning, Internshala, Gurugram, India Sep 2018 – Jul 2021
• Engineered a content-based recommendation model to match relevant candidates with jobs. Combined vector-based similarity with heuristic methods and deployed the model via a REST API, reducing hiring time by ~15%.
• Developed an NLP (smart reply) model that increased chat response rate by 35%, boosting platform-level engagement. Leveraged LDA to generate labels and a CNN+LSTM layer for prediction, achieving a top-3 accuracy of ~80%.
• Implemented an ensemble tree-based text classifier for identifying messages requiring response, achieving a 20%+ improvement in chat initiation rates. Utilized pre-trained self-attention-based BERT embeddings as textual features.
• Built a fraud detection classifier using BOW and TF-IDF, reducing malicious employer chat messages by 50% and achieving superior inference time compared to deep learning models with similar accuracy.
Marketing Analyst, Internshala, Gurugram, India Jan 2017 – Aug 2018
• Utilized advanced SQL queries to structure and fulfill key data requirements with over 95% accuracy, ensuring smooth day-to-day operations across multiple departments and driving informed decision-making for effective marketing strategies.
• Responsible for marketing analysis & projection for B2B channel and led external partner meetings aiding growth and partnership with 2000+ colleges in India resulting in ~500K monthly user registrations at its peak.
PROJECTS
• Finetuning LLMs – Next-word prediction, chatbot, Q/A and translation on RNN, LSTM, finetuned GPT-2, Llama-2-7b
• BERT – Analyzed the impact of transfer learning for sequential recommendation on a books dataset using only the book title.
• RAG – Engineered Agentic-RAG for structuring and retrieving research paper summary, outperforming naive RAG significantly.
TECHNICAL SKILLS
Languages: Python, SQL, R, SAS, PyTorch, TensorFlow, Keras, Unix/Linux
Database/Tools: Excel, RDB, Tableau, Power BI, Pandas, NumPy, Scikit, NLTK, spaCy, SQL Server, Git, LangChain, Hugging Face
Statistics: Hypothesis Testing, Causal Inference, Multivariate Testing, Non-Parametric tests, Survival Analysis, Probability
Machine Learning: Statistical Modeling, Feature Engineering, Regression, Classification, Clustering, Support Vector Machines, Recommender Systems, PCA, KNN, Decision trees, Random Forest, GBM, XGBoost, ARIMA, ML Ops, Deep Neural Networks
DL/NLP: CBOW, TF-IDF, Word2Vec, CNN, RNN, LSTM, Transformers, Prompt Engineering, Finetuning LLMs, LoRA/QLoRA, RAG
Parallel Processing: ETL, Kinesis, Apache Spark, PySpark, S3, Kafka, GCP, AWS, Azure, Redshift, Hadoop, AWS SageMaker
EDUCATION
Master of Science in Business Analytics, University of California, Davis, CA, USA Jul 2023 – Jun 2024
Bachelor of Technology in Mechanical Engineering, Kurukshetra University, Haryana, India Aug 2012 – Jun 2016