Chandan Kumar Tripathi
MA, USA +1-508-***-**** **.**********@*****.*** LinkedIn
PROFESSIONAL SUMMARY
Data Scientist and Analytics Consultant with over 8 years of experience across Insurance, Healthcare, Manufacturing, and Telecommunication. Skilled in developing and deploying machine learning models, leading a team of five data scientists and analysts in an agile environment, and translating business challenges into AI-powered solutions.
CORE SKILLS & TECHNOLOGIES
• Programming Languages: Python, R, PySpark (Model development, Deployment)
• Database Languages: SQL (Databases – Oracle, DB2, MySQL, Teradata)
• Machine Learning: Linear Regression, Logistic Regression, Decision Tree, Random Forest, XGBoost, CatBoost, KNN, SVM, Clustering (Hierarchical, DBSCAN, KMeans), Dimensionality Reduction (PCA), Time Series Forecasting (ARIMA), Neural Networks, CNN, RNN, Transformers, NLP (NLTK, Web Scraping, Text Mining, Sentiment Analysis, Naïve Bayes Algorithm, Topic Modelling), Model Evaluation and Validation (ROC, AUC), Market Basket Analysis / Apriori Algorithm, Bayesian Algorithm, Collaborative Filtering, Marketing Mix Modeling (MMM)
• GenAI: LLM, LangChain, Hugging Face Transformers, RAG (Retrieval-Augmented Generation), Vector Stores (FAISS, Pinecone), OpenAI / GPT-4 API Integration, Prompt Engineering, LLM-based Pipelines, Unstructured Text Summarization
• Cloud Applications: AWS – Training and Deployment of Models using AWS SageMaker, AWS Lambda
• Business Intelligence & Data Visualization: Tableau, Power BI, Matplotlib, Seaborn, Plotly, KPI Dashboards, A/B Testing
• Data Analysis: EDA, Feature Engineering, ANOVA, Hypothesis Testing, Information Value (IV), Variance Inflation Factor (VIF), Parametric and Non-Parametric Tests
• Industry/Domain Expertise: Insurance, Healthcare, Manufacturing, Telecommunication
PROFESSIONAL EXPERIENCE
AIG, MA, USA
Data Scientist
Aug 2024 – Present
Project: Fraud Detection Model (General Insurance)
• Worked with a leading US-based general insurance company, serving 3.5 million active customers, to address fraud responsible for $2.6 billion in annual revenue losses.
• Led the development of an enhanced fraud detection model, focusing on improving detection accuracy and minimizing financial losses from fraudulent claims, which involve about 2% of customers annually.
• Analyzed key business drivers contributing to fraud, providing actionable insights to help define the company’s sales and business strategy.
• Worked closely with cross-functional teams to ensure the deployment of machine learning solutions that directly aligned with the company’s goals of reducing fraud and optimizing revenue.
• Enhanced the traditional insurance fraud detection model by incorporating unstructured claim notes using LLM-based summarization and red-flag extraction via LangChain & Hugging Face pipelines.
• Implemented a RAG architecture to integrate structured transaction data with contextual insights, improving fraud signal accuracy and early detection.
CitiusTech, India
Consultant – Data Science
Jan 2021 – Jul 2023
Project: Patient Readmission Prediction (Healthcare)
• Developed healthcare claims analytics models, improving model accuracy and reducing readmission rates by 10%, while optimizing risk-scoring algorithms to enhance patient care and revenue impact.
• Built NLP models using spaCy and Transformers to extract insights from unstructured clinical data, significantly improving decision-making accuracy and reducing operational costs.
• Integrated model explainability techniques (e.g., SHAP, LIME) to enhance transparency, build trust, and ensure regulatory compliance for classification models, supporting business strategy.
• Led A/B testing initiatives to optimize model performance, improving business decision-making efficiency by 25%. Worked with insurance datasets to improve predictive modeling accuracy while maintaining compliance with regulatory standards.
Hexaware Technologies, India
Data Scientist
Feb 2018 – Dec 2020
Project: Involuntary Churn Prediction (Telecom Client – Australia)
• Built a machine learning-based involuntary churn prediction model for both consumer and enterprise segments of a major telecom provider in Australia, targeting retention strategies across 5.8 million active postpaid users.
• Designed a propensity-to-pay classification model to forecast delinquency risk among customers with overdue balances, enabling early intervention and improved collections.
• Enabled targeted campaigns for 35% of at-risk customers, preventing potential churn and contributing to a revenue impact of approximately AUD 200 million, while reducing bad debt exposure by identifying 4.5% of high-risk accounts annually.
• Led model development and delivery within an agile framework, ensuring alignment with business goals and efficient execution across cross-functional data science and analytics teams.
Intex Technologies, India
Data Analyst
Aug 2015 – Jan 2018
Project: Customer Segmentation
• Performed customer segmentation on high-volume retail datasets using K-means clustering to identify target groups across various sales channels, including organic, direct, and web-based sales.
Project: Analysis and Reporting
• Provided the business team with insights by analyzing data from multiple sources, including technical and financial KPIs.
• Identified correlations between different business segments through exploratory analysis.
PROJECTS & RESEARCH
Face Recognition System Development
• Developed a deep learning-based face recognition system using TensorFlow and Keras, inspired by FaceNet and DeepFace, to perform biometric identification through 128-dimensional facial embeddings.
• Applied triplet loss for embedding optimization, enhancing model precision in both face verification (1:1) and recognition (1:N) tasks, achieving high reliability in real-world identity authentication scenarios.
• Demonstrated strong proficiency in neural network design, facial feature encoding, and deployment of biometric security applications using Python and advanced ML techniques.
Gen AI: NanoGPT for Song Lyrics Generation
• Fine-tuned nanoGPT, a lightweight GPT-2 implementation, using PyTorch for character-level song lyric generation on the Spotify dataset, leveraging transformer architecture and Byte Pair Encoding (BPE) for improved text coherence and predictive accuracy.
• Designed and implemented data preprocessing pipelines in Python, structured training-validation splits, and monitored model performance through loss reduction and accuracy tracking.
EDUCATION
• Master’s in Data Analytics, Clark University, Worcester, MA
• Bachelor of Technology (B.Tech), Bharati Vidyapeeth Deemed University, Pune, India
CERTIFICATIONS
• Advanced Certificate Programme – Data Science (April 2022), IIIT Bangalore
• Deep Learning Specialization – DeepLearning.AI
• Neural Networks & Deep Learning – DeepLearning.AI
• Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, Optimization – DeepLearning.AI
• Structuring Machine Learning Projects – DeepLearning.AI
• BCG – GenAI Job Simulation
• Accenture – Data Analytics Job Simulation