KRITI KUMARI
+1-704-***-**** *********@*****.*** www.linkedin.com/in/kriti-kumari-connectwithkriti github.com/kritik618
Experience
Data Scientist June 2024 – Present
Rebecca Everlene Trust Company Chicago, IL
• Engineered personalized recommendation algorithms using machine learning and statistical modeling to deliver tailored relocation suggestions, improving user engagement and satisfaction.
• Leveraged GPT-4-driven prompt engineering to generate summaries from user feedback, enriching recommendation logic and enhancing interpretability.
• Conducted EDA and feature engineering on cost-of-living and academic datasets, incorporating generative text augmentation to support behavioral segmentation.
• Applied Hugging Face models for text classification to categorize qualitative inputs and utilized LangChain to extract key insights from unstructured location-based documents.
• Streamlined data preparation through automated ETL pipelines to ensure timely, consistent inputs for modeling workflows.
• Presented model-driven insights via interactive Power BI dashboards enriched with GPT-generated summaries, enabling data-informed decision-making for non-technical stakeholders.
Data Scientist November 2021 – July 2022
Tata Consultancy Services Limited India
• Built automated workflows in Alteryx to ingest, clean, and unify multi-channel campaign data (email, SMS, push, web) for a global retail client; performed exploratory analysis in Python to map customer journeys and identify drop-off points.
• Engineered behavioral features such as channel order, engagement depth, and conversion delay to support predictive modeling and attribution analysis.
• Trained and evaluated multiple models (Logistic Regression, Random Forest, XGBoost) to predict conversion likelihood; XGBoost delivered the best performance, with an F1-score of 0.81.
• Built a time-decay attribution model to measure the influence of each channel based on interaction recency and sequence, enabling accurate channel performance assessment.
• Visualized results in Power BI dashboards and collaborated with the client’s marketing and CRM teams, leading to a 14% increase in conversion rate and a 7% drop in acquisition cost.
Associate Data Scientist March 2020 – October 2021
Tata Consultancy Services Limited India
• Worked with large-scale banking transaction datasets to develop supervised fraud detection models and apply unsupervised segmentation for customer profiling.
• Engineered behavioral features such as transaction frequency, spend velocity, recency, and active hours to support both classification and clustering.
• Built fraud detection models using logistic regression and decision trees, reducing false positives by 12%; segmented customers using K-Means and DBSCAN to support personalized targeting.
• Evaluated model performance using precision, recall, and AUC-ROC; collaborated with domain experts and visualized insights through Power BI dashboards for risk and marketing teams.
Data Analyst March 2018 – February 2020
Tata Consultancy Services Limited India
• Analyzed large volumes of patient and prescription data using MySQL, identifying trends in medication adherence, refill behavior, and cost variability across regions.
• Engineered features such as refill frequency, treatment consistency, and monthly cost patterns to support reporting and stakeholder insights.
• Optimized SQL queries using indexed JOINs and early filters, reducing query runtime by 15% and dashboard refresh time by 12%.
• Delivered timely reporting of key healthcare metrics, enabling care teams to act faster on adherence, refill compliance, and prescription cost insights.
• Developed interactive Power BI dashboards and collaborated with domain experts to ensure clinical and compliance alignment.
Technical Skills
Programming Languages & Libraries: Python (NumPy, Pandas, SciPy, Seaborn, Matplotlib, Scikit-learn, Statsmodels), SQL (Joins, Window Functions, Indexing, Partitioning)
Machine Learning Techniques: EDA, Data Cleaning, Feature Engineering (missing values, outlier removal, transformations), Regression (Linear, Polynomial), Classification (Logistic, Decision Trees, KNN), Clustering (K-Means, DBSCAN, Hierarchical), Ensemble Models (Random Forest, XGBoost), Time Series (ARIMA, SARIMA, Holt-Winters)
GenAI & LLM Tools: GPT-3.5/4, Prompt Engineering, LangChain, Hugging Face Transformers, Semantic Search (FAISS), Retrieval-Augmented Generation (RAG), Document Q&A
Data Engineering & ETL Tools: Alteryx, Azure Data Factory (ADF), Azure Databricks, ADLS, SSIS, Snowflake
Databases: MySQL, Oracle, MongoDB (NoSQL), Snowflake (Cloud DWH)
Visualization & Reporting: Power BI (DAX, dashboards, trend analysis), SSRS, SSAS, Excel (Pivot Tables, Advanced Formulas), PowerPoint, Seaborn, Matplotlib
Project Management & Collaboration: Jira (task tracking, sprint planning), Confluence (project documentation), Agile methodology
Core Competencies: A/B Testing, Hypothesis Testing, RCA, Time Series Forecasting, Predictive Modeling, Statistical Analysis, CLV Modeling, Churn Prediction, Fraud Detection, Market Basket Analysis, Customer Segmentation, Behavioral Analytics, Campaign Optimization, Financial Forecasting, Patient Flow Prediction, Load Forecasting, Dashboard Creation, Data Storytelling, Data Pipeline Optimization, Scalable Model Deployment, Real-Time Analytics, Model Monitoring, Cloud-Based Processing
Education
MS in Engineering Management 2024
University of North Carolina at Charlotte Charlotte, NC
BS in Electronics and Communication 2017
Rajiv Gandhi Proudyogiki Vishwavidyalaya Bhopal, India
Projects
• Load Forecasting Using Temperature Data: Forecasted hourly electricity load using a multiple regression model with features like temperature, day of the week, holiday flag, and a trend index to capture time-based patterns. Used pandas for data processing, statsmodels for modeling, and matplotlib for plotting. Evaluated model performance using RMSE and MAPE, and visualized actual vs predicted values to assess accuracy.
• Patient Arrivals Forecasting in Emergency Department: Predicted daily patient arrivals across five severity levels (ES1–ES5) using models like SARIMA, ARIMA, Holt-Winters, Multiple Linear Regression, and Support Vector Regression. Each level was modeled separately, followed by a combined forecast to help hospitals plan staffing and resources. Used holiday and day-of-week as features, evaluated results using RMSE and MAPE, and visualized actual vs predicted values through plots.
• NBA Player Role Evolution – Position-Based vs. Positionless: Analyzed NBA player stats from 2004–2024 across positions using K-Means clustering to assess the shift toward positionless basketball. Conducted PCA for dimensionality reduction, standardized key metrics (e.g., PTS, AST, BLK, REB), and visualized cluster patterns per year to track trait convergence using seaborn and matplotlib.
• Temporal and Spatial Analysis of E-Scooter Trips in Charlotte: Explored scooter usage patterns by day type, provider, and trip distance class using pandas, geopandas, and KDE. Overlaid trip data on GeoJSON maps of the LYNX Blue Line and bus stops. Engineered distance bands, aggregated trip duration/speed, and used spatial heatmaps to reveal commuter vs recreational behavior trends.