SHRINIDHI RAJESH, MS, GStat
Statistics | Data Science | Machine Learning
Results-driven Data Scientist with a Master's in Applied Statistics and GStat accreditation, transforming complex data into critical insights. Demonstrated expertise in delivering high-impact solutions that drive organizational performance and innovation. Committed to pushing analytical boundaries and solving critical business challenges.
EDUCATIONAL QUALIFICATIONS
Master of Science in Applied Statistics DePaul University January 2023 - November 2024 Chicago
Minor in Biostatistics and Data Science – GPA: 4.0/4.0 Graduated with Distinction
Bachelor of Technology in Bioinformatics SASTRA University June 2013 - May 2017 Thanjavur
EXPERIENCES
Data Scientist One Tail at a Time January 2025 – Present Chicago
• Redesign data infrastructure with a streamlined pipeline (Amazon S3, PostgreSQL, KNIME, Figma), replacing manual Excel tasks and establishing a single source of truth for efficient data management and modeling.
• Build predictive models (Regression, Clustering, Time Series, Survival Analysis) to forecast adoption success and flag at-risk cases, enabling better placements and timely interventions.
• Develop a volunteer analytics framework using statistical methods and activity-based tiering, identifying retention patterns to optimize onboarding, recognition, and scheduling.
• Apply NLP to open-ended responses from volunteer surveys, adoption applications, and return forms, uncovering recurring themes that shape adopter experience and strengthen engagement.
• Create Power BI dashboards to visualize trends in adoption and volunteer metrics, providing actionable insights and trustworthy reporting tools, shifting organizational culture towards proactive, insight-driven decision-making.
Statistician DePaul University January 2023 – April 2025 Chicago
• Delivered actionable insights across 17 data projects over 2.5 years, leveraging R, SAS and Python to streamline analytical workflows and accelerate outcomes that shaped strategy, funding priorities, and operational planning.
• Applied linear mixed-effects models and other statistical techniques to assess nurses' intubation performance, leading to updated airway management protocols in critical care, boosting patient outcomes and staff efficiency.
• Formulated and evaluated 125+ predictive models to assess financial instrument attributes, refining audit risk assessment procedures to yield accurate forecasts.
• Performed factor analysis, regression, and sentiment analysis on mental health survey data, identifying key drivers of student experience, which led to a 10% increase in student satisfaction through targeted programs.
• Translated complex statistical findings into clear reports and visualizations, helping clients grasp data insights approximately 30% more effectively and supporting more informed strategic decision-making.
Technical Lead IBM February 2018 - August 2021 Bangalore
• Led the development and automation of data solutions for Enterprise Compliance Management, elevating malware detection and reducing configuration drift risks by 40% across APAC, strengthening security and operational resilience.
• Built internal tools with Python, SQL, Perl, and HTML to automate routine tasks, enhancing system security and reducing manual workload.
• Streamlined CI/CD pipelines with Ansible, driving a 25% increase in data ingestion and model training by automating and managing infrastructure.
• Achieved 23% faster data retrieval and higher reporting accuracy by designing SQL schemas, tuning queries, and implementing stored procedures.
• Facilitated collaboration with teams across North America and Europe to advance audit compliance tracking, resulting in a 12% gain in compliance scores through better team coordination and process alignment.
• Managed end-to-end data migration to IBM Cloud and integrated Git with Travis CI for automated deployment, accelerating delivery timelines and minimizing errors.
• Maintained 100% uptime for enterprise frameworks during the IBM-Kyndryl split, ensuring uninterrupted service delivery during the critical organizational transition.
• Optimized DevOps workflows using Docker and advanced preprocessing techniques, improving application security, accelerating deployment, and enhancing compliance tracking.
312-***-****
************@*****.***
www.linkedin.com/in/shrinirajesh
www.shrinirajesh.com
PROJECTS
An Advanced RAG System for Sustainable Technologies September 2024 - December 2024
• Constructed a Retrieval-Augmented Generation (RAG) system to extract tactical insights from 20,000+ clean technology news articles, expanding access to sustainability intelligence.
• Implemented NLP techniques including data preprocessing, spaCy-based tokenization, and semantic analysis, and integrated OpenAI GPT-3.5 Turbo with LangChain to sharpen document understanding.
• Ensured high-quality summarization and information retrieval for clean tech insights by evaluating system performance with quantitative metrics, including RAGAS, ROUGE, perplexity, and BERTScore.
A Data-Driven Population-Based Targeted Intervention for Diabetes April 2024 - June 2024
• Focused on early diabetes detection and prevention using a population health approach on 250,000+ patient records.
• Trained ML models in scikit-learn and used Pandas to process data, build risk dashboards, and align outputs with CDC guidelines.
• Enabled timely identification of at-risk individuals and supported program planning through measurable gains in early detection and community-level intervention outcomes.
SKILLS
Programming Languages:
• Python, R, SAS, SQL, Perl, HTML
Libraries & Frameworks:
• Statistical & Data Science: numpy, pandas, seaborn, matplotlib, scipy, statsmodels, caret, MASS, rjags, glm, tidyverse
• Machine Learning & AI: scikit-learn, PyTorch, spaCy
• Time Series: forecast, prophet
• Survival Analysis: survival, lifelines
• Mixed Effects Models: nlme, lme4
Techniques & Methodologies:
• Predictive Modeling: Linear Regression, Multivariate Regression, Generalized Linear Models, Classification, Time Series and Forecasting, Bayesian Inference, Ensemble Methods
• Data Mining: Data Preprocessing, Principal Component Analysis, Feature Engineering, Clustering
• Experimentation and Evaluation: Randomized Controlled Trials, A/B Testing, Cross Validation
• Biostatistics: Survival Analysis, Drug Design & Clinical Research, Epidemiology, Genomic Data Science
• Machine Learning & AI: Deep Learning, Natural Language Processing (NLP), Generative AI (GenAI), Large Language Models (LLMs)
Tools & Software:
• Programming & Analysis: Jupyter Notebook, Anaconda
• Reporting and Visualization: Power BI, RShiny
• Cloud Services: AWS, Azure Machine Learning Studio, Azure AI
• Version Control: GitHub
• API Integration: Postman
Databases & Data Management:
• PostgreSQL, IBM DB2
CERTIFICATIONS
• NVIDIA Certified Associate: Generative AI and LLMs March 2025
• Graduate Statistician (GStat) Accreditation American Statistical Association (ASA) January 2025
• Azure AI Fundamentals Microsoft December 2024
• Data Science Professional DataCamp August 2024
• AI in Healthcare Stanford Online July 2024
HONORS AND AWARDS
• Mathematical Scholarship Award DePaul 2023, 2024
• SPARKS Award IBM August 2020
• Delivery Excellence Award IBM August 2021
• Manager’s Choice Award IBM 2019, 2020, 2021