
Data Analyst

Location:
Chicago, IL
Posted:
July 21, 2025


Resume:

KALP PATEL

********@*****.*** • 408-***-**** • www.linkedin.com/in/drkalppatel/ • https://github.com/Kalp25740

EDUCATION

Indiana University Indianapolis, MS in Health Informatics, GPA: 3.7, May 2025

Maharashtra University of Health Sciences, Bachelor of Dental Surgery, GPA: 3.8, May 2013

TECHNICAL SKILLS

Programming Languages: Python (NumPy, Pandas, Matplotlib, Seaborn, PyTorch, TensorFlow), R, SAS

Data Analysis Techniques: Natural Language Processing (spaCy, NLTK, Transformers), Machine Learning, Statistical Analysis, Time-Series Regression

Data Management Tools: SQL (MySQL), Snowflake, GitHub, NVivo, Qualtrics, REDCap, Microsoft Office (Excel, Outlook)

Data Visualization Tools: Tableau, Power BI, R Shiny

Healthcare Standards: CPT, SNOMED, ICD, LOINC, FHIR, HL7, HIPAA

PROFESSIONAL EXPERIENCE

Graduate Research Data Analyst

Purkayastha Lab of Health Innovations, Indiana University Jan 2024 – Present

• Led data cleaning and preprocessing of 200,000+ patient records, ensuring data quality, temporal alignment, and PHI compliance; assisted with IRB documentation and approvals; performed feature extraction and developed time series predictive models to forecast wound healing trajectories, supporting early identification of high-risk patients for timely clinical intervention.

• Evaluated an AI-powered diagnostic tool in a multi-site study involving 40 clinicians; performed quantitative analysis using Python and R, qualitative analysis with NVivo software, and generated implementation reports to guide clinical decision-making.

• Integrated and preprocessed large-scale datasets from census and state health sources; performed statistical analyses and applied machine learning techniques to identify significant predictors of social determinants of health, translating results into actionable reports and presentations to support public health decision-making.

• Designed and implemented end-to-end data pipelines to merge, clean, and analyze 18M+ clinical records from the MIMIC-IV database; applied statistical modeling and time-series analysis to predict patient outcomes as part of an app aimed at identifying children at high risk of mortality and supporting early clinical intervention.

• Designed and implemented an NLP pipeline to extract key topics from 214 student capstone project reports, leveraging BERT for entity tagging and Gensim with LDA for topic modeling. Achieved a 70% coherence score, identifying prevalent themes at the university and ensuring topic model reliability.

Informatics Administrative Intern

Clinical Architecture May 2024 – Aug 2024

• Standardized and integrated healthcare data from disparate EMR systems to ensure interoperability; conducted data analyses on 200+ operational issues using Microsoft Excel and prepared comprehensive reports highlighting trends and actionable insights to enhance customer experience and drive quality management initiatives.

Senior Dentist

Dental Inn April 2018 – Oct 2022

• Successfully managed dental practice operations and modernized patient record management by integrating a new Electronic Health Records (EHR) system, resulting in improved data accessibility.

• Performed SQL-based abstraction and review of 1,000+ patient records to identify care trends and optimize appointment scheduling, increasing operational efficiency by 20%.

PUBLICATIONS

The Impact of Social Determinants on Cardiovascular Mortality: A Zip Code-Level Analysis in Indiana – Accepted Full Paper, MedInfo2025, International Medical Informatics Association (IMIA), August 2025

ACADEMIC PROJECTS

Healthcare Diabetes Risk Prediction (Python)

• Developed predictive machine learning models for diabetes risk assessment, achieving 98% accuracy, applying advanced data preprocessing and statistical analysis techniques to identify critical predictive factors for early diagnostic intervention.

Statistical Analysis of Factors Associated with Obesity (RStudio)

• Evaluated 2,000 records using RStudio, applying descriptive statistics, regression modeling, and categorical data analysis to identify significant obesity correlations (p < 0.05). Generated comprehensive visualizations to communicate key findings.

Healthcare Diabetes Readmission Analysis (SQL, Excel, Tableau)

• Analyzed 35,523 patient records from 130 US hospitals (1999-2008) using SQL, identifying key factors influencing 30-day readmission for diabetic patients, and created interactive Tableau dashboards to visualize trends and support hospital cost optimization.


