Data Science

Location:
New York City, NY
Posted:
April 24, 2025

Resume:

PEI YI (CINDY) LI

+1-929-***-**** ******@********.*** www.linkedin.com/in/pei-yii New York

EDUCATION

Columbia University New York, NY

M.S. in Engineering Aug 2024 – Expected Dec 2025

● Courses: Data Analytics, Optimization Models, Process Improvement, Strategy in Pharmaceuticals, Data Mining

University of British Columbia Vancouver, BC

B.S. in Computer Science and Biology GPA: 3.8/4.0 Sept 2018 – Dec 2022

● Courses: Advanced Relational Databases, Introduction to Artificial Intelligence, Machine Learning, Genetics

SKILLS

Languages: Python (NumPy, pandas, Matplotlib, SymPy, scikit-learn), SQL, Java, R, VBA, MATLAB, Excel

Others: MySQL, MongoDB, PostgreSQL, Tableau, AWS, API usage

WORK EXPERIENCE

Regeneron Pharmaceuticals New York, NY

Data Science Student Intern Jan 2025 - May 2025

● Built end-to-end machine learning models (XGBoost, Random Forest) to estimate project timelines and identify failure risks using structured data (disease condition, trial design, eligibility filters) from ClinicalTrials.gov.

● Applied model interpretability techniques (SHAP) to surface key duration drivers; identified that >12 eligibility criteria, multi-country recruitment, and rare disease trials correlated with prolonged timelines.

● Deployed an interactive Gradio dashboard where users input trial design parameters and receive real-time model predictions, enabling faster feasibility assessment and scenario testing.

YF Technology LTD Vancouver, BC

Data Analyst Aug 2022 - Aug 2023

● Delivered Tableau dashboard reports tracking key performance indicators for TikTok campaigns, supporting client relationships and providing insights used to redesign influencer campaigns for the target audience.

● Created targeted advertising copy informed by Click-Through Rate (CTR) models built in Python; collaborated closely with the marketing team to deliver and execute the models, resulting in a 15% increase in average campaign CTR.

● Implemented a predictive model using machine learning to analyze customer engagement on TikTok ads, partnering with the marketing team to refine audience segmentation and targeting strategies, increasing ad targeting accuracy by 22%.

Fantuan Delivery Vancouver, BC

Software Quality Assurance Analyst Intern Jan 2021 - Jan 2022

● Validated performance of the Fantuan App across Web, iOS, and Android by executing integration test cases, scenarios, and scripts, achieving 100% test coverage.

● Executed over 600 manual and automated test cases for new app features, including exploratory testing for edge cases and unexpected behaviors, identifying and resolving 120 critical bugs before launch; managed test cases with tools such as Jira.

● Constructed test documentation encompassing test cases, test data, and instructions for setting up the test environment, with emphasis on reproducibility and knowledge transfer among team members.

SRK Consulting Vancouver, BC

Data Analyst Intern Jun 2019 - Sep 2019

● Applied Excel pivot tables, charts, and statistical functions to examine collected soil data and identify patterns and trends, reducing data analysis time by 20%.

● Drafted project documentation integrating soil data and its implications, assisted with data cleaning protocols that reduced errors in soil sample data by 40%, and used Python and RStudio to integrate data into technical reports.

PROJECT EXPERIENCE

Tumor AblationPlanner - Memorial Sloan Kettering New York, NY

● Engineered a 3D Slicer module to generate and align ellipsoidal ablation volumes at the applicator tip based on user-defined input (power, duration, applicator type) for lung tumor treatment planning.

● Constructed real-time 3D surface meshes using VTK to visualize predicted ablation zones, enabling interventional radiologists to assess thermal coverage and proximity to critical structures.

● Integrated clinical ablation parameters into a user-facing interface and dynamic rendering pipeline, in collaboration with MSK clinicians and imaging scientists.

Healthcare Insurance Plan Recommender (Python, Next.js, Groq AI) - Devfest 2025 Award Winning

● Developed a full-stack ML-powered insurance plan recommendation system, mapping demographic inputs to optimized health plans using CatBoost (achieved best model performance: R > 0.85).

● Engineered a chatbot interface leveraging Groq and Retrieval-Augmented Generation (RAG) to answer real-time insurance queries with structured and unstructured data retrieval.

● Designed and deployed the backend architecture for fast plan scoring, clustering, and output formatting; collaborated closely with frontend team (Next.js) to ensure seamless UX.


