Shuxin Huang
New York, NY • 845-***-**** • ******@********.***
EDUCATION
Columbia University New York, NY
MS in Computer Science, GPA: 3.81 Dec 2024
University of Sydney Sydney, AU
BS in Advanced Computing (First-Class Honors), GPA: 3.94 Jan 2023
TECHNICAL SKILLS
Programming Languages: Python, R, SQL, JavaScript, HTML5/CSS, Java, Node.js, PHP, Ruby
Data Analysis: Scikit-Learn, NumPy, Pandas, Matplotlib, Seaborn, Plotly, Keras, PyTorch
Statistical Analysis: A/B Testing, Hypothesis Testing, Time Series Forecasting
PROFESSIONAL EXPERIENCE
Columbia University Irving Medical Center New York, NY
Research Assistant (Data Engineer) Sep 2024 – Jan 2025
• Enhanced an LLM-based classification system for the Joint Cohort Explorer (JCE), improving attribute categorization accuracy by 30% through domain-specific dataset integration.
• Developed an interactive data visualization dashboard using Django and Plotly-Dash, enabling real-time filtering, drill-down analytics, and interactive cohort exploration for biomedical researchers.
• Optimized classification accuracy by implementing fine-tuning on transformer-based models, few-shot learning, and Retrieval-Augmented Generation (RAG), balancing performance and contextual relevance.
United Nations New York, NY
Software Engineer Intern May 2024 – Nov 2024
• Designed and developed the department’s flagship website, transforming Adobe XD mockups into a fully functional Drupal-based platform using PHP and Twig, enhancing accessibility for 30,000+ monthly visits.
• Developed backend API endpoints in Python with Django and Wagtail, enabling structured content export and improving content accessibility for both external and internal use.
• Automated website migration to AWS with Python scripts, reducing manual work by 80%, and integrated Docker-based application deployment practices within a DevOps framework.
Shanghai Institute of Corporate Culture and Brand Shanghai, CN
Data Analyst Intern Jun 2023 – Aug 2023
• Designed a brand valuation framework using Structural Equation Modeling and Regression Analysis with Scikit-Learn, enhancing accuracy by 14% and generating a ranking for tech SMEs.
• Spearheaded automation of report consolidation and review for over 100 reports using Python, cutting task duration by 87% and reducing operational costs.
• Presented findings to executive leadership, leading to a revised brand strategy for tech SMEs.
PROJECTS
AWS-based Natural Language Search Photo Album
• Developed a scalable AI-driven photo search engine using AWS Lex, OpenSearch, and Rekognition, enabling voice and text-based image retrieval with 96% accuracy.
• Automated image indexing and metadata extraction with AWS Lambda, integrating serverless pipelines to classify and store object, face, and scene data for real-time search in OpenSearch.
• Designed and deployed a RESTful API with API Gateway, optimizing request routing, authentication, and query processing to enable secure and scalable search operations.
Contagious Disease Viral Vulnerability Analysis
• Integrated and analyzed eight large-scale datasets (demographics, health infrastructure, and case trends) using SQL, GeoPandas, Matplotlib, and SciPy.
• Developed a vulnerability scoring model incorporating advanced epidemiological metrics and demographic variables, boosting risk assessment accuracy by 20%.
• Created dynamic geographic heatmaps in GeoPandas to pinpoint high-risk regions, directly informing targeted public health interventions.