WONHA SHIN
Phone: +1-551-***-**** Email: ******@**.*********.*** Portfolio: leahnote01.github.io/ LinkedIn: linkedin.com/in/wshin7
EDUCATION
University of Rochester Rochester, New York
Master of Data Science GPA 3.43 / 4.0 Aug 2023 – Dec 2024
Pusan National University Busan, South Korea
Bachelor of Arts in International Business and Economics GPA 3.73 / 4.0 (Cum Laude) Mar 2011 – Aug 2018
PROFESSIONAL EXPERIENCE
VOESH New York South Plainfield, New Jersey
Administrative Accounting Specialist, Admin/Accounting Oct 2018 – Mar 2023
● Built a predictive time-series framework in Oracle DB, boosting sales forecasting accuracy by 15% across 10K+ clients. Performed sales trend modeling to drive strategic planning and inventory optimization. Partnered with internal and external cross-functional teams to streamline reporting and support data-driven decisions.
EASTERN AMERICAN CDC (Non-Profit Federal Community Bank) Englewood, New Jersey
Credit Analyst Intern, Credit Assessment May 2018 – Oct 2018
● Conducted quantitative credit analysis for 100+ SBA 504 loan applications, applying statistical and financial modeling techniques to assess repayment capacity and risk profiles. Performed trend and variance analysis on loan performance data, helping identify high-risk applicants and support risk-adjusted funding decisions.
PROJECTS & SKILLS
● LLM-Based Real-Time Translation System with Kafka and MLOps Automation: Designed and built an end-to-end, real-time multilingual translation system as the sole architect and engineer, integrating Kafka (streaming), FastAPI (async microservices), and Hugging Face Transformers for low-latency neural machine translation. Added Prometheus and Grafana dashboards for system health monitoring and W&B for drift detection, retraining, and model versioning. Achieved 0.001 s translation latency and validated scalability by load testing with 10K+ concurrent messages.
● 200 Days Learning Posting Challenge (’24–’25): Completed a 200-day study challenge focused on deep learning and MLOps, covering foundational and advanced topics in ML, AI, and deep learning, from regression, PCA, and decision trees to CNNs, transformers, and LLMs. Explored MLOps practices such as reproducibility, monitoring, and deployment. Documented progress daily with technical write-ups on GitHub and LinkedIn, engaging the broader AI/ML community.
● E-Cigarette Perception Capstone Project (Chief Operations Manager): Directed a full-cycle NLP project analyzing 100K+ multilingual social media posts to uncover public perceptions of e-cigarette use and support public health initiatives. Led team coordination, modeling, MLOps planning, and data governance, applying sentiment analysis and LDA topic modeling to identify health risks, misinformation, and demographic trends. Preparing for publication.
● Microsoft Azure MLOps Pipeline Implementation: Engineered a CI/CD pipeline on Azure using GitHub Actions, Docker, ACR, and ACI to automate the build, test, and deployment of a containerized FastAPI application. Designed and deployed an end-to-end MLOps workflow with Azure ML, integrating model training, versioning, and automated release pipelines for continuous model delivery. Strengthened monitoring of key metrics to support secure, production-ready ML deployments.
● Real-Time Tweet Sentiment Analysis Pipeline: Built a real-time sentiment analysis pipeline on real-world data using PySpark, Databricks, and AWS, streaming tweets from S3, classifying them with a pretrained transformer, and storing results in Delta Lake. Implemented structured streaming with windowed triggers for reliable, low-latency processing.
● Data Mining: Cross-Cultural NLP Analysis of Luxury Hotel Reviews in Europe with LDA Topic Modeling: Led analysis of 515K+ hotel reviews using LDA topic modeling to reveal cultural differences in traveler experiences across Europe. Used scikit-learn pipelines to uncover trends, improving market insights for hospitality and tourism analysis.
● Skills:
Infrastructure & Automation: Ansible (server provisioning & configuration), Docker, Kubernetes, GitHub Actions, Terraform (basic)
Data Engineering & Streaming: Apache Kafka, PySpark, Databricks, Delta Lake, Airflow
MLOps & Monitoring: Azure ML, MLflow, Prometheus, Grafana, Weights & Biases, CI/CD pipelines
ML & AI Frameworks: PyTorch, TensorFlow, Hugging Face Transformers, scikit-learn, Keras
Cloud: AWS (S3, EC2, Lambda, ECR), Microsoft Azure (ACI, ACR, Azure ML)
Databases & Data Management: PostgreSQL, SQL Analytics, ETL pipeline development, data cleaning & integration, Python (NumPy, Pandas, FastAPI), R (RStudio), Git (Version Control)