JONATHAN (JENG-SHIUAN) MA
Tel: 267-***-****
Email: ad3cyl@r.postjobfree.com
GitHub: data-science-portfolio
Education
Harrisburg University of Science and Technology, Harrisburg, PA
M.S. in Data Analytics, Expected graduation: 04/2025
National Sun Yat-Sen University, Kaohsiung, R.O.C (Taiwan)
Bachelor of Business Administration, Degree completion: 06/2022
Professional Experience
Harrisburg University Center for Innovation & Entrepreneurship, Harrisburg, PA
Machine Learning Engineer Intern, 07/2023 - Present
AI teaching assistant chatbot development and deployment
Designed and deployed a retrieval-augmented generation (RAG) chatbot that serves as an AI teaching assistant for elementary school students in a metaverse environment.
Led the development of an ETL pipeline with periodic automated triggers for the Word2Vec embedding model, converting teaching-material documents into vectors.
Improved the Llama-2 model's performance in the elementary school education domain through parameter-efficient fine-tuning (PEFT), addressing the challenge of limited data.
Implemented an intent recognition model with BERT embeddings to classify user prompts, and integrated a Text-to-Speech (TTS) model into the chatbot for a more immersive user experience in the metaverse.
Applied MLOps practices to the pipeline for CI/CD, integrated Amazon SageMaker Model Monitor for performance monitoring, and ran a canary deployment to test the new chatbot.
National Sun Yat-Sen University, Kaohsiung, R.O.C (Taiwan)
Data Analyst Intern, 03/2020 - 12/2021
User experience analysis of the school course registration system
Designed survey questionnaires using SurveyMonkey to collect user experience data while ensuring data quality and minimizing bias.
Utilized conjoint analysis to understand consumer preferences for course registration system features.
Collaborated closely with the development team to enhance the course registration system based on data-driven insights.
STP analysis of school anniversary souvenirs marketing strategy
Performed exploratory data analysis (EDA) on historical data to understand the distribution of sales revenue.
Performed customer segmentation using clustering methods to enable targeted strategies.
Built Power BI dashboards and presented the insights to non-technical stakeholders.
Library database management and data analysis
Developed an ER model for the new library database and normalized its schema.
Migrated data from the old database to the new one while ensuring data integrity.
Performed hypothesis testing and utilized data visualization to support the book-purchasing strategy.
Projects, 07/2022 - 07/2023
Project: AI Music Generation
Implemented sequence-to-sequence data processing and trained an LSTM neural network to generate chorales.
Explored music composition with WaveNet as a generative AI application.
Project: Dog Breed Image Classification
Trained an Xception-based image classification model with a testing accuracy of 84%. Used transfer learning and data augmentation to prevent overfitting on a limited number of images.
Deployed the model on GitHub using Streamlit, making it accessible to a wide audience.
Project: NBA 2023 Champion Prediction
Performed web scraping from the NBA website and preprocessed data to a structured format.
Conducted feature engineering and trained an XGBoost regression model to predict the NBA 2023 champion, correctly predicting the Denver Nuggets' championship.
T-brain AI Competition
Achieved a top-10 placement out of 1,000 teams in a Taiwan real estate price prediction machine-learning competition.
Skills
• Programming Languages: Python, R, SQL, JavaScript
• Machine Learning Libraries & Frameworks: Scikit-Learn, PyTorch, TensorFlow, Amazon SageMaker, Apache Spark, Docker
• Data Visualization Tools: Tableau, Power BI, Excel