Fengfeng Sun
***** ******** ****** *** ***, Bethesda, MD 20814
951-***-**** *********@*****.***
EDUCATION
2023.07-2024.10, Arizona State University Tempe, AZ, USA
Master’s degree in Computer Science. GPA: 4.0
2021.09-2021.12, University of California Riverside Riverside, CA, USA
Undergraduate course - CS 141 Intermediate Data Structures and Algorithms, grade
2011.09-2013.06, Soochow University Suzhou, China
Master’s degree in Textile Engineering. GPA:3.6
2007.09-2011.06, Soochow University Suzhou, China
Bachelor’s degree in Apparel Design and Engineering. GPA:3.49
CERTIFICATIONS
• Google Advanced Data Analytics, May 2025
• Big Data Professional Certification, Arizona State University, Mar 2024
• Software Engineering Professional, Arizona State University, May 2022
SKILLS
Programming & Analysis: SQL, Python, Java, C++, JavaScript, Scala
Libraries & Tools: NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn, TensorFlow, PyTorch, Apache Spark
Data Visualization: Tableau, Power BI
Web Development: HTML, CSS
PROJECTS & TRAINING EXPERIENCE
Taxi Cab Fare Prediction
Tools: Python (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn), Tableau
• Performed exploratory data analysis (EDA) on 20,000+ taxi fare records to uncover trends, anomalies, and fare-related patterns.
• Built and evaluated a linear regression model to predict taxi fares.
Classifying a TikTok Video as a Claim or Opinion
Tools: Python (NumPy, Pandas, Scikit-learn, Matplotlib, Seaborn), Tableau, XGBoost
• Conducted exploratory data analysis (EDA) on 20,000 TikTok video listings to uncover key patterns and insights.
• Built and evaluated Random Forest and XGBoost models to predict whether a TikTok video is a claim or an opinion, comparing performance to identify the more accurate predictor.
Creating a Database Using Microsoft SQL Server
Tools: Microsoft SQL Server
• Designed and implemented a relational database in Microsoft SQL Server to store and manage data from six CSV files.
• Performed data cleaning and standardization to ensure consistency and integrity across related tables.
• Created SQL views to join and present related tables, enabling streamlined access for future data analysis and reporting.
Hot Spot Analysis on NYC Taxi Trip Data
Tools: Apache Spark, Scala, Spark SQL
• Configured the Apache Spark environment and developed Scala code using Spark SQL to interact with a relational database.
• Executed complex queries to extract spatial and temporal data from the NYC Taxi Trip dataset.
• Calculated the "hotness" of geographic rectangles by computing the Getis-Ord Gi* statistic.
• Identified the top 50 spatial hot spots based on descending G-scores to detect areas of high taxi activity.