Tiantian Li, Ph.D.
352-***-**** *************@*****.*** LinkedIn Google Scholar
SUMMARY
Data Scientist with expertise in machine learning and deep learning implementation, design of experiments, regression analysis, anomaly detection, and time series analysis, applied to natural language processing (NLP), computer vision (CV)/image processing, marketing science, geospatial data analysis and mapping, and chemical and materials science. Working knowledge of data warehousing, data migration, and data visualization. Excels in team-oriented collaboration and innovative thinking. Seeking a data scientist role in tech-oriented industries.
TECHNICAL SKILLS
ML & Statistical Analysis Skills: Deep Learning Models (CNNs, RNNs, LSTMs, and Transformers), Natural Language Processing (NLP), Computer Vision/Image Processing, Reinforcement Learning, Recommender Systems, Regression Models, Decision Tree Models, Clustering Models, Time Series Models, Bayesian Statistics, A/B Testing, Data Visualization, Exploratory Data Analysis
Programming Languages: Python (NumPy, PyTorch, TensorFlow, Keras, Scikit-learn, Matplotlib, OpenCV), R, SQL, JavaScript, HTML
Big Data Platforms: PySpark, Hadoop, Hive, MySQL, Google Cloud Platform (GCP)
Statistical and Visualization Tools: Tableau, Looker, Power BI, Advanced Excel, RStudio, Minitab
Geospatial Software: ArcGIS Desktop, ArcGIS Pro, eCognition, ENVI, and QGIS
Online Courses and Certificates: IBM Data Scientist; Python for Data Science and AI; Databases and SQL for Data Science
PROFESSIONAL EXPERIENCE
Data Scientist Machine Learning Researcher
Freelance, Groton, CT [ NLP Recommender Systems Deep Learning ] May 2023-Present
Sentiment Analysis for Marketing via NLP [ Word2Vec SMOTE DistilBERT ]
• Built a data preprocessing pipeline that converted reviews to embeddings with Word2Vec and resampled the imbalanced datasets with SMOTE from the imbalanced-learn package.
• Analyzed the reviews with DistilBERT, a state-of-the-art deep learning model, using PyTorch, and tuned the model to fit the problem area and perform well on the validation data.
• Evaluated DistilBERT on the downstream task of predicting positive sentiment, achieving 66% recall, 60% precision, and a 62% F1 score, using scikit-learn.
Recommender Systems Implementation [ BPRMF ItemKNN DeepFM LightGCN ]
• Investigated and evaluated ML & DL algorithms on Amazon datasets containing 255,404 users and items for recommendation.
• Implemented the BPRMF and ItemKNN algorithms in PyTorch and optimized them in comparison with two other recommender algorithms (DeepFM, LightGCN), gaining improvements of 17% in recall and 11% in Normalized Discounted Cumulative Gain (NDCG). Tuned hyperparameters with 10-fold cross-validation on recall and NDCG.
Data Scientist Data Analyst
Avison Young, Boca Raton, FL [ NLP Data Engineering Tableau Marketing Research & CRM ] Oct. 2022-Apr. 2023
• Developed a website chatbot using JavaScript within the HubSpot platform to boost customer interaction and engagement.
• Implemented an automated ETL pipeline for textual data, streamlining data cleaning through NLP methodologies, such as tokenization, stopword removal, stemming, and spell checking, with Pandas, Regex and NLTK libraries.
• Conducted a marketing analytics project involving customer data analysis and a Tableau-based dashboard, equipping the marketing/CRM team with tools for data-driven decision-making.
Data Scientist Research Associate
Florida Atlantic University, Boca Raton, FL [ Machine/Deep Learning Geo-information & Image processing ] Feb. 2020-Oct. 2022
• Performed comprehensive data migration, preprocessing, and analysis across multiple data modalities, including satellite digital imagery, LiDAR 3D point data, and time series sensor data.
• Built machine learning (SVM, Random Forest, Regression) and deep learning (CNN, RNN) models, achieving R-squared values above 0.8 for greenhouse gas emission prediction, MAE below 1.22 feet for groundwater table elevation estimation, and classification accuracy above 80.5% for vegetation classification.
• Visualized the scientific results as map products and contributed to scientific publications and reports.
Data Scientist Research Assistant
University of Florida, Gainesville, FL [ Design of Experiment Statistical Analysis Data Visualization ] Aug. 2016-Nov. 2019
• Designed experiments and performed statistical analyses to support the fabrication of biobased sorbents.
• Utilized statistical analysis (ANOVA, regression, t-tests) to evaluate sorbent performance and optimize fabrication processes.
• Applied descriptive statistics for the analysis, interpretation, and visual presentation of experimental results through charts, graphs, and diagrams using Python Pandas and Matplotlib libraries.
EDUCATION
University of Florida Gainesville, FL Aug. 2016-May 2020
Ph.D. in Applied Sciences Machine Learning GPA: 3.5/4.0
Awards: Grinter Award (2016, 2017, 2018)
University of Florida Gainesville, FL Aug. 2013-Aug. 2015
M.S. in Applied Sciences Data Science GPA: 3.5/4.0
Awards: IFAS/CALS Graduate Student Travel Grant (2014); Graduate Student Council (GSC) Travel Award (2014)
Northwest University Xi’an, China Sep. 2009-Jul. 2013
B.S. in Applied Sciences Statistics GIS GPA: 3.5/4.0