Machine Learning Data Engineer

Location:
Ithaca, NY
Posted:
April 07, 2025

Resume:

Ziyang Li

Phone: 347-***-**** • Email: ****************@*****.*** • LinkedIn URL: https://www.linkedin.com/in/ziyangli7181/

EDUCATION

Cornell University, College of Engineering, Ithaca, NY

M.Eng. in Systems Engineering

• School years: August 2024 – May 2025

Syracuse University, School of Information Studies, Syracuse, NY

B.S. in Applied Data Analytics

• School years: August 2020 – May 2024

• GPA: 3.511

SKILLS

• Programming Languages: Python, R, SQL

• Frameworks: NumPy, pandas, Matplotlib, PySpark, scikit-learn, PyTorch, NLTK, Pyomo, gurobipy, Selenium

• Software: Microsoft Word, PowerPoint, Excel, Wireshark, Adobe Illustrator, Google Drive, Lucidchart, Vertabelo, Microsoft Project, SysML

• Relevant Courses: Calculus, Statistics, Linear Algebra, Data Visualization, Data Mining, Database Management, Machine Learning, Deep Learning, Computer Vision, Computational Optimization

• Certifications: CITI Human Subjects Research Certification, INCOSE Associate Systems Engineering Professional (ASEP)

EXPERIENCE

CoScribe – Data Engineer Intern

Manhattan, NY • 07/2024 – 08/2024

• Used Selenium for data mining.

• Analyzed TikTok content creators using an LLM.

• Used an LLM (GPT-4o) to generate social media content.

DATAMIMO LLC – Machine Learning/Data Science Intern

Palo Alto, CA • 09/2023 – 10/2023

• Conducted data visualization analysis for Airbnb's short-term rental projects in Hawaii and performed sentiment analysis on Airbnb reviews using logistic regression.

• Leveraged machine learning methodologies to develop and validate a predictive model for assessing property values and determining optimal pricing strategies.

• Communicated model development and performance through comprehensive written reports and oral presentations to a senior manager.

National Science Foundation & School of Information Studies, Syracuse University – NSF-Funded Research Fellow

Syracuse, NY • 05/2023 – 07/2023

• Analyzed biomedical research conclusions using machine learning, deep learning, and other NLP techniques.

• Contributed to the fine-tuning of GPT-3.5 through advanced prompt engineering techniques, optimizing its capacity to detect actionable insights from biomedical research data.

Hogwarts Capital – Data Analyst Intern

Garden City, NY • 06/2022 – 08/2022

• Collected and analyzed financial statements of over 700 US-listed companies using Python and Excel.

• Analyzed the revenue and value of real estate projects.

• Researched Times Square billboard operations, revenue, and market conditions.

• Carried out initial planning and concept research for a metaverse project.

Spring Glory International Inc. – Financial Analyst Intern

Manhattan, NY • 07/2021 – 08/2021

• Facilitated the translation, proofreading, and editing of Initial Public Offering (IPO) prospectuses for Chinese firms seeking to list on US exchanges, ensuring linguistic accuracy and compliance with financial industry terminology and regulations.

• Conducted a comprehensive review of IPO prospectuses, actively identifying and rectifying errors in content, financial disclosures, and formatting to guarantee the production of polished and legally compliant documents.

• Used Excel to organize the IPO companies’ financial statements and calculate annual revenue growth and cash flow growth.

ACADEMIC PROJECTS

Detecting actionable recommendations from biomedical research conclusions – July 2023

• Tested the ability of GPT-3.5 to identify actionable proposals.

• Examined the accuracy of NLP methods (SVM, BERT) for detecting actionable recommendations from health research conclusions.

Analysis of Key Indicators of Heart Disease – May 2023

• Used machine learning models (Logistic Regression, Decision Tree, and Random Forest) to detect heart disease.

Analysis of DOHMH Childcare Center inspections data – December 2022

• Used Python to preprocess and analyze NYC childcare center violation status data extracted from New York City Department of Health & Mental Hygiene open data.

• Visualized data and analyzed changes in childcare center violations in New York City.

Research on the probability of winning in e-sports competitions – December 2022

• Used R for data cleaning.

• Used machine learning models (Decision Tree and Naive Bayes) and association rule mining in R to predict the probability of winning from e-sports competition data.

Data visualization of 2010-2016 NYC school safety – April 2022

• Preprocessed and analyzed reports from the New York City Department of Education open database using R.

• Used R to visualize changes in crime status in NYC public schools and edited the graphs in Adobe Illustrator.


