Minglei Cai
******@**********.***
EDUCATION
Georgetown University Sep 2021 - May 2023
Master of Science, Data Science and Analytics GPA: 3.63/4.0 Washington DC, US
University of Science and Technology of China (THE World #74) Sep 2017 - May 2021
Bachelor of Science, Mathematics (Major), Computer Science (Minor) Hefei, China
SKILLS
Programming: Python, SQL, R, JavaScript/TypeScript, C++, Java, C#, MATLAB
Cloud Computing / ML: Kafka, Spark, Hadoop, Hive, HBase, Torch, Scikit-Learn, TensorFlow, NLTK, spaCy, NumPy
Data Visualization: Tableau, Power BI, D3.js, Matplotlib, Seaborn, Plotly, Altair, ggplot2, R Shiny, Datashader
Frontend / Backend / Database: Angular, Spring, Flask, MySQL, PostgreSQL, MongoDB
CERTIFICATIONS
AWS Certified Cloud Practitioner Sep 2023
AWS Certified Solutions Architect – Associate Oct 2023
WORK EXPERIENCE
Georgetown University Aug 2021 - Dec 2022
Teaching Assistant / IT Support Washington DC, US
Provided maintenance for educational technologies and resolved software/device issues.
Graded homework, conducted Q&A sessions, and addressed coding/math problems raised by students.
Hefei Zhongke Leinao Tech Co. Feb 2020 - Feb 2021
Data Engineer Intern Hefei, China
Implemented real-time data pipelines for sensor data collected during refrigerator testing, using Kafka for data ingestion, Spark for data processing and modeling, and FineBI for data visualization.
Implemented anomaly detection modules for the monitoring machines, increasing the precision by 22%.
Worked closely with infrastructure engineers on protocol integration; optimized and redesigned the database schemas, decreasing workloads by 21% and reducing database storage usage by 31%.
Developed Python Plotly scripts to generate summary reports for the manufacturing production lines.
PHIMA Intelligence Tech Co. Nov 2019 - Jan 2020
Data Engineer Intern Maanshan, China
Optimized ETL processes using Spark, reducing processing time by 35%.
Built real-time video streaming pipelines for surveillance cameras using Alibaba Cloud Link Vision Video and implemented image preprocessing modules using TensorFlow.
Built the quarterly reports from the company data warehouse using Tableau.
COURSE PROJECTS
Big Data – Impact of the Russia-Ukraine Conflict on Commodity Markets Nov 2022
Applied LDA and sentiment analysis to extract topic and sentiment features from over 10 million Reddit comments related to the Russia-Ukraine conflict, using Spark on Azure Databricks.
Built XGBoost, Random Forest, and feed-forward neural network models to predict commodity prices, achieving an R-squared value of over 0.2 for the price of natural gas.
Visualization – What Makes a Great Buffet Restaurant in Florida Apr 2022
Conducted aspect-based sentiment analysis and identified 3 significant factors for positive reviews.
Created interactive plots with D3.js and Plotly to visualize quantitative insights from 1M+ Yelp reviews.
NLP – A Conversational Chatbot Nov 2022
Utilized PyTorch and Transformers to fine-tune DialoGPT (a transformer-based language model).
Deployed the chatbot via the Google Voice API, making it conveniently accessible through messaging for entertainment use.
Generative AI – Social Media AI Assistance Apr 2023
Used Diffusers and Transformers to integrate ChatGPT and Stable Diffusion, constructing pipelines for prompt-based image generation, image captioning, and text generation.
Developed a web application with FastAPI, TypeScript, and Angular to support artistic creation for graphic design and improve the efficiency of mass-producing image assets.