
Data Analyst

Location:
Berkeley, CA
Posted:
December 05, 2024


Resume:

Sizhe (Thea) Tang

510-***-**** *********@********.*** www.linkedin.com/in/sizhe-tang

Berkeley, CA 94704

EDUCATION

University of California, Berkeley IEOR Department, Master of Analytics Berkeley, CA Aug 2024-May 2025

Courses: Machine Learning and Deep Learning, Optimization Analytics, Stochastic Optimization for Machine Learning

Central University of Finance and Economics Bachelor of Science, Applied Statistics Beijing, China Sep 2020-Jun 2024

GPA: 3.97/4.00 School Ranking: 1/172

Courses: Multivariate Statistical Analysis, Numerical Analysis, Stochastic Process, Statistical Computation, Time Series Analysis

Programming & Skills: Python, R, SQL, Gephi, SPSS, LaTeX, Excel, Statistical Modeling, Machine Learning, Optimization, A/B Testing

PROFESSIONAL EXPERIENCE

Shanghai Mingshi Private Fund Management Co., Ltd. Quantitative Research Intern Mar 2024-Jun 2024

Optimized portfolio allocation weights with a quadratic objective function based on expected stock returns, subject to constraints on risk, liquidity, style exposure and turnover rate, using Lagrange multipliers and PyTorch

Utilized the conjugate gradient algorithm in PyTorch to compute portfolio allocation weights for real-time high-frequency stock trading, achieving a 34% improvement in computational efficiency

Guosen Securities Co., Ltd. Quantitative Research Summer Intern May 2023-Aug 2023

Developed stock selection strategies using sentiment scores from over 1.8 million research summaries, mining 12 factors for investment strategy with an IC of 6.7% and an average return of 21%, using a fine-tuned BERT model and a half-life method

Extracted major event labels from company announcements and identified significant events based on closing prices; calculated time decay coefficients, mining 42 investment strategy factors with an average return of 17%, using Matplotlib and NumPy

Deloitte Consulting Co., Ltd. Data Scientist Summer Intern May 2022-Jul 2022

Issuing agency impact score: Used publication network analysis to stratify and score agencies based on eigenvector centrality

Policy content classification model: Fine-tuned BERT model for label classification tasks using predefined policy content labels, optimized the model performance on labeled policy data, achieving a validation accuracy of 83%

Industry impact identification: Identified affected industries by fuzzy matching and adjusted impact across CICS levels in Python

Tencent Technology Co., Ltd. Data Analyst Intern Jan 2022-Mar 2022

User churn risk analysis: Utilized user access log data and attribute data for comprehensive DAU retention analysis, evaluated new user retention risk with Kaplan-Meier estimation in Python, and visualized stacked retention charts and churn curves in Matplotlib

User retention impact attribution: Used the Cox proportional hazards model to assess the impact and risk ratios of various factors on new user retention time; developed strategies based on the analysis, resulting in a 32% increase in the new user retention rate

PUBLISHED RESEARCH & PROJECT EXPERIENCE

Technology Convergence Network Analysis and Moderation Modeling Second Author, Under Review at SSCI/SCI Journal

Constructed technology convergence network and visualized its dynamic evolution by mapping patent data to company locations

Calculated the eigenvector centrality and degree centrality for intra-domain and inter-domain technology convergence densities

Built a regression model to evaluate the influence of both densities and the moderating effect of digital technology

Infrastructure Investment Network Analysis and Uncertainty Index Modeling Second Author, Published in CSSCI Journal

Constructed a collaborative network using policies issued jointly by central and local infrastructure investment authorities

Selected 45 indicators from 5 areas and assigned expert-based weights to create an infrastructure investment policy uncertainty index

Big Data Mining and Sentiment Analysis of ChatGPT-themed Tweets

Crawled 1 million ChatGPT-related tweets, revealing trends and user behavior patterns through time series and user behavior analysis

Selected the optimal number of LDA topics and iterations by coherence scores, mined 10 topics and visualized them as word clouds

Compared VADER and RoBERTa on polarity and computational efficiency; performed sentiment analysis of the 10 topics with RoBERTa

ESG Index Evaluation System with Regression and SVM Models Co-first Author, Published in EI Conference

2022 Mathematical Contest in Modeling Finalist (Top 1.9%)

3th National College Student Mathematics Competition Third Prize

2th China Undergraduate Mathematical Contest in Modeling Third Prize

LEADERSHIP

Analytics Consulting Organization at Berkeley Vice President of Client Relationship

Reached out to 500 potential clients and Berkeley alumni to provide additional job opportunities for Master of Analytics students

President of Student Union and Vice President of Student Associations, CUFE

Led the Student Union in organizing 10 campus-wide events, oversaw student committees and managed 60 student associations


