
Data Analyst

Location:
Lanham, MD
Posted:
January 24, 2020


Resume:

Hanran “Horace” Weng

240-***-**** ● ******.****@*******.***.*** ● www.linkedin.com/in/hanran-weng-353021172

EDUCATION

University of Maryland, Robert H. Smith School of Business College Park, MD, USA

Master of Science in Business Analytics (STEM), GPA: 4.0/4.0 08/2018 – 12/2019

● Courses: Data Mining and Predictive Analytics, Data Processing and Analysis, Big Data and AI, etc.

● Online Data Engineering Bootcamp: Sorting Algorithms, Graph Search Algorithms, Trees, A/B Testing, Recommendation Systems, etc.

South China Normal University, School of Economics and Management Guangzhou, Guangdong, China

Bachelor of Economics, International Business 09/2013 – 06/2018

● Student Minister in Student Union

● Exchange Program at Kansas State University from 08/2015 to 06/2016

TECHNICAL SKILLS

● Data Analysis: Python, R, Tableau, EDA, Google Analytics, Simulation, etc.

● Database Management: RDBMS, SQL, Hive, Pig, Spark, Hadoop.

● Machine Learning Models: Regression Models, K-nearest Neighbors, Clustering, Regularization, Dimensionality Reduction, Random Forest, Ensemble Methods, Deep Learning, NLP Techniques, etc.

WORK EXPERIENCE

Principal Financial Group Maryland, USA

Data Scientist 09/2019 – 12/2019

● Extracted 12 GB of data from a database using SQL and aggregated it into a smaller dataset.

● Identified a periodic pattern (a 6-week cycle) in the appearance of text features via EDA and incorporated it into feature engineering.

● Added lag variables to capture changes in the market-regime label. Used SHAP values to select and combine features, reducing the number of features from 313 to 30 and increasing accuracy by 1.5%.

● Built a pipeline to generate datasets, test models, plot results, and compare performance across combinations of data and models.

University of Maryland, Center for Health Information & Decision Systems (CHIDS) Maryland, USA

Graduate Research Assistant 08/2019 – 12/2019

● Contributed to NLP on clinical notes in a project with Medstar Health. Applied rule-based and machine learning NLP techniques to extract variables and useful information from unstructured data for prediction.

● Presented reviews to Medstar Health to give them an in-depth understanding of current applications of certain NLP techniques in the health care industry and to support the collaboration.

● Accessed data and applied data mining methods to identify the appropriate datasets and variables through exploratory data analysis.

● Built an API for more than 10 Java-based tools used to extract unstructured data from clinical notes, for future implementation.

Ping An Technology Co., LTD. Shanghai, China

Data Scientist 05/2019 – 09/2019

● Built and improved anomaly detection models on server time series data in Python. Improved precision from 87% to 96% and recall from 90% to 95%.

● Detected abnormal data points using basic algorithms and a rule-based approach for specific metrics.

● Visualized the time series data to identify patterns, trends, and the distribution of each metric for intuitive understanding.

● Investigated new algorithms and tested conceptual models for an AI interview technique by combining quantum physics concepts with NLP techniques. The model reached 82% accuracy on 30 TB of data and shortened classification time from 10 to 6 seconds.

HuaAn Funds Co., Ltd. Guangzhou, China

Data Analyst 06/2017 – 12/2017

● Analyzed sales statistics and fed results back to the relevant departments; took charge of building and maintaining a 3,000-customer information management database via SQL.

● Scraped data on more than 10 companies from public documents each week and uploaded it to the system for analysis.

● Processed data via Python and SQL to obtain sales information, evaluated the risk and feasibility of projects, and wrote feasibility reports that raised total sales in November by 1.13%.

PROJECT EXPERIENCE

Washington Metropolitan Area Transit Authority—Metro Delay Time Prediction (via Python & Machine Learning)

● As leader of a four-member team, developed a robust model that shortened the predicted delay time from 5 minutes to 1.5 minutes.

● Scraped and organized data from 1000+ websites, covering various types of accidents and their corresponding details, for analysis.

● Tuned the parameters of different models, incorporating data collected from the weather service and other bureaus, to lower MAE.

"One Step to UMD"—Building RDBMS for House Agency (via SQL and Tableau)

● Collected user stories and requirements, designed and developed the ER diagram and schemas, built a customer information management system on Microsoft SQL Server for a startup housing agency within an 8-week window, and presented the product to a group of 30+.

● Helped tenants search for housing and find roommates with common preferences and needs via SQL.

● Visualized the matching results via Tableau to give the housing agency a more intuitive understanding of the implementation.


