
Data Analyst

Location:
Idaho Falls, ID
Posted:
March 31, 2023


Resume:

Yalei Tang

Idaho Falls, ID 402-***-**** ******@*******.***.*** LinkedIn Google Scholar

Education

Master of Mathematics-Statistics GPA: 3.93/4 Sep 2019 – Aug 2021 University of Nebraska-Omaha Omaha, NE

Ph.D. Candidate in Structural Engineering GPA: 3.83/4 Jul 2023 (expected) University of Nebraska-Lincoln Lincoln, NE

Skills

Data visualization: Power BI, Tableau

Programming Languages: MySQL, Python, MATLAB, R, C/C++

Environment: Azure DevOps, Visual Studio

Courses: Data Science, Data Visualization, Machine Learning and Data Mining, Probability, Stochastic Process

Experiences

National Indemnity Company, Berkshire Hathaway Group Omaha, NE Pricing Analyst Intern Jun 2022 – Aug 2022

• Facilitated multi-database development for a new insurance rating platform and modernized antiquated business applications.

• Retrieved and tested insurance quotes code using the Swallow API.

• ETL via SQL: Populated rate factor tables by manipulating SQL data objects and stored procedures.

• Source control: Collaborated with cross-functional teams to launch an insurance rating product.

• Modified C++ source code and pulled requests from a Git repository into Visual Studio. Merged new features into Azure DevOps branches.

• Pursued professional development by enrolling in courses, including Power BI and Tableau.

U.S. Department of Energy University Program Idaho Falls, ID Omaha, NE Research Assistant Data Analyst Aug 2017 – Present

• Led research efforts as principal investigator analyzing experimental data.

• Published [paper1] [paper2] [paper3] by contributing to data analysis of experimental/research data.

Projects

Kaggle competition: Customer Behavior Prediction May 2022

• Predicted whether a customer will place an order after visiting the website of an online shop using machine learning algorithms (GLM, random forest, C5.0, xgboost).

• Improved model accuracy from 95% to 97% through data imputation, parameter tuning, and ensembling of independent machine learning models.

• Ranked 2/23 in the competition.

DrivenData competition: Predict Reconfigurations at US Airports hosted by NASA Apr 2022

• Predicted airport configuration changes from real-time data sources, including air traffic and weather, using machine learning algorithms (Random Forest) in Python.

• Won 3rd place in the competition.

Kaggle competition: Microsoft Malware Prediction Mar 2022

• Cleaned and prepared datasets by removing irrelevant variables, replacing NA values, factorizing Boolean variables, and converting categorical variables.

• Developed GLM, XGBoost, and Random Forest models to predict a Windows machine’s probability of getting infected by various families of malware, based on different properties of that machine.

Cluster Based Noise Removal in Long Term Monitoring of Damaged Concrete Aug 2019 - Aug 2021

• Prepared data using univariate filters to avoid distortion in clustering analysis due to erroneous data.

• Optimized acoustic emission (AE) data suppression by developing multivariate filters using K-means clustering analysis to recognize inherent patterns in the data based on their high-dimensional similarity.

• Translated the results using radar plot visualizations to differentiate noisy subgroups.

• Cross-validated the results with complementary experimental measurements to confirm the model accuracy.

• Automated the data filtering procedure using MATLAB and RStudio to improve the data processing efficiency.

Massive Acoustic Emission (AE) Monitoring Data Pre-processing Aug 2017 - Aug 2019

• Exported and stored a large AE dataset by developing MATLAB scripts.

• Visualized the dataset by developing a user interface using Shiny to analyze data quality and consistency.

• Cleaned the dataset by experimenting with linear regression (glm), classification (lda, qda), tree-based methods (tree, rpart, c50, xgboost, deepboost, boosting, random forest, bagging), and support vector machines (svm) in RStudio to remove outliers and noise.

Other Experiences

Graduate Research Assistant Aug 2017 – Present

• Collaborated as a co-author on multiple journal papers, including in Structural Health Monitoring, by providing critical contributions to data analysis. This involved leveraging expertise in statistical modeling and data visualization to help shape the analysis and presentation of research findings. The resulting papers were well-received by peers and contributed to advancements in the understanding of key topics in the field.

• Led research efforts as principal investigator analyzing experimental data for a U.S. Department of Energy project. This involved utilizing advanced statistical techniques and data analysis software to interpret complex data sets, resulting in the identification of key insights and trends that were crucial for the project's success. The resulting analysis was instrumental in informing decision-making and shaping future directions of the project.

• Developed a novel framework to enhance engineering monitoring by proposing machine learning models for data analysis. This involved utilizing Python to design and implement a comprehensive solution that improved the accuracy of data classification. The resulting framework was successfully integrated into the project's engineering monitoring system, resulting in improved overall efficiency and performance.

• Conducted comprehensive cross-validation of data analysis results by integrating complementary experimental measurements to confirm model accuracy. This involved utilizing advanced statistical techniques and modeling software to compare and contrast data sets, resulting in the identification of discrepancies and potential areas for improvement. The resulting analysis was instrumental in enhancing the accuracy and robustness of data analysis and helped to ensure the validity of the overall research project.

Graduate Teaching Assistant Aug 2021 - Dec 2022

• Tutored approximately 50 students in Mechanics by guiding them through practice problems to improve their comprehension of concepts.

• Tracked students’ progress by grading homework, developing recitation materials, and providing tutorials.

• Advanced effective teaching practices by taking the CIRTL@UNL certificate.


