Xiaolei (“Audrey”) Wang, Ph.D.
#*** - *** *********** *****, ***** College PA, 16803
******.*******@*****.***; 605-***-****
Summary
A dedicated analytical scientist with over 10 years of industry and academic experience in large-volume data, predictive modeling, data mining and numerical analysis. Major strengths include hands-on computational programming skills, deep knowledge of mathematical modeling and computer algorithms, and scientific problem solving. Delivered VHF ocean radar real-time observation software, tsunami wave simulation software, global land cover datasets of geospatial time series, and remote sensing detection and analysis of droughts in Amazon forests. Over the last several years, focused on advanced statistical methods and predictive models based on generalized linear/logistic regression and machine learning techniques, and developed a statistical approach to selecting the best statistical model for the USDA’s pioneer project. Strong communication skills, with research published and presented across 20 journals and conferences.
Selected accomplishments
Developed advanced statistical models to predict the probability of forage species occurrence and abundance across the Northeastern United States for the USDA’s pioneer project.
Demonstrated that climate variables with fine-scale topographic modification played the most important role in species abundance distributions.
Designed, wrote and shared R and SQL code for extracting data from large relational databases, sampling data from geospatial images, organizing the data, exploratory data analysis, evaluating the predictive performance of the models, and visualization.
Performed anomaly detection on real time series of satellite images to assess the impact of a drought event on Amazon rainforests; identified the main factor in forest degradation.
Established and validated regression models for rangeland phenological prediction using satellite vegetation observations and climate datasets.
Wrote a program for predicting CO2 concentration using partial least squares regression on hyperspectral satellite images.
Performed multivariate cluster analysis (k-means and fuzzy c-means) to classify boreal, temperate and tropical tree subdivisions.
Developed the VHF ocean radar system: responsible for modeling, algorithms, time series data analysis and programming for extracting ocean currents and waves from the Doppler spectrum of sea echo (currently under implementation).
Developed advanced tsunami simulation software for business consulting.
Estimated time series of global datasets of plant functional types (PFTs) for use in CGC3M (Canadian Global Coupled Carbon Climate Model); shared the datasets with project members worldwide.
Simulated time series datasets of land cover based on satellite images for a coupled land-atmospheric model.
Improved the laboratory’s code for accessing global tiles of satellite time series and for data processing.
Designed and coded a computational program for an atmospheric correction model.
Calibrated and evaluated the algorithms for monitoring ocean surface temperature, turbidity and aquatic plants using satellite image data (a part of NASA’s earth science project).
Quantified the role of land use change in ecosystem function using the Terrestrial Ecosystem Model (TEM).
Conducted data mining for interpreting and predicting transportation population trends.
Trained staff in computer simulation and programming.
Data Modeling and Analysis Skills
A wide range of data modeling and analysis capabilities:
Predictive modeling; regression models: linear regression, logistic regression, Generalized Linear Models, Generalized Additive Models; machine learning techniques: Random Forests, Boosting, Decision Trees, Artificial Neural Networks, Support Vector Machines; K-means clustering, Fuzzy C-means clustering; Principal Component Analysis; anomaly detection, time series, visualization.
Proficient in programming: R, SPSS, SAS; SQL, PostgreSQL and MySQL; UNIX/LINUX and Windows.
Proficient in spatial analysis tools and imaging software: ArcGIS, GRASS and ENVI.
Employment Experience
Postdoctoral Researcher, Pennsylvania State University, State College PA, 2013 – present.
Postdoctoral Research Associate, Lehigh University, Bethlehem PA, 2012 – 2013.
Postdoctoral Research Associate, University of Utah, Salt Lake City UT, 2011 – 2012.
Postdoctoral Research Associate, University of Oklahoma, Norman OK, 2009 – 2011.
Postdoctoral Researcher, South Dakota State University, Brookings SD, 2008 – 2009.
Research Associate, University of Alberta, Edmonton, Canada, 2002 – 2007.
Project Leader and Senior Engineering Scientist, Japan Railway Souken Information System Inc., Tokyo, Japan, 1999 – 2001.
Engineering Scientist, Ocean High Technology Institute Inc., Tokyo, Japan, 1997 – 1999.
Engineering Scientist, Kokusai Kogyo Corporation Ltd., Tokyo, Japan, 1993 – 1997.
Education
Ph.D. in Physics, Tohoku University, Japan
Master of Science in Physics, Tohoku University, Japan
Bachelor of Science in Physics, Tohoku University, Japan
Certifications
Stanford University online course “Statistical Learning” 2014
Instructors: Trevor Hastie and Rob Tibshirani; completed with a score of 96%.
Stanford University online course “Statistics in Medicine” 2014
Instructor: Kristin Sainani; completed with distinction.