Machine Learning Data Scientist

Location:
Southlake, TX
Posted:
December 21, 2023

Lucy Lu, Ph.D.

ad15a2@r.postjobfree.com, 608-***-****, Seattle, WA

SUMMARY OF SKILLS

• Five years of full-time industry experience in machine learning and deep learning.

• More than ten years of experience in data science and analytics: root cause analysis and problem solving with scientific methods and data.

• Expert knowledge of data structures and programming languages: Python, PyTorch, TensorFlow, R, Java, JavaScript, C/C++/C#, Kusto/SQL/Complex SQL/PostgreSQL/NoSQL, Redis distributed database, distributed systems.

• Data collection: Using Python to collect data from various sources such as webpages, with automation frameworks such as Selenium and Scrapy; analyzing and processing the collected data for model training.

• Leveraging ChatGPT to train models.

• Working knowledge of Linux, shell scripts, AWS storage, dynamic database, lambda functions, and routine maintenance relevant to machine learning projects.

WORK EXPERIENCE

Automation Data Collection Seattle, WA

Applied Data Scientist/Machine Learning

August 2023 – present

• PyTorch: Taking training in deep learning and AI. Working on deep learning neural network projects with the PyTorch framework. Training NLP models for text analysis and visual analysis, using different fine-tuned parameters and designs to optimize efficiency and accuracy.

• Model training: Researching how to leverage ChatGPT to train models on collected data.

• Automated data collection: Collecting data with Scrapy and Selenium, and developing JavaScript and Python scripts to automate the process.

Cigna Seattle, WA

Applied Data Scientist/Machine Learning

April 18, 2022 – June 30, 2023

• Data Science: Updating the census machine learning project based on ongoing needs. Updating and expanding the scripts in both Python and R. Designing, modifying, training, and updating models in Python for better customer segmentation in various situations and to accommodate different model stages. Deploying updated models to AWS and monitoring them. Designing, expanding, and updating current business logic.

• AWS and GitHub: Maintaining the GitHub branch for census processing, deploying the updated project to AWS, and handling routine AWS management.

Microsoft (Bing Maps Team) Bellevue, WA

Data Scientist contractor (AI/machine learning for Map Data) June 2021 – April 15, 2022

• Data Science: Creating Scope scripts and supporting C# extensions to collect metrics for map data from different sources; developing ETL pipelines to transfer data between different databases; writing Java files to collect metrics for OpenStreetMap data.

• Data Analysis: Creating a BI Tabular data model on an Azure Analysis Services server. Creating a set of measures and KPIs to generate reports based on the data model. The model is connected to a Power BI dashboard and queried from there for presentation purposes. The key performance indicators (KPIs) are important indicators for guiding and evaluating operations and marketing strategies.

• Scripting: Developing Java classes that examine the geometry of Atlas files derived from raw OpenStreetMap PBF data.

• Outcome: The Atlas check Java class was merged into the open-source project for Atlas checks (GitHub). The Scope script will be included in a future pipeline for automated computation and presentation.

Microsoft (Team Andromedra) Sammamish, WA

Data Scientist contractor (AI/machine learning for Azure Log Data Analysis) November 2019 – June 2020

• Machine learning and deep learning: Creating NLP machine learning models to analyze and classify Azure failover cluster log files to detect anomalies and predict anomaly types. Based on customer analysis and segmentation, we identified types of anomalies to detect, which defined how to collect as well as process data from engineers.

• A/B Testing Experimental Design: Designing different sets of key features to describe the data sets and creating A/B testing experiments to quickly collect insights and feedback to improve data models, assist machine learning processes, and guide the overall progress of the project.

• Scripting: Writing SQL Server scripts and stored procedures to perform various data-driven tasks; creating C# code for Azure Functions in web-based services; developing Python scripts and pipelines for machine learning models.

• Outcome: The models have been tested for Azure failover cluster log anomaly detection, root cause analysis, and reporting.

Microsoft (Bing Maps Team) Bellevue, WA

Data Scientist contractor (AI/machine learning for Maps Data Analysis) May 2019 – September 2019

• Data conflation with geoprocessing and spatial analysis: Analyzing and comparing the geometry and attributes of two commercial map databases for road overlap analysis and map data merging.

• Scripting: Writing C# and Cosmos scripts for the above analyses.

• Outcome: The methods have been tested and evaluated for major markets (USA, UK, India, etc.).

Bellevue City Hall (Modelling and Analysis Group) Bellevue, WA

Data Scientist contractor (AI/machine learning, traffic and survey data analysis) August 2018 – May 2019

• Travel survey: Processing travel survey data to analyze features and travel behaviors of the Bellevue, Kirkland, and Redmond (BKR) region and comparing them with those of the Puget Sound region.

• Transportation data forecasting: Forecasting population and household data for the BKR region by simulation implemented in Python, validated with real-world travel data.

Department of Mathematical Sciences, Rensselaer Polytechnic Institute Troy, NY

Research Assistant & Teaching Assistant (AI/machine learning, operations research) August 2013 – May 2018

• Data analytics and machine learning: Market basket analysis to create a prototype recommendation system for a grocery store's online store; Programmatic TV project mining viewership data to predict viewer demographics and optimize TV ad schedules.

• Dissertation research on robust humanitarian logistics: Developing scalable logistics models and applying state-of-the-art stochastic programming methods to handle uncertainty in humanitarian logistics; building robust optimization models for efficient and reliable delivery of humanitarian aid to disaster-impacted areas to reduce fatalities and human suffering. The models were tested on a transportation freeway network in the City of Seattle.

CLR Analytics, Inc. Irvine, CA

Transportation Data Analyst (Contractor) July 2008 – September 2009

• Traffic data processing, mining, and analysis: Designing large-scale regression models to analyze the effects of accidents on key transportation infrastructure and gain insights for improving safety.

TOPS (Traffic OPerations and Safety) Laboratory, University of Wisconsin Madison, WI

Project Assistant May 2004 – June 2006

• Master’s Thesis Project on freeway travel time prediction: Building a data pipeline for extracting, stitching, and cleansing large-scale traffic sensor data; using the data to develop regression models for freeway travel-time prediction.

• Urban mobility projects: Simulating real-world roadway traffic for alternative traffic management strategies; analyzing, evaluating, and optimizing traffic signal design and timing plans.

EDUCATION

Ph.D. Department of Mathematical Sciences, May 2018

Rensselaer Polytechnic Institute

Dissertation: Robust Models for After Disaster Delivery Schedules

M.S. Transportation Engineering and City Planning

University of Wisconsin, Madison

M.A. Technical Communication

Texas Tech University

M.S. Linguistics and Applied Linguistics

Tsinghua University, P.R. China

B.S. Telecommunication Engineering

Beijing University of Posts and Telecommunications, P.R. China

REFERENCES

Available upon request.


