Data Scientist

Location:
San Francisco, CA
Posted:
November 17, 2020

Resume:

QINGYI (LEXIE) SUN

San Francisco, CA (Willing to relocate) 628-***-**** adhxhf@r.postjobfree.com LinkedIn GitHub

WORK EXPERIENCE

Machine Learning Data Scientist, University of California San Francisco 10/2019 - Current

• Developed a GPU-accelerated deep learning architecture in PyTorch, combining a recurrent neural network (LSTM) and a convolutional neural network, to characterize diseases and support diagnosis and prognosis from brain magnetoencephalography (MEG) data. The work resulted in a scientific conference presentation and a manuscript.

• Fine-tuned the existing model on manually and machine-separated groups of patients. Evaluated the effects of different patient-separation methods based on supervised and unsupervised learning algorithms, including k-means, SVM, and PCA.

• Simulated data with Markov chain Monte Carlo (MCMC) methods in Python and built a pipeline for signal processing, transformation, and cleaning.

Data Scientist Intern, WOMOW Technology 06/2018 - 12/2018

• Developed a deep learning product to automatically inspect power transmission equipment and locate damaged parts in images. Collaborated with cross-functional teams and explained analytics model results to them in a business context.

• Streamed 300k photos from an S3 bucket with Python scripts. Designed, developed, and tested image segmentation models (U-Net, R-CNN), achieving an accuracy of 0.93. Built a scalable data pipeline to augment the images and applied pseudo-labeling techniques to better estimate hyperparameters.

Business Data Analyst Intern, China Construction Bank 11/2017 - 03/2018

• Designed, developed, and tested robust, data-driven marketing strategies that increased sales by 9%. Applied cluster analysis, regression, classification, and ensemble methods (gradient boosting trees, etc.) to segment customers, create personas, and improve customer modeling.

• Wrote and optimized SQL queries to extract data for analytical requirements. Analyzed massive, highly complex data sets and performed data manipulation with pandas and NumPy to ensure data quality, consistency, and integrity.

• Produced Tableau reports and visualization deliverables on marketing strategies and outcomes, including bi-weekly presentations to leadership and communications with a variety of audiences.

KEY PROJECTS

ML Algorithms Implementation From Scratch

• Implemented in Python, from scratch, Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, Random Forest, K-means clustering, drop-column and permutation feature importance, etc.

App Dev: Pick-up Game App [source code] [app link]

• Designed and developed a pick-up basketball game web application with Flask, deployed on AWS.

• Developed Python scripts to collect and aggregate data by requesting APIs and scraping web pages with BeautifulSoup.

• Performed front-end development for the interactive web app using HTML, CSS, and jQuery to improve functionality and user experience.

SF Bike Sharing Prediction with Spark

• Forecasted the number of bikes needed in SF for different time slots and locations, achieving an R² of 0.958. Cleaned, transformed, and aggregated data with PySpark, and conducted analysis with Spark H2O and AutoML on AWS EMR.

Multiple Natural Language Processing Projects

• Built an article recommendation engine using GloVe and word2vec embeddings, and launched a web server on AWS.

• Implemented term frequency-inverse document frequency (TF-IDF).

EDUCATION

University of San Francisco, M.S. in Data Science 06/2020

Southwestern University of Finance and Economics, B.Econ. in Finance 06/2019

SKILLS

• Programming: Python (scikit-learn, pandas, NumPy, SciPy, PyTorch, Keras, TensorFlow, Matplotlib, Seaborn, spaCy, NLTK), SQL, AWS (EC2, EMR, S3), PySpark, HTML, MATLAB, R, CSS, Bash, NoSQL (MongoDB), Linux, APIs.

• Skills: Machine Learning, Deep Learning, ETL, Data Analysis, NLP, Time Series, A/B Testing, Probability/Statistics, Git/GitHub, Tableau, Excel, PowerPoint.


