Data Analyst

Location: West Lafayette, IN

Posted: January 20, 2021

Resume:

YI-HUI (SARAH) LU

**** *********** ***** *** ***., West Lafayette, IN 47906

Phone: 765-***-**** Email: adjklj@r.postjobfree.com LinkedIn: yh-lu

EDUCATION

Purdue University West Lafayette, IN

M.S. in Computer Science & Statistics Aug. 2019 - May 2021 (Expected)

Coursework: Algorithm Design, Analysis and Implementation, Statistical Machine Learning, Natural Language Processing, Data Mining, Database System, Design of Experiments, Advanced Statistical Methodology

National Tsing Hua University Hsinchu, Taiwan

B.S. in Quantitative Finance Sep. 2014 - Jun. 2018

Coursework: Data Structures, Data Science: statistical learning, Programming in C, Linear Models, Mathematical Statistics I&II, Statistical Data Analysis, Linear Algebra

SKILLS

Programming Languages: Python, R, SQL, MySQL, C, MATLAB, Java, ArangoDB, OrientDB, Azure CosmosDB, SAS

Toolkits/Frameworks: Scikit-Learn, TensorFlow, Quantitative developer, PyTorch, Scrapy, NumPy, Pandas, Seaborn, AWS, Machine Learning, TeX

Languages: Fluent in Mandarin (Chinese), English, and Taiwanese; basic Japanese

EXPERIENCE

Data Science Consultant Jan. 2020 - Present

Purdue University West Lafayette, IN

Assisted clients with experimental design, data collection, and analysis using machine learning methods in Python.

Explored the trend of protein density changes over time while controlling the false discovery rate.

Data Scientist Jul. 2018 - Jun. 2019

Academia Sinica Taipei, Taiwan

Retrieved 10k+ data points, performed hypothesis testing, clustered the data with k-means, and visualized it in R and Python to draw insights from the database on the GEMTEE model.

Implemented and experimented with several recent data mining papers, including Temporal Convolutional Network, Syntactic Packing, and Implicit Sentiment, in TensorFlow to verify their effectiveness.

Implemented forecasting solutions using algorithms including statistical models, SVM, and RNN. Utilized SVM to predict returning recall with 89% accuracy. Worked extensively with other machine learning libraries such as Seaborn and Scikit-Learn.

Data Analyst Intern Jul. 2017 - Aug. 2017

Shanghai Futures Exchange Shanghai, China

Cleaned time-series data and performed time-series clustering and feature engineering to extract value from the data, improved the robustness of the predictive model, and launched A/B tests to design models with Python.

Preprocessed low-quality data (mixed frequencies, missing values, and pattern inconsistencies) and used Python to analyze large data sets, identify behavior trends among users with quantitative and statistical techniques, and visualize the collected data.

Built SQL query structures on a MySQL server database to sort, filter, and retrieve data.

SELECTED PROJECTS

Sentiment Analysis of Coronavirus Pandemic: [Python]

Embedded text data with word2vec and GloVe, trained RNN, CNN, Random Forest, SVM, and Naive Bayes models, and fine-tuned them in TensorFlow to classify sentiment, achieving 85% accuracy.

Spam SMS Filter: [Python, Scikit-Learn]

Implemented NLP techniques, a Naive Bayes classifier, and an LSTM with mini-batch gradient descent to develop a spam filter with over 93% accuracy and recall. Performed feature selection and applied linear regression, logistic regression, SVM, gradient descent, and neural network algorithms to train and test on large data sets.


