Data Engineer

Location:
New York, NY
Salary:
80000
Posted:
May 01, 2018


Resume:

Eric (Binqian) Zeng

* ********** ****, *** ****, NY, 10009; Open to Relocate

Phone: +1-929-***-**** | Email: ac5a2n@r.postjobfree.com | LinkedIn: https://www.linkedin.com/in/binqian-zeng-257903126/

Education

+

New York University, Courant Institute of Mathematical Sciences, New York, NY

M.S. Data Science; GPA: 3.3/4.0; Sep 2016–May 2018 (Expected)

Relevant Coursework: Machine Learning, Natural Language Processing (Kyunghyun Cho), Deep Learning (Yann LeCun), Statistical and Mathematical Methods, Big Data, Advanced Python, Decision Model and Analytics, Data Science in Quantitative Finance

+

Sun Yat-sen University, School of Engineering, Guangzhou, China

B.E. Theoretical and Applied Mechanics (Fluid Dynamics Focus); GPA: 3.7/4.0; Sep 2012–Jun 2016

Honor: Third-class scholarship (three times)

Relevant Coursework: Computational Methods, Methods of Mathematical Physics, Optimization and Computational Linear Algebra, Ordinary Differential Equations

Technical Skills & Certificates

• Programming & Scripting Languages: Python, R/MATLAB, Java, Fortran, Scala

• Toolkits, Software & Operating Systems: TensorFlow, PyTorch, Keras, NLTK, Scikit-learn, Hadoop, MapReduce, Spark, MySQL, MongoDB, AWS (EC2, S3), Tableau, D3.js, Excel, GitHub, Linux/Unix

• Certificates: Bloomberg Market Concepts (BMC); preparing for CFA Level I Exam (June 2018)

Work Experience

+

Crypto Investments New York, NY

Machine Learning Engineer Intern Sep 2017–Dec 2017

- Scraped reports, price data, and volume data for 8 cryptocurrencies from 20 websites with BeautifulSoup

- Stored the scraped data sets in MongoDB; built a dashboard to visualize price and volume with Matplotlib

- Built a sentiment analysis model with FastText

- Constructed a hybrid time-series model for technical trading, combining ARIMA and a Deep Belief Network

+

IBM Armonk, NY

Data Science Intern in Chief Data Office May 2017–Sep 2017

- Participated in constructing a pipeline to automatically extract metadata from unstructured documents

- Built a Named-Entity Recognition model with a linear SVM; achieved 94% accuracy, competitive with Watson Natural Language Classifier’s 97% accuracy under 70% coverage

Course Projects

+

Forecasting Optimal Trading Positions for Commodities New York, NY

Keywords: Time Series Analysis, Signal Processing, Regression Apr 2018–Present

- Filtered signals for rolling futures using SVD

- Built a linear regression model; validated it with walk-forward validation; tested it across oil, sugar, copper, gold, and natural gas

+

Object-oriented Image Deblurring Pipeline New York, NY

Keywords: Segmentation, Super-Resolution, SRGAN, TensorFlow Mar 2018–Present

- Segmented image objects with Single Shot MultiBox Detector (SSD); reconstructed super-resolution images with SRGAN

+

Enhanced Seq2Seq Model for Automatic Text Summarization (Capstone Project) New York, NY

Keywords: Natural Language Processing & Understanding, Hybrid Seq2seq Neural Network, PyTorch Oct 2017–Dec 2017

- Built a semantic-encouraged seq2seq model with a self-gated encoder, attention mechanism, and semantic measurement term; achieved high semantic relevance between summaries and source texts (ROUGE-1/2/L: 24.3, 12.3, 33.7)

- Constructed a two-stage hybrid seq2seq bi-directional Recurrent Neural Network with GRU, coverage mechanism, and probability unit; the model can be viewed as a balance between extractive and abstractive approaches (ROUGE-1/2/L: 38.2, 18.4, 41.1)

+

Automated Scoring System for Essay New York, NY

Keywords: Natural Language Processing, LSTM, CNN, Attention Mechanism, PyTorch, Keras Oct 2017–Dec 2017

- Conducted research on 8 widely used automated essay scoring models from research papers, in PyTorch and Keras

- Investigated effects of mechanisms and architectures in networks, including LSTM, Bi-LSTM, CNN, attention mechanism, pooling functions, etc.

+

Automatic Music Genre Classification System New York, NY

Keywords: Machine Learning, Multi-label Classification, Ensemble Classifier Feb 2017–May 2017

- Built multi-label prediction models with Random Forest and SVM (F-score: 0.303)

- Improved performance with Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU) (F-score: 0.458)

+

Investigation on New York Crime Open Data New York, NY

Keywords: Big Data, Cloud Platform, Clustering, Feature Extraction, Visualization Feb 2017–May 2017

- Performed data cleansing and normalization using SQL

- Used PySpark to detect patterns with techniques like K-means and SVD on AWS EC2 and S3

- Visualized the identified patterns with Matplotlib in Python, Tableau, and D3.js


