Engineering Software Engineer

Location:
San Jose, CA
Posted:
July 31, 2017

Resume:

Weijun Qian

412-***-**** *********@*****.*** www.qwjlegend.com San Jose CA

EDUCATION

Carnegie Mellon University, Pittsburgh, PA Dec 2016

Master of Science, Engineering & Technology Innovation Management, GPA 3.73

Courses: Cloud Computing, Intro to Machine Learning, Algorithm and Advanced Data Structure, NoSQL, Statistical Method

Tongji University, Shanghai, China Jun 2015

Master of Engineering, Industrial Engineering Operations Research

Zhejiang Sci-Tech University, Zhejiang, China Jun 2012

Bachelor of Engineering, Industrial Engineering Operations Research

WORK EXPERIENCES

RistCall LLC, Pittsburgh, PA Data Analyst Summer Intern May - Aug 2016

• Extracted 30 million nurse call station logs of patients’ requests from hospital database using MySQL.

• Conducted linear regression analysis on extracted datasets using Python Pandas and confirmed a negative relationship between responsiveness from nurses and effects of patients’ treatments.

Continental Automotive System, Shanghai, China Software Engineer Summer Intern Jun - Oct 2015

• Designed and developed a web-based product query system in Python and Flask, backed by SQL Server.

• Built non-linear programming model to simulate production scheduling problem with the objective of reducing cycle time.

• Conducted feasibility study with manufacturing group and resolved job shop scheduling issues by implementing a genetic algorithm, reducing cycle time by 14%.

PROJECTS

Data Pipeline Construction for Analyzing NYC Taxi Trip Data Jun - Jul 2017

• Cleaned and filtered 1.2 billion NYC taxi trip records (300 GB) and stored them in S3 buckets.

• Built and configured data pipeline based on AWS resources using Terraform and Ansible.

• Designed MapReduce program and deployed it on an auto scaling group to generate the statistics of the taxi trip data including the number of trips within different boroughs, ranges of distances, fares and time periods.

• Used SQS to schedule and track tasks on each node, with each task reading its input by byte offset.

• Packaged the MapReduce program as a Docker image and pushed it to the Amazon ECR registry as an alternative way to scale out.

• Aggregated the reducer output into DynamoDB, served Bokeh visualizations of it from a group of web servers, and balanced incoming traffic with ELBs.

Movie Recommender Design with Hadoop MapReduce Apr - May 2017

• Formulated a user rating matrix and a co-occurrence matrix from the raw Netflix data set with 480k users, 17k movies and over 100 million ratings.

• Combined the two matrices using an item-based collaborative filtering algorithm to compute the movie recommendation list and deployed the jobs on a multi-node Hadoop cluster on AWS.

PageRank Algorithm on Twitter Social Network Mar - Apr 2017

• Implemented the PageRank algorithm on Twitter datasets with 11 million user profiles and 85 million social relations.

• Modeled the relations between users as a transition matrix and computed each user's rank value over 30 iterations, until convergence, on an EMR cluster.

• Visualized the social network graph based on the resulting PageRank matrix using Node.js.

Seizure Prediction based on EEG Time Series Dataset Jan - Feb 2017

• Cleaned and manipulated a 74 GB raw dataset of EEG signals and extracted features, including wavelet packet coefficients, Hurst exponent and basic statistics, using logistic regression with LASSO.

• Built a predictive model on the extracted features using the random forest algorithm and performed cross-group validation over the training dataset to prevent overfitting.

• Tuned model parameters with Bayesian optimization and adopted gradient boosting, improving accuracy by 8%.

Convolutional Neural Network for MNIST Image Classification Sept - Dec 2016

• Designed convolutional neural network (LeNet) with 4 convolutional layers and 4 max pooling layers to process an input image of dimension 28 by 28.

• Implemented exponential linear unit activation function to speed up learning and improve classification accuracy.

• Trained LeNet in Matlab to classify a subset of the MNIST dataset, reaching a classification accuracy of 97.6%.

SKILLS

Languages: Python, Java, SQL, Shell, Matlab

Databases: Cassandra, Redis, MySQL

Back-end: Docker, Pig, Hive, Kafka, Spark, Hadoop, Terraform, Ansible

Machine Learning Models: Decision Tree, Random Forest, Naïve Bayes, Support Vector Machine, Logistic Regression, Linear Regression, Convolutional Neural Network, K-means, AdaBoost, PCA


