
Data Scientist

Location:
Palo Alto, CA
Posted:
June 10, 2017


Resume:

Jale Dinler

ac0sga@r.postjobfree.com 608-***-**-** Palo Alto, CA

http://www.linkedin.com/in/jaledinler

jdinler (skype)

EDUCATION

UNIVERSITY OF WISCONSIN - MADISON

M.A. IN MATHEMATICS

**** - **** Madison, WI

KOC UNIVERSITY

M.SC. IN MATHEMATICS

2010 - 2013 Istanbul/Turkey

BOGAZICI UNIVERSITY

B.SC. IN COMPUTER ENGINEERING

2004 - 2009 Istanbul/Turkey

LINKS

Github:// jaledinler

LinkedIn:// jaledinler

COURSEWORK

GRADUATE

Building Deep Neural Networks

Data Science

Machine Learning

Nonlinear Optimization

Integer Optimization

SKILLS

• Java • Python • Matlab

• C • TensorFlow • Weka

• Scikit-learn • Google Cloud

• Magellan

AWARDS

2016 Honored Instructor Award

at UW-Madison

2013 Full Scholarship from

UW-Madison

2010 Full Scholarship from

Koc University

2004 337th out of 1.2 million students in the

National University Entrance Exam

PROJECTS

EXTRACTING STRUCTURED INFORMATION FROM RAW TEXT DATA An information extractor was implemented to extract all person names from 300 Pitchfork reviews collected from a dataset on Kaggle. The scikit-learn package was used to build several ML models with different classifiers (e.g., SVM, Random Forest, Decision Tree, Logistic Regression, Linear Regression). After training these classifiers with cross-validation on the training data, the classifier with the highest precision was selected and applied to the test data.
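For illustration, a minimal sketch of the model-selection step described above, assuming precomputed feature matrices and binary labels; the variable names and the candidate list are placeholders rather than the original project code (linear regression is omitted here because scikit-learn treats it as a regressor).

```python
# Illustrative sketch (not the original project code): pick the classifier
# with the highest cross-validated precision, then fit it on the training data.
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

candidates = {
    "svm": SVC(),
    "random_forest": RandomForestClassifier(),
    "decision_tree": DecisionTreeClassifier(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

def pick_best_classifier(X_train, y_train, cv=5):
    """Return (name, fitted model) with the highest mean CV precision."""
    scores = {
        name: cross_val_score(model, X_train, y_train,
                              scoring="precision", cv=cv).mean()
        for name, model in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, candidates[best_name].fit(X_train, y_train)
```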

NEURAL NETWORK (NN) I appliedNNwith one layer of hidden units to predict protein secondary structure. One of the project require- ments was to implement NN library from scratch. I explored the effect of number of hidden units and number of epochs on accuracy and over- fitting. I also experimentedwithmomentum termand weight decay to see their effects on the convergence rate. To handle nominal features, after exploring possible encoding strategies, I decided to use one-of-k encoding.
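A compact sketch of the kind of network described above, assuming sigmoid units, a squared-error objective, and full-batch updates; it is illustrative only, not the from-scratch library written for the project.

```python
# Minimal one-hidden-layer network with momentum and weight decay, plus
# one-of-k encoding for nominal features. Illustrative sketch only.
import numpy as np

def one_of_k(values, categories):
    """One-of-k (one-hot) encode a list of nominal values."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    for row, v in enumerate(values):
        out[row, index[v]] = 1.0
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, n_hidden=10, epochs=100, lr=0.05, momentum=0.9, decay=1e-4):
    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
    W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
    V1, V2 = np.zeros_like(W1), np.zeros_like(W2)   # momentum terms
    for _ in range(epochs):
        H = sigmoid(X @ W1)          # hidden activations
        out = sigmoid(H @ W2)        # predictions in (0, 1)
        err = out - y.reshape(-1, 1)
        # Backpropagate the squared-error gradient through both layers.
        d2 = err * out * (1 - out)
        g2 = H.T @ d2
        g1 = X.T @ ((d2 @ W2.T) * H * (1 - H))
        V2 = momentum * V2 - lr * (g2 + decay * W2)
        V1 = momentum * V1 - lr * (g1 + decay * W1)
        W2 += V2
        W1 += V1
    return W1, W2
```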

ENTITY MATCHING (EM) Magellan (an EM tool developed at UW-Madison) was used for this task. Two tables were extracted from the Pitchfork Reviews data (collected from Kaggle) and Discogs album data. To obtain a set of candidate tuple pairs, several blockers were used. A small sample of these tuple pairs was labeled (match / no-match) to train and test matchers such as random forest, SVM, naive Bayes, logistic regression, linear regression, and decision tree. The best matcher (99.4% precision, 99.3% recall) was applied to find all possible matches, and these matches were combined into a single table.
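A rough outline of a Magellan (py_entitymatching) workflow of the kind described above; file names, keys, attribute names, and the sample size are placeholders, and exact argument names can differ between package versions, so treat this as a sketch rather than the project's actual pipeline.

```python
# Sketch of a block -> label -> train -> predict pipeline with py_entitymatching.
import py_entitymatching as em

A = em.read_csv_metadata("pitchfork_albums.csv", key="id")   # placeholder files
B = em.read_csv_metadata("discogs_albums.csv", key="id")

# Blocking: keep only pairs whose titles share at least two words.
ob = em.OverlapBlocker()
C = ob.block_tables(A, B, "title", "title", overlap_size=2,
                    l_output_attrs=["title", "artist"],
                    r_output_attrs=["title", "artist"])

# Label a small sample by hand (match / no-match), then split it.
S = em.sample_table(C, 400)
G = em.label_table(S, "label")                 # opens a labeling GUI
IJ = em.split_train_test(G, train_proportion=0.7)

feats = em.get_features_for_matching(A, B, validate_inferred_attr_types=False)
train = em.extract_feature_vecs(IJ["train"], feature_table=feats, attrs_after="label")

rf = em.RFMatcher()
rf.fit(table=train,
       exclude_attrs=["_id", "ltable_id", "rtable_id", "label"],
       target_attr="label")

test = em.extract_feature_vecs(IJ["test"], feature_table=feats, attrs_after="label")
pred = rf.predict(table=test,
                  exclude_attrs=["_id", "ltable_id", "rtable_id", "label"],
                  target_attr="predicted", append=True)
em.print_eval_summary(em.eval_matches(pred, "label", "predicted"))
```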

NAIVE BAYES (NB) AND TREE-AUGMENTED NAIVE BAYES (TAN) I applied NB and TAN classifiers on a lymphography domain. One of the project requirements was to implement these libraries from scratch. To find the maximal spanning tree in TAN, Prim's algorithm was implemented. To analyze the statistical significance of the difference between NB and TAN, stratified 10-fold cross-validation followed by a paired t-test was used.
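A minimal sketch of the TAN structure-learning step, assuming the pairwise weights (e.g., conditional mutual information between attributes) are already computed; Prim's algorithm then picks the maximal spanning tree. This is illustrative, not the from-scratch implementation mentioned above.

```python
# Prim's algorithm on a dense weight matrix to build a MAXIMAL spanning tree.
import numpy as np

def maximal_spanning_tree(weights):
    """weights: symmetric (n, n) array; returns a list of tree edges (i, j)."""
    n = weights.shape[0]
    in_tree = {0}                      # arbitrary root attribute
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (
                        best is None or weights[i, j] > weights[best]):
                    best = (i, j)
        edges.append(best)
        in_tree.add(best[1])
    return edges
```

The significance test at the end of the paragraph corresponds to something like scipy.stats.ttest_rel applied to the per-fold accuracies of NB and TAN.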

DECISION TREE (DT) I applied a DT to predict whether a person has diabetes. One of the project requirements was to implement an ID3-like decision tree from scratch. For numeric feature splits, an algorithm very similar to the one in C4.5 was implemented. To avoid overfitting, a threshold on node size was set before growing the tree, and the change in accuracy was analyzed for different values of this threshold.
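A sketch of the C4.5-style numeric split mentioned above: candidate thresholds are midpoints between adjacent sorted feature values, and the threshold with the highest information gain is kept; min_node_size stands in for the stopping threshold described in the paragraph. Illustrative only.

```python
# Choose the best numeric threshold for one feature by information gain.
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_numeric_split(x, y, min_node_size=4):
    """Return (threshold, gain) for feature values x and labels y, or None."""
    if len(y) < min_node_size:
        return None                     # stop growing: node too small
    order = np.argsort(x)
    x, y = x[order], y[order]
    base = entropy(y)
    best = None
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue
        thr = (x[i] + x[i - 1]) / 2.0   # midpoint candidate threshold
        left, right = y[:i], y[i:]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(y)
        if best is None or gain > best[1]:
            best = (thr, gain)
    return best
```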

K-NEAREST NEIGHBOR (K-NN) AND K-NN SELECT I applied a k-NN learner to analyze the quality of wine. One of the project requirements was to implement k-NN from scratch. For k-NN select, leave-one-out cross-validation was implemented to select the value of k; the value that results in the minimal cross-validated error was chosen.
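A sketch of the k-NN select step, using scikit-learn for brevity (the project itself implemented k-NN from scratch): leave-one-out cross-validation over candidate k values, keeping the k with the lowest error.

```python
# Select k by leave-one-out cross-validation (candidate k values are placeholders).
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

def select_k(X, y, candidate_ks=(1, 3, 5, 7, 9)):
    """Return the k with the lowest leave-one-out error on (X, y)."""
    loo = LeaveOneOut()
    errors = {}
    for k in candidate_ks:
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                              X, y, cv=loo).mean()
        errors[k] = 1.0 - acc
    return min(errors, key=errors.get)
```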

EXPERIENCE

2017 - Research Engineer (Intern) at Docomo Innovations, Inc.

2013 - 2017 Teaching Assistant at University of Wisconsin - Madison

2011 - 2013 Teaching Assistant at Koc University


