Izzet Sozucok Data Scientist
Arlington, TX
+1-682-***-**** • adbs4v@r.postjobfree.com • LinkedIn: izzet-Sozucok
Objective
Data Scientist with 6+ years of experience delivering data-driven solutions that increase the efficiency, accuracy, and utility of internal data processing, using tools such as MapReduce, Spark, time series analysis, and hierarchical Bayes, and machine learning techniques such as SVM, decision trees, boosting, and random forests. Experienced in building regression models (logistic, linear, and kernel), applying predictive data modeling, and analyzing data mining algorithms to deliver insights and implement action-oriented solutions to complex business problems using Python, R, Hadoop, SQL, and NoSQL (Cassandra, Hive, and Pig), with the visualization tools Tableau, seaborn, and matplotlib.
Technical Skills
Computer Languages: Python, SQL, Java, HTML
Version Control System: GitHub, GitLab, Jenkins
Markdown: LaTeX, Jupyter Notebook, Microsoft Office
Databases: Oracle, PostgreSQL, Cassandra, Hive, Pig
Data Testing: REST API, Selenium, Cucumber (Java)
Statistical Programming: R, SAS, Apache Spark and Hadoop for big data
Others: MATLAB, Linux, Windows
Education
Doctor of Philosophy (PhD) in General Statistics TX, USA
University of Texas at Arlington, GPA: 3.71/4.00 2014–2019
Master of Science (MS) in Applied Mathematics Izmit, Turkey
Kocaeli University, GPA: 3.78/4.00 2009–2011
Bachelor of Mathematics Izmit, Turkey
Kocaeli University, GPA: 3.00/4.00 2004–2008
Experience
AI/Machine Learning Developer Sep 2019 – Present
Verizon Wireless Irving, Texas
Built a precautionary algorithm using k-nearest neighbors, decision trees, and logistic regression to minimize fallout orders, with NumPy, pandas, scikit-learn, seaborn, and matplotlib.
Designed and developed a real-time recommendation engine to rank sales leads for upsell opportunities.
Refined personalization algorithms for 10M+ customers on relational PostgreSQL and non-relational Cassandra databases using psycopg2 and the Cassandra driver, together with Spark and the Hadoop ecosystem.
Transformed raw data and loaded it into PostgreSQL with custom-built Spark jobs on an AWS Linux server to prepare unruly data for machine learning, using Tableau and seaborn for EDA.
Worked with predictive modeling and machine learning tools (scikit-learn, PyTorch, Spark, TensorFlow, Keras, Theano, etc.) and statistical data analysis methods such as linear models, multivariate analysis, stochastic models, logistic regression, and sampling methods on Linux systems in AWS.
Data Instructor May 2018 – May 2019
University of Texas at Arlington Arlington, Texas
Applied a deep understanding of the collection, analysis, presentation, and interpretation of data using Python libraries such as scikit-learn, NumPy, pandas, seaborn, matplotlib, PyTorch, TensorFlow, and Keras, along with Spark and Hadoop for big data.
Analysis included descriptive statistics, probability, relationships between variables and graphs, statistical models, hypothesis testing, inference, estimation, correlation, regression, and confidence intervals.
Data Scientist Sep 2017 – May 2018
Farmers Insurance Group Inc. Irving, TX
Environment: Java, R, regression, HTML, boosting, random forests, NLP, and anomaly detection.
Applied a novel technique to predict the failure time of devices and the “age-at-death” distributions under censored data using R, Python, and MATLAB.
Designed experiments and tested the feasibility of proposed actions to determine probable outcomes using a variety of tools and technologies such as scikit-learn, PyTorch, Spark, TensorFlow, and NLP with Python.
Created a new method to reduce the dimensionality of predictors for big data using PCA with SQL, Python, Apache Spark, and Hadoop, in support of NLP applications.
Developed intricate algorithms based on deep-dive statistical analysis and predictive data modeling that were used to deepen relationships, strengthen longevity and personalize interactions with customers.
Analyzed and processed complex data sets using advanced querying, visualization and analytics tools.
Graduate Research/Teaching Assistant Arlington, TX
University of Texas at Arlington Dec 2016 – May 2018
Taught Calculus 1 and 2, Algebra, and statistics courses; tutored Algebra and Elementary Statistics.
Detailed achievements:
Achievement 1: Developed a deep understanding of the collection, analysis, presentation, and interpretation of data. Analysis included descriptive statistics, probability, relationships between variables and graphs, statistical models, hypothesis testing, inference, estimation, correlation, regression, and confidence intervals, using Python and R.
Achievement 2: Taught the concepts of limits, continuity, differentiation, and integration, and the applications of these concepts.
Achievement 3: Taught many statistics courses and their applications in machine learning, such as classification, clustering, k-means, prediction, GLM regression, lasso regression, ridge regression, hierarchical clustering, and dimension reduction and model selection (PCA, AIC, corrected AIC, BIC, lasso).
Sub-achievement (a): Learned to apply visualization tools such as ggplot2 in R and matplotlib with scikit-learn in Python, and gained experience with unstructured data sets (text analytics, image recognition, etc.) in Python and R. Also gained deep insight into Spark and Hadoop for big data.
Data Analyst Kocaeli, Turkey
Isik Tech Inc Sep 2012 – May 2015
Monitored incoming feeds to ensure the timely arrival of necessary data sets.
Reviewed all versions of data and price validation reports and applied necessary edits, changes, and deletions as required.
Read, interpreted, and acted on various daily reports that supported the maintenance of current and historical databases.
Maintained market holiday schedules and time zone changes for all covered markets.