Data Scientist, Data Analyst

Location:
San Francisco, CA
Posted:
March 20, 2017

Resume:

Ghizlaine Bennani

415-***-****

******************@*****.***

EDUCATION

M.S Analytics (USF, San Francisco)

Jul. 2015 – Aug. 2016

Select Courses: Machine Learning, Data Acquisition, Relational Databases, Time Series, Linear Regression, Business Strategies, Data Visualization, Distributed Computing, NoSQL Databases.

M.S Civil Engineering (EPFL, Switzerland)

Feb. 2013 – Aug. 2014

Major: Civil Engineering.

Minor: Management, Technology & Entrepreneurship.

MS Thesis completed at UC Berkeley in the Industrial Engineering department.

B.S Civil Engineering (EPFL, Switzerland)

Sep. 2009 – Feb. 2013

WORK EXPERIENCE

Data Scientist Intern (ChannelMeter, San Francisco, CA)

Dec. 2015 – Jul. 2016

- Developed an algorithm using unsupervised and supervised techniques to cluster similar channels and videos based on performance and content metrics, in order to generate personalized, targeted Multi-Channel Networks. Techniques used: Principal Component Analysis, Feature Engineering, Natural Language Processing, Sentiment Analysis, Spectral and Hierarchical Clustering.

- Predicted how many views a video will get before it is created. Techniques used: Random Forest, AdaBoost, Support Vector Regression.

PROJECTS

MS Thesis Project (UC Berkeley)

Data Analysis, Logistics and Supply Chain Optimization Network

Feb. 2014 – Aug. 2014

Created a framework for Demand Forecasting, Aggregate Planning and Inventory Management using XLStats.

Simulated the effect of demand uncertainty on each stage of the network using Bootstrapping and Monte Carlo techniques. Tools used: XLStats, Static Plus, RS Platform.

SKILLS

R, Python, SAS, SQL, PostgreSQL, Machine Learning, Time Series, AWS, Spark, D3, Distributed Computing.

MS Analytics Projects (USF)

Tweet Popularity Analysis

Analyzed non-textual data to predict and define what makes a tweet popular. The data was pulled from the Twitter API and loaded into a Postgres database. The analysis used hypothesis testing and several regression methods to identify the most important factors that contribute to a tweet's popularity. Tools: Python, SQL-Postgres, R.

Predicting American Football Plays

Built a classifier to predict whether a play is a passing or running play using data from the 2000-2012 NFL seasons. Techniques tested via cross-validation: Logistic Regression, SVM, KNN, Decision Trees, Random Forest, Gradient Boosted Trees.

https://github.com/USF-ML2/rushing_for_insights.git

Movie Review Sentiment Analysis

Classified movie reviews as positive or negative with 85% accuracy using a Naïve Bayes algorithm (implemented from scratch) and 10-fold cross-validation. Tools: Python, Spark.

Regressors Package in Python Scikit Learn

Created a Python library for fitting various regressors, extracting stats, and making plots. Tools: Python, Scikit Learn.

https://regressors.readthedocs.org/en/latest/readme.html

Time Series Analysis on Canada's Inflation Rate

Applied a Box-Jenkins model with volatility clustering to Canada’s inflation rate data. Applied stationarity tests and the classification theorem on ACF/PACF, and tested for ARCH/GARCH effects using several time series/statistical packages. Tools: R.

Implementation of Collaborative Filtering Algorithm on Netflix Ratings Dataset

Implemented a Collaborative Filtering algorithm from scratch on the Netflix ratings dataset to recommend movies to a specific user based on the tastes of the most similar users. Used Pearson Correlation as the similarity metric. The algorithm recommended similar movies with an accuracy of 90%. Tools: Python.

LANGUAGES

Fluent in English, French and Arabic.
