Ghizlaine Bennani
******************@*****.***
EDUCATION
M.S Analytics (USF, San Francisco)
Jul. 2015 – Aug. 2016
Select Courses: Machine Learning, Data Acquisition, Relational Databases, Time Series, Linear Regression, Business Strategies, Data Visualization, Distributed Computing, NoSQL Databases.
M.S Civil Engineering (EPFL, Switzerland)
Feb. 2013 – Aug. 2014
Major: Civil Engineering.
Minor: Management Technology & Entrepreneurship.
MS Thesis completed at UC Berkeley in the Industrial Engineering department.
B.S Civil Engineering (EPFL, Switzerland)
Sep. 2009 – Feb. 2013
WORK EXPERIENCE
Data Scientist Intern (ChannelMeter, San Francisco, CA)
Dec. 2015 – Jul. 2016
-Developed an algorithm using unsupervised and supervised techniques to cluster similar channels and videos based on performance and content metrics, in order to generate personalized, targeted Multi-Channel Networks. Techniques used: Principal Component Analysis, Feature Engineering, Natural Language Processing, Sentiment Analysis, Spectral and Hierarchical Clustering.
-Predicted how many views a video will get before it is created. Techniques used: Random Forest, AdaBoost, Support Vector Regression.
PROJECTS
MS Thesis Project (UC Berkeley)
Data Analysis, Logistics and Supply Chain Optimization Network
Feb. 2014 – Aug. 2014
Created a framework for Demand Forecasting, Aggregate Planning and Inventory Management using XLStats.
Simulated the effect of demand uncertainty on each stage of the network using Bootstrapping and Monte Carlo techniques. Tools used: XLStats, Static Plus, RS Platform.
SKILLS
R, Python, SAS, SQL, PostgreSQL, Machine Learning, Time Series, AWS, Spark, D3, Distributed Computing.
MS Analytics Projects (USF)
Tweet Popularity Analysis
Analyzed non-textual data to predict and define what makes a tweet popular. The data was pulled from the Twitter API and loaded into a Postgres database. The analysis used hypothesis testing and several regression methods to identify the most important factors that contribute to a tweet's popularity. Tools: Python, SQL-Postgres, R.
Predicting American Football Plays
Built a classifier to predict whether a play is a passing or running play using data from the 2000-2012 NFL seasons. Techniques tested via cross-validation: Logistic Regression, SVM, KNN, Decision Trees, Random Forest, Gradient Boosted Trees.
https://github.com/USF-ML2/rushing_for_insights.git
Movie Review Sentiment Analysis
Classified movie reviews as positive or negative with 85% accuracy using the Naïve Bayes algorithm (implemented from scratch) and 10-fold cross-validation. Tools: Python, Spark.
Regressors Package in Python Scikit Learn
Created a Python library for fitting various regressors, extracting stats, and making plots. Tools: Python, Scikit Learn.
https://regressors.readthedocs.org/en/latest/readme.html
Time Series Analysis on Canada's Inflation Rate
Applied a Box-Jenkins model with volatility clustering on Canada’s inflation rate data. Applied stationarity tests, classification theorem on ACF/PACF and tested for ARCH/GARCH effects by using several time series/statistical packages. Tools: R
Implementation of Collaborative Filtering Algorithm on Netflix Ratings Dataset
Implemented a Collaborative Filtering algorithm on the Netflix ratings dataset from scratch to recommend movies to a specific user based on the tastes of the most similar users. Used Pearson Correlation as the similarity metric. The algorithm recommended similar movies with an accuracy of 90%. Tools: Python.
LANGUAGES
Fluent in English, French and Arabic.