Resume

Data Science

Location: Union City, CA
Posted: May 14, 2020

Resume:

Muhammad Ali Valliani

510-***-**** adc7rn@r.postjobfree.com Union City, CA 94587 https://github.com/mavalliani/

Profile

Results-oriented Machine Learning professional with 3 years of experience, strong research skills, and significant work in Deep Learning, quantitative models, and language models.

Career Highlights

• Data Science and Machine Learning tasks using Regression, Classifications, Trees, Support Vector Machines, Kernels.

• Deep Learning. Convolutional Nets (CNN). Recurrent Nets (RNN). LSTM. Transfer Learning. Generative Adversarial Nets (GAN).

• Sequence Models. (Natural Language Processing. Word Embeddings. Attention-based Transformers.)

• Unsupervised Learning. (Mixture Models. LDA. Clustering.)

• Research (quick to understand/implement work from advanced publications, published scholarly blog posts)

• Hyper-parameter tuning. Bagging. Boosting. Regularization.

• Mathematical Optimization. (convex optimization. gradient descent. Newton’s Method.)

• Algorithms. Extensive programming knowledge.

• Experienced in working with GPUs, CUDA, AWS.

SKILLS

Core: Statistical Analysis. (Time Series. Bayesian methods. Stochastic processes.); Data Science. (Machine Learning. Deep Learning. Natural Language Processing.)

Tech stack: Python. R. Spark. SQL.

Data Science toolkit: ML (scikit-learn), DL (TensorFlow, Keras, PyTorch), NLP (NLTK, Gensim, spaCy, Embeddings).

Methods: Mathematical Optimization. Ensemble models. Applied research.

EDUCATION

MS Statistics California State University East Bay, Hayward, CA 08/2018 – 05/2020

(Data Science) CGPA: 4.00/4.00

BS Engineering GIK Institute of Engineering Sciences and Technology 08/2010 – 06/2014

WORK EXPERIENCE

Machine Learning Fellow Fellowship AI 12/2019 – Present

Crawled food images and trained a raw-food classifier to prototype a smart oven, using a ResNet50 CNN in TensorFlow.

Detected out-of-distribution data using a model-agnostic gateway module, built with PyTorch and fastai.

Developed and deployed language and quantitative models on AWS and Amazon SageMaker. REST APIs.

Data Science Intern Branch Metrics Inc (Redwood City, CA) 05/2019 – 08/2019

Worked on a multi-vertical, general-purpose entity search using Elasticsearch as the entity data store. Indexing and Ranking.

Researched and developed NLP/NLU components for entity extraction, semantic search, and sentiment analysis with statistical learning for high-performance mobile apps. Frameworks used: NLTK, spaCy, Gensim.

Provided artificial intelligence, deep learning, machine learning, and NLP solutions for the knowledge graph.

Generated over 2 million labeled relations for location data using Common Crawl and Wikipedia data.

Developed and optimized scoring algorithms for queries to match with relevant verticals.

Statistical Language models for query understanding (NLP methods), Ranking functions (Extended Vector Space Models).

Machine Learning Engineer Marketchal Private Limited 06/2015 – 03/2018

Set up machine learning pipelines, REST APIs, and data architectures to process a million+ queries a day.

Predictive modeling of user behavior. Methods used: Boosting, Bagging, Feature Engineering, and Clustering (unsupervised).

Optimized campaign content and bidding using Time Series, Quantitative, and Sequence Models (NLP). Ensemble Methods.

Shell scripting on Linux including server deployment and automation scripts.

Tools and frameworks used: scikit-learn, TensorFlow, PyTorch, MySQL, and PostgreSQL, with deployment on AWS.

MAJOR PROJECTS

Machine Learning

● Semantic Similarity of Sentences. Methods used: Cosine Similarity with GloVe, Smooth Inverse Frequency, Word Mover's Distance, Sentence Embedding Models (InferSent and Google Sentence Encoder), ESIM with pre-trained FastText embeddings. The best-performing method on the Quora Question Pairs dataset was an ensemble, with a log-loss of 0.27. Link: https://github.com/mavalliani/Semantic-Similarity-of-Sentences [October 2019]

● Feature Engineering and Analysis of News data to Predict Stock Price Movements. Model: XGBoost, optimized to an RMSE of 0.7. Link: https://github.com/mavalliani/News-data-for-stock-prediction [October 2018]

Research

● Classification (kNN model) of human activity based on data from Inertial Measurement Units (with Dr. Bradford Bennett – CSU East Bay). Prediction accuracy of 98% was achieved. Link: https://github.com/mavalliani/human-activity-classification-research

● Playing Poker deterministically. Algorithmically unwrapping scenarios where it is ‘risk-free’ to bet. Link: https://github.com/mavalliani/deterministic_poker

Independent

● I maintain a Video Book Publishing site: Jamnosh (https://jamnosh.com) for my love of reading.

● Working on generating fiction stories (text data) using advanced NLP methods (long-term project).

Honors & Awards

● STEM Scholar - Institute of STEM Education CSUEB (Distinguished Student in Statistics)

● Awarded a grant of 20 million Chilean Pesos by CORFO Chile for TOKEN (Startup Chile Portfolio Company).

● Teaching Associate – Statistics (CSUEB): Taught Probability, Gaussian and related distributions, Statistical tests and Regression to a class of 100+ students.

● Dean's Honor Student at GIK Institute.


