
Data Scientist

Location:
Austin, TX
Posted:
December 21, 2020


Resume:

Omar Hasan Mohiuddin

+1-972-***-**** ****.*.**********@*****.***

linkedin.com/in/omar-mohiuddin-70b80b191 github.com/arrow2851

Aspiring Data Scientist with experience working with datasets of more than a million rows. My passion lies in using machine learning and data visualization techniques to solve challenging problems. My strengths are excellent communication skills, teamwork, an indomitable attitude, and an ability to learn quickly.

EDUCATION

The University of Texas at Arlington December 2020

M.S., Computer Science

Osmania University, India May 2019

B.E., Computer Science and Engineering

TECHNICAL SKILLS

• Data Analysis : R (caret, ggplot2, dplyr), Python (numpy, scikit-learn, matplotlib, keras, tensorflow, pandas), Tableau, SQL, Statistics

• Machine Learning : Supervised and Unsupervised Learning, Linear/Logistic Regression, Classification, Clustering, Decision Trees, KNN, Association Rule Mining, TensorFlow and Keras.

• Databases : MySQL, PostgreSQL, Oracle, SQLite

• Productivity : MS Office, MS Excel

PROJECTS

Recommender System using Association Rule Mining [Python, FP-growth, discord, scikit-learn, pandas]

• Created a recommender system for users of the website MyAnimeList for recommending TV shows.

• Collected the dataset with a hand-coded web scraper, treating each user’s list as one transaction for the algorithm.

• Mined association rules with FP-growth; for rules whose antecedents were similar to the target user’s profile, the consequents were recommended, with rule size controlled to balance the quality and quantity of recommendations.

• Created a Discord bot for live testing; a survey of over 60 users showed an average precision of 86%.

Census Data Analysis and Classification [R, caret, e1071]

• Applied classification techniques, including decision trees (using the Gini index and information gain) and Naïve Bayes, to census data with information such as age, profession, ethnicity, country of origin, etc.

• The classification allowed for prediction of income brackets based on a person’s details and achieved an accuracy of over 95% with each classification technique.

• Compared and analyzed the differing results across classification methods and metrics.

Texas Weather Data Analysis and Station Clustering [Python, scikit-learn, Tableau]

• Extracted Texas’ hourly weather data for the years 2008 to 2010.

• Applied K-means clustering for each year and assigned each station to its cluster.

• Analyzed the yearly trend in weather change for each month by plotting the stations of each cluster on a map of Texas using Tableau.

IMDB Movie Data Association Rule Mining and Analysis [R, dplyr, arules, arulesViz, Tableau]

• Fetched and cleaned data on movies from the past 12 years from the IMDB database, forming one transaction (basket) per movie from its genres, release year, and actors.

• Applied Apriori rule mining to the resultant data and analyzed the rules at varying confidence and support thresholds to find interesting, non-generic rules.

• Visualized the resultant rules in a Tableau dashboard, organized by their confidence.


