Python Data

Location:
West New York, NJ
Posted:
November 27, 2020


Resume:

Sami Ali

* ****** ***, ****** ****, NJ ***** adh6js@r.postjobfree.com

+1-201-***-**** GitHub: https://github.com/samialisayed

Education

City College of the City University of New York

Data Science Master's, Grove School of Engineering (GPA: 3.67): June 2021

University of Science and Technology at Zewail City

Bachelor of Science in Biomedical Sciences (Computational Biology Concentration): June 2017

Relevant Courses

Introduction to Data Science, Applied Statistics, Applied Machine Learning, Big Data and Scalable Computation, Database System, Visual Analytics, Computer Vision, Computer Science I & II

Skills

Programming languages: Python, R, MATLAB, SQL

Tools: Anaconda, Hadoop, Spark, MySQL, SQLite, Visual Studio Code, Plotly, Tableau, SPSS, PyMOL, Spyder, Microsoft Office (Word, Excel, PowerPoint)

Interpersonal: Teamwork, Leadership, Hardworking, Adaptable, Organized, and Responsible

Employment

Teaching Assistant, University of Science and Technology at Zewail City, August 2017-August 2018

Assisted and mentored students in groups and on an individual basis.

Provided educational materials, including academic papers and online courses.

Helped professors create quizzes and activities, develop course curricula, and grade student work.

Customer Service Representative, Vodafone Group Plc., May-September (Seasonal, 2015 and 2016)

Handled numerous calls daily, including signing up new customers, retrieving customer data, presenting relevant product information, and cancelling services.

Remained courteous and calm, even during moments of customer dissatisfaction.

Projects

Gathered statistics on the number of parking violations (tickets) per street segment in NYC over five years. For each street segment, computed the total number of parking violations for each year from 2015 to 2019 and the rate at which that total changed over the years, estimated with Ordinary Least Squares. The input data were Parking Violations Issued - Fiscal Year 2015-2019 (around 10 GB in total) and NYC Street Centerline (around 650k segments). The violation data had no geospatial location, only a street address; the centerline data provided the geometry and house number range for each segment. Wrote a Spark program in Python to join the datasets, find matches, and compute the needed metrics. Ran it on a cluster with exactly 30 cores (6 executors). (2020)
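As a minimal sketch of the per-segment trend described above (not the original Spark code), the OLS slope of yearly totals against year could be computed as follows. The function name `ols_slope` and the yearly counts are hypothetical, chosen only for illustration.

```python
def ols_slope(years, counts):
    """Slope of the least-squares line fitted to (year, count) pairs."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    return sxy / sxx


# Hypothetical yearly violation totals for one street segment.
years = [2015, 2016, 2017, 2018, 2019]
counts = [10, 12, 14, 16, 18]

slope = ols_slope(years, counts)
print(slope)  # change in violations per year for this segment
```

In a Spark job, a function of this kind would typically be applied once per street segment after the yearly totals have been aggregated.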

Predicted house sale prices in Ames, Iowa. The dataset, from Kaggle, contains almost 3,000 observations and 80 features. Handled missing values, converted categorical features (nominal to dummy variables, ordinal to ranked numbers), detected outliers with the Isolation Forest algorithm, and assessed feature importance for the data as a whole and by neighbourhood using several feature-ranking algorithms. Built two classification models: a support vector machine and a random forest classifier. Explored several regression models, including linear regression, lasso regression, random forest regression, and SVM regression. All models achieved similar R-squared values on the validation data, between 0.86 and 0.89. Used Python. (2020)
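The two categorical encodings mentioned above can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's code: the helper names `one_hot` and `ordinal` and the sample values are hypothetical.

```python
def one_hot(values):
    """Encode a nominal feature as one 0/1 dummy column per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def ordinal(values, order):
    """Encode an ordinal feature as its rank in a given low-to-high order."""
    rank = {level: i for i, level in enumerate(order)}
    return [rank[v] for v in values]


# Hypothetical sample values, not taken from the dataset.
exterior = ["brick", "wood", "brick"]
condition = ["fair", "excellent", "typical"]

dummies = one_hot(exterior)
ranks = ordinal(condition, ["poor", "fair", "typical", "good", "excellent"])
print(dummies)
print(ranks)
```

Dummy variables avoid imposing an order on nominal categories, while ranked numbers preserve the order that ordinal categories already have.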

Visualized temperature data collected from the Geostationary Operational Environmental Satellites GOES-16 and GOES-17. Created three interactive plots to compare the two satellites' measurements over 10 consecutive days. The first two were a histogram and a line chart, each with the ability to select a view of 10 minutes, 1 hour, 1 day, or all the data. The third was a scatterplot in which each point was a 10-minute measurement, with an attached histogram image of the mean that appears on hover. Used Python. (2019)

Error analysis of in situ sea surface temperature data. Used Python with Pandas, Matplotlib, Cartopy, and SciPy to load and clean datasets for a ship and a drifting buoy. Plotted their tracks on a map, identified outliers, and looked for possible influence of factors such as season, time of day, or platform history. Ran statistical tests to study the satellite and in situ measurements. (2019)

Identified the correlation between amino acids' attributes and their surface accessibility inside proteins using a machine learning approach. Used Python, PyMOL, and PDB files to extract features such as solvent-accessible surface from a database of 13,100 JSON files. Used Pandas, SciPy, and scikit-learn to find features correlated with the surface area, convert categorical variables, and predict the area using multiple linear regression. Used Matplotlib to explore and visualize relationships. (2016-2017)

Extracurricular Activities

Organizing committee member for the Zewail City Conference & Exhibition on Biomedical Sciences (ZCEBS): listed the materials needed and planned the timing and sequence of sessions.

Activities committee member in the student union: organized soccer, tennis, and chess tournaments, arranged trips, and developed rooms for arts and religions.

Volunteer in the "Knowledge is Power" project of the Life Makers organization. We visited villages with many illiterate residents to speak with them about the importance of reading and writing and to encourage them to register in literacy classes that we also organized.

Scientific committee member in Ihsan family: taught Python and programming concepts to new undergraduate students and organized extra sessions for TAs to help students.

Member, Association for Computing Machinery (ACM).


