
Data Analyst / Data Scientist / BI / Tableau / Python

Location:
Riyadh, Saudi Arabia
Posted:
March 10, 2024

Resume:

MUHAMMAD UMAR

+966********* Riyadh ad38pa@r.postjobfree.com LinkedIn Google Scholar GitHub

SENIOR DATA SCIENTIST/ REMOTE SENSING ENGINEER

Senior Python Developer, Data Science, Google Earth Engine, AWS + GCP Cloud Computing, ML/AI/DL

Accomplished Senior Data Scientist, Senior Python Backend Developer, and Remote Sensing Engineer with a proven track record in leveraging data science methodologies to extract actionable insights and drive strategic decision-making.

SUMMARY

Experienced data scientist with 5+ years in Python development.

Developed advanced applications and tools for Large Language Models using state-of-the-art technologies such as LangChain, LlamaIndex, Hugging Face, Haystack, and H2O.

Customized proprietary models (e.g., ChatGPT) and open-source models (e.g., Llama 2, Mistral) to enrich knowledge and create intelligent search applications.

Worked with many national and international organizations such as NASA, REDD+, and CABI, in collaboration with the IMF and the University of Manchester, respectively.

Published 6 research articles and proceeding papers in multiple national and international journals.

SIGNATURE SKILLS

• Python Programming

• Backend Development

• Django Rest Framework

• API Integration

• Database Management

• Agile Development

• Advanced Statistical Analysis

• Machine Learning Algorithms

• Data Mining

• Predictive Analytics

• Data processing/handling Technologies

• Data Visualization

• Project Documentation

• Experimental Design

• Cross-Functional Collaboration

CORE SKILLS

Languages: Python, R, SQL, Shell Scripting, Python Scripting

Databases: MongoDB, PostgreSQL, MySQL, SQLite, NoSQL

Frameworks: Django Rest Framework (DRF), Flask, FastAPI, Scrapy

Libraries & Skills: ML, Deep Learning (DL), Large Language Models (LLMs), LangChain, TensorFlow, PyTorch, GeoPandas, Scikit-learn, Seaborn, NumPy, Pandas, LIME, Data Modeling, NLP, Statistics, Pattern Recognition, Linux, Git, Tableau

ETL pipelines with Python, Google Earth Engine, GEE-Python, Web scraping (Scrapy, BeautifulSoup4, Selenium), Apache Airflow, Kafka, Dagster, Snowflake, Redis (cache handling + background tasks), Celery, Locust performance testing

DevOps: Amazon Web Services (AWS), Google Cloud Platform (GCP), Kubernetes, Docker, Terraform, Jenkins, Maven, Ansible, Chef, Bash Scripting, GitHub Actions, SNS, SQS, RDS, S3, Route 53, EC2, Glue, Athena, SageMaker

NOTABLE ACCOMPLISHMENTS

• Authored and published six research articles and proceeding papers across various national and international journals.

• Earned 32 certifications in data science, DevOps tools, and academic writing from esteemed e-learning platforms such as Coursera, DataCamp, and Data Science, showcasing a commitment to continuous learning and skill enhancement.

• Worked as a mentor at the International NASA Space App

o Delivered lectures on Data Science such as ‘How data can transform solutions on Earth and beyond?’

o Provided mentorship to groups of students preparing projects for NASA Hackathons

EDUCATION

Masters in GIS & RS (MS) – National University of Science and Technology (NUST), Islamabad – CGPA: 3.7 – 2017-2021

Bachelors in Environmental Science (BS) – Bahauddin Zakariya University (BZU), Multan – CGPA: 3.6 – 2013-2017

PROFESSIONAL CERTIFICATIONS

• Coursera E-learning platform

o Machine Learning specialization

o Deep Learning specialization

o Academic Writing specialization

o Data Science specialization

o Python Development

• Code with Mosh

o SQL

o Ultimate Django series

PUBLICATIONS

• Trends of Aerosol Optical Thickness Using VIIRS S-NPP During Fog Episodes in Pakistan and India

• Predicting suitability of wheat and maize cultivation under future climate change scenarios in Pakistan

• Satellite Remote Sensing Data Types and Utility for AOT Mapping

• Investigating the flood damages in Lower Indus Basin since 2000: Spatiotemporal analyses of the major flood events

• Climate Change and Potential Distribution of Potato (Solanum Tuberosum) Crop Cultivation in Pakistan Using Maxent

• Vulnerability detection in source code using AI/ML techniques (under review).

PROFESSIONAL NARRATIVE

Senior Data Scientist/RS Engineer Kumi Analytics Pvt. Ltd. Oct 2023 – Present

• Geospatial-Intelligence Advancements - Development with Google Earth Engine (GEE): Utilize Python programming to generate diverse maps. Construct an artificial intelligence system for classifying maps, focusing on key categories like deforestation, forestation, and biomass.

• Robust Backend Infrastructure - Fast-API Backend Development: Employ Fast-API to manage and process multiple maps efficiently.

• Cutting-Edge Technologies Implemented: Leverage a comprehensive technology stack including Python, Google Cloud Platform (GCP), Google Cloud Service, Geo-Pandas, Pandas, Scikit-learn, NumPy, SQL, Fast-API, S3, GCS, and Bash.

Senior Data Scientist/ML Engineer CyberShell Solutions Ltd. Jan 2021 – Dec 2023

• Automated Data Scraping Pipeline: Developed and maintained an automated data scraping pipeline, extracting valuable information from databases. Implemented a seamless process to save the acquired data back into databases.

• Data Annotation Pipeline: Established an automated pipeline dedicated to preprocessing, analyzing, and annotating the scraped data. Successfully stored the enriched data in MongoDB, ensuring efficiency and accuracy.

• ML/DL CI/CD Pipeline: Constructed a robust Continuous Integration/Continuous Deployment (CI/CD) pipeline for machine learning and deep learning models. Integrated the modeling pipeline with data scraping and annotation processes, optimizing the overall workflow.

• Django REST API Development and AWS Deployment: Developed Django REST API for seamless interaction with AI/ML models. Achieved successful deployment of AI/ML models on Amazon Web Services (AWS), showcasing expertise in cutting-edge technologies.

• Technologies Expertly Utilized: Proficiently utilized a comprehensive array of technologies including Python, Scrapy, Selenium, BeautifulSoup4, Pandas, Scikit-learn, NumPy, MongoDB, Lime, Django, DRF, Flask, FastAPI, Airflow, Redis, Docker, Kubernetes, AWS, MySQL, S3, EC2, Bash, SonarQube, Semgrep, Horusec, Bandit, Insider, NLTK, SpaCy, and Hugging Face Transformers.

Emp. Trainer for Data Science Boston Institute of Analytics June 2023 – Oct 2023

• Data Science and AI Training: Served as a Trainer for Data Science and Artificial Intelligence, imparting knowledge and fostering skill development.

ML Engineer/RA IGIS, NUST Islamabad Aug 2019 – Dec 2019

• Data Functions: Spearheaded data acquisition initiatives, sourcing satellite data via FTP, GPS data from field surveys, UAV drone data, and Spectro-radiometer inputs. Conducted thorough data analysis and preprocessing, laying the foundation for subsequent tasks.

• Executed supervised classification using Machine Learning (ML), excelling in cartography and mapping endeavors.

• Utilized Technologies: Demonstrated expertise in employing a diverse set of technologies, including Python, NumPy, Pandas, QGIS, Arcpy, bash, GPS, and Machine Learning.

• Masterful Problem Solver and AI Enthusiast: Expertise in machine learning algorithms, artificial intelligence, predictive analytics, data modeling, natural language processing (NLP), statistical modeling, Python programming, and Geographic Information Systems (GIS).

Senior GIS Analyst / ML Engineer WWF Islamabad May 2018 – Oct 2018

• Carbon Credit Identification for Pakistan: Utilization of Landsat Dataset: In the pursuit of achieving Carbon Credit Identification for the entire country of Pakistan, the project harnessed the power of the Landsat dataset.

• Integration and Validation: The integration of satellite data with Google Maps was seamlessly executed, followed by a meticulous validation process. Millions of samples were collected through digitization to ensure accuracy.

• Application of Supervised Classification Techniques: The project utilized a random forest algorithm, and samples were meticulously acquired through manual referencing. To extract forest area data, state-of-the-art supervised classification techniques of Machine Learning were employed.

• Technologies and Tools Utilized: The project leveraged a comprehensive set of technologies, including Python, Google Earth Pro, LULC classification, Data Cleaning, Data Sampling, Digitization, Modeling, and Machine Learning.

• Skills Demonstrated: The successful execution of this project showcased a diverse skill set, including proficiency in Problem Solving, Machine Learning, Algorithms, Artificial Intelligence (AI), Predictive Analytics, Data Modeling, Statistical Modeling, Python, and Geographic Information Systems (GIS).

Internship EPA Vehari June 2017 – Aug 2017

• Field Survey for Environmental Monitoring: Undertook extensive field surveys to monitor and assess environmental conditions.

• IEE Report Writing and Maintenance: Authored comprehensive reports following the Environmental Impact Assessment (EIA) guidelines and actively maintained them for accuracy and relevance.

• Advocacy for EPA at Public Hearings: Played a pivotal role in representing the Environmental Protection Agency (EPA) at public hearings, effectively communicating environmental concerns and proposed solutions.

• Data Entry and Digitization Proficiency: Demonstrated adeptness in data entry and digitization processes, ensuring the accurate recording and preservation of essential information.

PREVIOUS EXPERIENCE

Laboratory Research Assistant / Intern COMSATS Abbottabad June 2016 – Aug 2016

First Aid Intern Rescue 1122 Multan June 2015 – Aug 2015

PROJECTS HANDLED

AI Chatbot for Facebook

Technologies used: Python, Flask API, DL models, authentication, serializers, Object-oriented programming (OOP), URL routing, AWS EC2, AWS Route53, Nginx, Gunicorn

Developed an artificial intelligence model that scans user code and predicts vulnerabilities to cyber-attacks. Scraped data from different sources, created automated pipelines for labeling and preprocessing data, trained models with experimentation, tested AI models, and deployed them on an AWS server with an automated custom CI/CD pipeline.

URL: https://github.com/mumargis1/AI-Chatbot-for-Facebook

AI Threat Analyzer

Technologies used: Python, scikit-learn, pandas, NumPy, selenium, beautifulsoup4, MongoDB, lime, predictive model, machine learning, artificial intelligence, tensorflow, regression, neural network, pytorch, Object-oriented programming (OOP), URL routing, AWS EC2, AWS Route53, Nginx, Gunicorn.

Developed an artificial intelligence model that scans user code and predicts vulnerabilities to cyber-attacks. Scraped data from different sources, created automated pipelines for labeling and preprocessing data, trained models with experimentation, tested AI models, and deployed them on an AWS server with an automated custom CI/CD pipeline.

URL: Cybershell Solutions · GitHub

Backend REST API with Python and Django

Technologies used: Django rest framework, CICD pipeline, GitHub actions, docker, test driven development (TDD), unit testing, Object-oriented programming (OOP), URL routing.

User authentication; creating, filtering, and sorting objects; uploading and viewing images; test-driven development (TDD); project setup; GitHub Actions pipeline; Docker Compose configuration; code linting checks; unit testing; database configuration; editing the user model; automated API documentation for testing and sharing; deployment to AWS; filtering implementation.

URL: https://github.com/mumargis1/Ultimate_Django_backend_project

Django REST API with the Django Rest Framework

Technologies used: Python, Django Rest Framework, CI/CD pipeline, API endpoints, object-oriented programming (OOP), URL routing.

Python API client, function-based API views, mixins and generic API views, session authentication and permissions, user and group permissions with Django model permissions, custom permissions, token authentication, default Django Rest Framework settings, mixins for permissions, view sets and routers, URLs, reverse and serializers, model serializer create and update methods, custom validation with serializers, request user data and customized view query sets, related fields and foreign key serializers, pagination, Django-based search for the model API, search engine on Algolia, unified design of serializers and indices, JSON Web Token authentication with simplejwt, login via JavaScript client, Algolia InstantSearch.

URL: https://github.com/mumargis1/cfe_DRF_Project

E-commerce backend with Django Rest Framework

Technologies used: Nested URL APIs, Serialization, Data Models, Mixins, ViewSets, cache handling with Redis, Celery for background processes, Locust for performance testing, AWS, S3, cart handling, Django Rest Framework, CI/CD pipeline, GitHub Actions, Docker, test-driven development (TDD), unit testing, object-oriented programming (OOP), URL routing.

URL: https://github.com/mumargis1/recipe-app-api

Social Networking app using Python Django framework

Technologies used: Django Rest Framework, CI/CD pipeline, GitHub Actions, Docker, test-driven development (TDD), unit testing, object-oriented programming (OOP), URL routing.

URL: https://github.com/mumargis1/django_project

Fast API Projects

Technologies used:

Python, FastAPI, Models, ENUM, Endpoints, SQL, Database Handling, BaseModel, Data Request Authentication, CRUD Methods, GET, UPDATE, POST, PUT, DELETE Endpoints

URL: https://github.com/mumargis1/fastapi_project2

Technologies used:

Python, Fast-API, Models, ENUM, Endpoints, MySQL, Database Handling, BaseModel, Enum Class, HTTP-Exception, Path, Query, GET, UPDATE, POST, PUT, DELETE Endpoints

URL: https://github.com/mumargis1/FastAPI_Project

Mann Kendall Trend analysis on raster (satellite dataset)

Technologies used: R, Linux, GRASS GIS, t-grass, GDAL, Bash processing, RStudio

URL: https://github.com/mumargis1/Raster_Trend_Analysis

Urban heat island formation and its time series analysis in the mega city

Technologies used: R, Python, GRASS GIS, ArcMap, OsGeo (Linux), Bash scripting

URL: https://github.com/mumargis1/BASH_NDVI_calculation

Analysis on Dengue Cases in Lahore and its implementation on Web using Leaflet API

Technologies used: Python, Leaflet, Geo-Server, Postgres (and PostGIS), Bash scripting

The purpose of this project was to visualize raster and vector data on the web using the JavaScript library Leaflet. The strategies used were: creating a heat map from point data of Dengue cases, creating a Postgres (and PostGIS) database, and maintaining Geo-Server.

URL: https://github.com/mumargis1/BASH_NDVI_calculation

GEE (Google Earth Engine) Projects

Technologies used:

Python, GEE-Map, JavaScript

Projects completed in Google Earth Engine:

• Calculating Trend for WaterShed

• Classification of Riyadh City using Sentinel

• Classification of Riyadh City using Landsat

• Downloading Seasonal (Rabi and Kharif) Mean for Punjab

• NDVI_annual_mean for Punjab

• Water Mask for Chenab river

• Water Occurrence change intensity for Chenab river

• WindSpeed WindDirection for Karachi City

URL: https://github.com/mumargis1/GEE_source_code


