Data Science Machine Learning

Location:
Houston, TX
Posted:
November 26, 2024

Resume:

TEFERA ESHETE

*******@*****.*** / 346-***-****

**** ********* ***, ****, ** 77494

Data Science professional with broad experience in Data Wrangling, Modeling, Machine Learning, Machine Learning Operations (MLOps), Large Language Models (LLMs), RESTful API Development, Artificial Intelligence, Data Analytics, Data Visualization, Interpretation, Communication, Recommendation, Cognite Data Fusion, ETL, AWS, Software Testing and Development, Problem-solving and Troubleshooting, and Teamwork. Highly experienced with Python, SQL, PowerBI, Dataiku, Amazon SageMaker, IBM Watson, and Excel.

EXPERIENCE

• SCHLUMBERGER – Houston, Texas

Data Scientist, Machine Learning Feb 2023 – May 2024

Data preparation, management, quality control (QC) using data visualization techniques, and archiving. Tools: Omega, Excel, Python, SQL, PowerBI, Linux

DATAIKU MLOPS PROJECT (worked on credit card transactions, facies classification and detection, and high revenue analysis)

Designed a robust set of metrics, checks, data quality rules, scenarios, triggers, and reporters on the Design node.

Optimized pipelines for readability, maintainability, and computational efficiency.

Applied several refactoring and optimization techniques.

Created, deployed, and versioned project bundles via the Project Deployer

Built scenarios to monitor projects running on an Automation node

Created a project bundle and deployed it to an Automation node via the Deployer.

Published and redeployed new bundle versions.

Created a scenario that can automatically update a batch deployment.

Deployed API services via the API Deployer

Managed multiple versions of an API service

Monitored the responses of API endpoints by using Dataiku’s Event Server

Organized Dataiku items and began the governance process

Executed item workflows and managed governed projects

Deployed model versions and bundles while tracking model metrics and drift

DATAIKU MACHINE LEARNING:

Conducted High Revenue Analysis:

Conducted Exploratory Data Analysis

Tested Random Forest & Logistic regression models

Deployed the Random Forest model based on its ROC-AUC score

Worked on sales Time Series analysis:

Worked on univariate, multivariate, and multi-dataset time series

Prepared, modeled, and analyzed the data (descriptive, explanatory, forecasting, and control).

Conducted Natural Language Processing: prepared the data, built models (Random Forest and Logistic Regression), chose Logistic Regression based on its performance, and iteratively improved its performance through text cleaning and feature engineering; deployed and evaluated the model.

Created and deployed a partitioned model: partitioned a transportation dataset, trained prediction models (Random Forest and Ridge Regression) on each partition, then chose and deployed Random Forest.

Classified and predicted facies based on their physical characteristics

Prepared input data, visualized data, trained and evaluated ML model, generated features, retrained ML model, performed scoring, created dashboard and Dataiku application.

COGNITE DATA FUSION PROJECT:

Worked on Cognite Data Fusion architecture and implementations.

Worked on Industrial data, industrial data platforms, APIs and integrations, visualizations, and dashboards.

• Cresent Geo – Houston, Texas

Data Pre-Processing and Modelling consultant Jan 2021 – Jan 2023

DATAIKU MLOPS PROJECTS:

Worked on Credit Card Fraud Detection, Natural Language Processing, and a loan application dataset

Conducted Geospatial Data Analysis & Remote Sensing

Worked on data covering cities and population sizes, distances between cities, merchants and clients, area, and average household incomes, and created various maps. Also compared city-level information with polygons of all countries, global rivers, global historical temperature, active fire maps, and land cover data.

• Presented results in vector data and raster data.

• Tools used:

• Python (Fiona, GeoPandas, Geoplot, PySAL, Folium, Shapely, Matplotlib in IBM Db2 for i and IBM Watson), Xarray, Rasterio, and SQL.

• Dataiku Geospatial Analytics

• Amazon SageMaker

• Worked on data analysis of satellite images for weather information, climate change, crop data, drought conditions, and water and moisture detection.

E.g. Weather station dataset

• Clustered the stations using DBSCAN based on location and mean, max, and min temperature.

o Found the groups of stations that show similar weather conditions.

• Extracted and analyzed geospatial data from various sources: Natural Earth, The World Bank, US Census Bureau TIGER, NOAA Weather Data, ESRI ArcGIS Living Atlas, OpenStreetMap (OSM), CRU, and NASA

• Conducted Network Intrusion Detection

• Loaded the data

• Conducted exploratory analysis

• Scaled numerical attributes

• Encoded categorical attributes

• Conducted feature selection

• Partitioned the dataset

• Fitted models

• Evaluated and validated models:

• Naive Bayes Classifier

• Decision Tree Classifier

• KNeighborsClassifier

• LogisticRegression

• COGNITE DATA FUSION PROJECT:

Connected to various data sources

Explored, analyzed, and identified trends in data

Modeled, integrated, and transformed data.

• Loan application dataset.

• Cleaned the dataset.

• Built models using K-Nearest Neighbors, Decision Tree, Support Vector Machine, and Logistic Regression.

• Computed the accuracy of each algorithm.

• Predicted whether a loan would be paid off or not.

• Chose the best-performing classifier.

• Customer dataset

• Applied customer segmentation with K-means: preprocessed the data and created a model that partitions customers into clusters.

• Identified specific groups of customers in order to allocate marketing resources effectively.

• Credit Card Fraud Detection

• Imported libraries and the dataset.

• Processed the data.

• Split the data into training and test sets for model building.

• Model building: tried K-Nearest Neighbors, Decision Tree, Logistic Regression, Support Vector Machines, XGBoost, and Random Forest. For each, defined the model, checked the accuracy, and checked the confusion matrix.

• Chose XGBoost based on a comparison of the accuracy scores.

• Neural Networks and Deep Learning Projects (DeepLearning.AI):

• Building basic functions with NumPy

sigmoid functions, sigmoid gradient, reshaping arrays, normalizing rows, vectorization

• Building logistic regression classifier to recognize images

initializing parameters, forward propagation, calculating the cost function, backward propagation, updating parameters using gradient descent, merging the functions into a model, plotting the cost function and gradients, examining the learning rate, choosing the learning rate that best minimizes the cost function, predicting

• Planar Data Classification with one hidden layer:

• Loaded and visualized the dataset.

• Initialized the model parameters.

• Applied forward propagation.

• Computed the cost.

• Applied backward propagation.

• Updated parameters using gradient descent.

• Built the neural network model.

• Predicted with the model.

• Building of a 2-layer neural network and a deeper neural network with many layers:

• Initialized parameters of a 2-layer neural network.

• Initialized parameters for an L-layer neural network.

• Applied forward propagation.

• Computed the cost.

• Applied backward propagation.

• Updated parameters.

• Trained the model.

• Used the trained parameters to predict labels and check the accuracy.

• Applied the model to images to distinguish between different images.
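The logistic regression steps listed above (initializing parameters, forward propagation, cost, backward propagation, gradient descent update, prediction) can be illustrated with a minimal NumPy sketch. It is not the original course code: the function names, the toy data, the learning rate, and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.1, num_iterations=1000):
    """Fit weights by batch gradient descent.

    X has shape (n_features, m) and y has shape (1, m); this column-per-example
    layout is an assumption of the sketch.
    """
    n_features, m = X.shape
    w = np.zeros((n_features, 1))  # initialize parameters
    b = 0.0
    costs = []
    eps = 1e-12  # guards against log(0)
    for _ in range(num_iterations):
        a = sigmoid(w.T @ X + b)  # forward propagation
        cost = -np.mean(y * np.log(a + eps) + (1 - y) * np.log(1 - a + eps))
        dz = a - y  # backward propagation
        dw = (X @ dz.T) / m
        db = float(np.sum(dz)) / m
        w -= learning_rate * dw  # gradient descent update
        b -= learning_rate * db
        costs.append(float(cost))
    return w, b, costs


def predict(w, b, X):
    """Label an example 1 when the predicted probability exceeds 0.5."""
    return (sigmoid(w.T @ X + b) > 0.5).astype(int)


# Hypothetical toy data: one feature, four examples.
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
y = np.array([[0, 0, 1, 1]])
w, b, costs = train_logistic_regression(X, y)
predictions = predict(w, b, X)
```

Recording the cost at every iteration makes it possible to plot the learning curve and compare learning rates, as described in the project steps.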

• TGS – Houston, Texas

Data Scientist Oct 2008 – Nov 2020

Data preprocessing, modeling, presentation using data visualization techniques, interpretation, and recommendation.

Team leader/supervisor, mentor, and project manager (project bidding and proposal writing, controlling cost, quality and time by properly using available resources)

Wrote many reports.

SKILLS/CERTIFICATES

• MLOps Practitioner Certificate (Dataiku, 2024)

• MLOps with AWS Certificate (Udemy, 2024)

• Developer Certificate (Dataiku, 2024)

• DELFI Data Science Practitioner Certificate (SLB, 2024)

• Microsoft Power BI for Analysts (PLURALSIGHT, 2024)

• ML Practitioner Certificate (Dataiku, 2023)

• Dataiku & SLB - Core Designer Certificate (Facies Classification, 2023)

• Advanced Designer Certificate (Dataiku, 2023)

• Core Designer Certificate (Dataiku, 2023)

• Cognite Data Fusion Fundamentals Certificate (Cognite Academy, 2023)

• Industrial Data Fundamentals Certificate (Cognite, Industrial Digital Academy, 2023)

• DELFI Cloud Security and infrastructure (SLB, 2023)

• Neural Networks and Deep Learning Certificate (DeepLearning.AI, Oct 2021)

• Certificate in Software Testing and Automation Specialization (University of Minnesota, 2022)

• AWS Fundamentals Specialization Certificate (AWS, 2022)

• COVID19 Data Analysis Using Python Certificate (Coursera, Feb 10, 2022)

• COVID19 Data Visualization Using Python Certificate (Coursera, Feb 10, 2022)

• Machine Learning Certificate (Stanford University, Feb 2022)

• Applied Data Science with Python Specialization (University of Michigan, Apr 2022)

• IBM Data Science Professional Certificate (IBM, Sept 2021)

• Tableau; Enterprise Resource Planning and Management

• Proficient in data processing software and programming languages: Python, SQL, PowerBI, Tableau, Git, AWS SageMaker, UNIX/LINUX, C, Matlab, Octave, Excel, PowerPoint, Omega

• Cognite Data Fusion, ETL, Big data, Project management and team leading, software testing and development.

• Solid background in statistics, maths, and computer science.

• Excellent communication skills, Teamwork, and Client service.

EDUCATION

• University of Texas at El Paso - El Paso, TX PhD, Geophysics 12/2001


