
Machine Learning Data Pipeline

Location:
Fort Worth, TX
Posted:
October 04, 2023

Resume:

*** * ********* ****, #*** Southlake, Texas ***** 954-***-**** *****@*********.***

Lekan Omotoye

Summary

Lekan is a backend data engineering consultant with 7 years of experience building, developing, and testing large-scale distributed processing pipelines and machine learning platforms. He has worked on large-scale software applications for both large companies and start-ups, gaining valuable experience in designing, developing, deploying, and integrating systems in large-scale production environments.

Technologies

Programming Languages

Python, Go, Java

Frameworks + Platforms

NumPy, SciPy, MATLAB

Container + Orchestration

Docker, Kubernetes, Terraform

Data Analytics

Spark: {Streaming Data, Clustering, Classification, Recommendation}

Natural Language Processing: {NER, Sentiment Analysis}

NoSQL + Search Technologies

Elasticsearch

Deep-learning Frameworks

TensorFlow, Keras, PyTorch

MLOps Frameworks

Seldon Core, Kubeflow Pipelines

Databases

Postgres, MySQL, MongoDB

Cloud Platforms

Google Cloud

Streaming Analytics

Kafka, RabbitMQ, Akka Streams, Spark Streaming

Tools

Bazel, Gradle, Maven, Git, GitHub Actions, CircleCI, GitLab

Operating Systems

macOS, Linux, Windows

Experience

MLOps Platforms Consultant at MavenCode

Data Pipeline Engineering

●Constructed content-extraction endpoints with FastAPI to parse content inside input files of varying file types using Python and Apache Tika (see the sketch after this list)

●Worked with the client to build a usable training set for constructing a classifier.

●Built a classification system to route input files to their corresponding endpoints using Amazon SQS queues, S3 bucket notifications, and Elastic Container Registry

●Used Terraform and Kubernetes to containerize the application and deploy it to an EKS cluster

●Created a customizable, repeatable, and loggable process using Argo Workflows

●Analyzed real-time data and application logs to identify bottlenecks and application issues that occur in the test environment.

●Worked extensively on Kubeflow packaging and deployment, setting up the Kubeflow manifests with all the components needed by the Data Science team.
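
A minimal sketch of the content-extraction endpoint described in the first bullet above, assuming the tika-python client with a reachable Tika server; the route name and response fields are illustrative, not the production API.

# Sketch of a FastAPI content-extraction endpoint backed by Apache Tika.
# Tika detects the input file type itself, which is what lets a single
# endpoint handle inputs of varying formats.
from fastapi import FastAPI, UploadFile
from tika import parser

app = FastAPI()

@app.post("/extract")
async def extract(file: UploadFile):
    # Read the upload and hand the raw bytes to Tika, which returns
    # parsed text plus metadata (including the detected content type).
    data = await file.read()
    parsed = parser.from_buffer(data)
    return {
        "filename": file.filename,
        "content_type": parsed.get("metadata", {}).get("Content-Type"),
        "text": parsed.get("content"),
    }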

MLOps Platforms Consultant on Google Cloud Platform

Data Pipeline Engineering

●Automated the deployment of the GKE baseline for the Kubeflow cluster with Terraform so that new environments can be bootstrapped easily by the DevSecOps team.

●Worked with Data Scientists to devise a strategy for getting models to production as quickly as possible, improving the overall efficiency of the team.

●Created training materials to get the team up to speed with Kubeflow best practices and documentation on how to create Kubeflow pipelines for continuous deployment.

●Worked with the team to get models running on Kubeflow continuously on a scheduled interval (a sketch of the scheduling follows this list).

●Implemented two-factor authentication and authorization for the Kubeflow cluster with Okta as the OIDC provider, so that everyone must authenticate before accessing their profiles and resources in the Kubeflow environment.

●Utilized Apache Tez for batch processing.

●Utilized GCP storage buckets for ETL pipelines.
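
A sketch of how the scheduled Kubeflow runs above can be set up, assuming a kfp v1-style client; the host, experiment name, cron schedule, and pipeline package path are placeholders.

# Sketch: schedule a compiled Kubeflow pipeline to re-run on a cron interval.
import kfp

client = kfp.Client(host="https://kubeflow.example.com/pipeline")

# Experiments group related runs; create one (or reuse it) by name.
experiment = client.create_experiment(name="nightly-training")

# A recurring run re-executes the pipeline on the given schedule
# (Kubeflow cron expressions include a leading seconds field).
client.create_recurring_run(
    experiment_id=experiment.id,
    job_name="nightly-model-training",
    cron_expression="0 0 2 * * *",  # 02:00 UTC daily
    pipeline_package_path="training_pipeline.yaml",
    max_concurrency=1,
)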

Google PSF Project / TylerTech

MLOps Platforms Consultant at MavenCode

Data Pipeline Engineering

●Used Terraform to build out the GCP data pipeline for processing invoices with Google Document AI.

●Used Python, GCP Cloud Functions, storage buckets, and Pub/Sub to build out the pipeline.

●Responsible for developing test cases using functional requirements.

●Implemented a scheduler service to kick off the Google Cloud Composer pipeline that runs the workflow.

●Implemented Cloud Functions event processing on Google Cloud Storage buckets (sketched below).
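
A sketch of the Cloud Storage event processing described above, assuming a first-generation Python Cloud Function and the google-cloud-documentai client; the processor path is a placeholder.

# Sketch: GCS-triggered Cloud Function that sends an uploaded invoice
# to a Document AI processor.
from google.cloud import documentai_v1 as documentai
from google.cloud import storage

PROCESSOR = "projects/my-project/locations/us/processors/my-invoice-processor"

def process_invoice(event, context):
    """Triggered by an object-finalize event on a Cloud Storage bucket."""
    # Pull the uploaded PDF out of the bucket named in the event.
    blob = storage.Client().bucket(event["bucket"]).blob(event["name"])
    content = blob.download_as_bytes()

    # Send the raw bytes to the invoice processor.
    client = documentai.DocumentProcessorServiceClient()
    result = client.process_document(
        request=documentai.ProcessRequest(
            name=PROCESSOR,
            raw_document=documentai.RawDocument(
                content=content, mime_type="application/pdf"
            ),
        )
    )

    # Entities carry the extracted fields (supplier, total, due date, ...).
    for entity in result.document.entities:
        print(entity.type_, entity.mention_text)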

Google PSF Project / Scheels

MLOps Platforms Consultant at MavenCode

Data Pipeline Engineering

●Used Terraform to build out the entire data pipeline on Google Cloud infrastructure

●Implemented Python code to automatically extract invoice emails received in a G Suite inbox account.

●Used the Google Document AI invoice parser to extract data from incoming emails.

●Created hourly batch jobs to enrich the inventory catalog with newly received invoices.

●Used Python, GCP Cloud Functions, storage buckets, BigQuery, and Pub/Sub to build out the pipeline (sketched below).
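
A sketch of the Pub/Sub-to-BigQuery leg of such a pipeline, assuming a background Cloud Function; the message schema and table name are hypothetical.

# Sketch: Pub/Sub-triggered function that appends extracted invoice
# line items to a BigQuery catalog table.
import base64
import json
from google.cloud import bigquery

TABLE = "my-project.inventory.invoice_lines"

def enrich_catalog(event, context):
    """Triggered by a message on the invoice-extraction topic."""
    # Pub/Sub delivers the payload base64-encoded in event["data"].
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))

    rows = [
        {
            "invoice_id": payload["invoice_id"],
            "sku": item["sku"],
            "quantity": item["quantity"],
            "unit_price": item["unit_price"],
        }
        for item in payload["line_items"]
    ]

    # Streaming insert; any per-row errors come back in the result.
    errors = bigquery.Client().insert_rows_json(TABLE, rows)
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")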

Google Cloud - Partner Service Consulting

MLOps Platforms Consultant at MavenCode

Data Pipeline Engineering

●Worked with Data Scientists and ML Engineers to create a robust, elastic, and scalable platform for running and deploying machine learning experiments modeling various use cases.

●Created CI/CD pipelines with GitHub Actions to automate the deployment and destruction of Kubeflow in GCP.

●Implemented and operationalized ML workflow pipelines on Kubernetes with Kubeflow (a minimal pipeline sketch follows).
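
A minimal sketch of the kind of Kubeflow pipeline such a CI/CD flow deploys, written with the kfp v2 SDK; the component logic, image, and names are placeholders.

# Sketch: two lightweight components chained into a Kubeflow pipeline,
# then compiled to the YAML spec that gets deployed.
from kfp import dsl, compiler

@dsl.component(base_image="python:3.10")
def preprocess(raw_path: str) -> str:
    # Placeholder: clean the raw data, return the processed location.
    return raw_path + "/processed"

@dsl.component(base_image="python:3.10")
def train(data_path: str):
    # Placeholder: fit a model on the processed data.
    print(f"training on {data_path}")

@dsl.pipeline(name="training-pipeline")
def training_pipeline(raw_path: str):
    processed = preprocess(raw_path=raw_path)
    train(data_path=processed.output)

# Compile to the YAML package referenced when scheduling or deploying runs.
compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")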

Education

Bachelor of Science, Lagos State University

Nigeria

Certifications

●Google Professional Machine Learning Engineer

●Google Professional Cloud Architect

●Google Professional Cloud Database Engineer

●Google Professional Cloud DevOps Engineer

Conferences

●Co-organizer of an ML meetup with the MavenCode team in the DFW area


