Leonid Ganeline, BSc, MSc
Vancouver, B.C., Canada
***.***.**@*****.***
Machine Learning, Data Science
Data exploration and experimentation, model training and fine-tuning. Experience in Natural Language Processing (NLP) and Anomaly Detection. Expertise in building ML teams. Proficient in Python, SQL, and Cloud ML.
LinkedIn, GitHub
EXPERIENCE
Senior Machine Learning Engineer, Stealth startup, Vancouver, 6/2023 – present
Created an LLM-based chatbot that implements a retrieval-augmented generation (RAG) chain. Integrated it with local document storage and online documents. Experimented with different chunking strategies, embeddings, vector stores, LLMs, and prompts to improve answer accuracy. Created chains based on local artifacts and chains based on SaaS models.
LLMs: Mixtral, Phi, ChatGPT. Tools: LangChain, Chroma, Gradio, Hugging Face pipelines.
Contributor to open-source LLM projects:
● LangChain: Building applications with LLMs (in top-10 contributors, 7/3000+). Personal recommendations from Harrison Chase, CEO at LangChain.ai
● facebookresearch/ImageBind: One embedding space
● Chroma: The vector store
Reviewer of the book "Generative AI with LangChain: Build large language model (LLM) apps".
Senior Machine Learning Engineer, Team Lead, Tigera.io, Vancouver, 10/2020 – 5/2023
Designed and developed the anomaly detection framework for the Calico products. Hired and led a team of three ML Engineers.
Implemented 30+ models with daily retraining, automated hyperparameter tuning, and an unsupervised evaluation regime. Designed the anomaly interpretability functionality. Performed data exploration. Deployed models to production in Kubernetes clusters:
● Classification NLP models based on CatBoost and tokenizers, with novel feature engineering
● Time-series models based on the Gluon-TS neural networks
● Isolation Forest and LOF clustering models
● Ensemble clustering models
Tools: Python, PyTorch, Gluon-TS, Sktime, scikit-learn, transformers, CatBoost, Pandas, NumPy, MLflow, Poetry, Pydantic, FastAPI, Elasticsearch, BigQuery, Docker, Kubernetes, and Linux.
Management: GitHub, Polyaxon, Jira.
Senior Machine Learning Engineer, Team Lead, SkyHive, Vancouver, 5/2018 – 10/2020
SkyHive was named by Forbes as one of the top 25 ML startups to watch. Hired and led the ML Engineering team of 6 people. Initiated data science and machine learning projects. Created and owned the entire Machine Learning technology stack, from data exploration to production.
Developed the "Skill Extraction" project, which searches for skills in job descriptions and resumes. Improved it over 2 years in 6 iterations. Reduced the processing time from 40 sec per document to 0.02 sec. Increased precision from 0.5 to 0.9. Trained word2vec and ELMo models for NER, classification, and text similarity.
Established workflows for data labelling, model evaluation, and regression testing. Organized labelling and evaluation of the training data sets with Amazon Mechanical Turk. Implemented REST services, deployed with Azure DevOps pipelines and Kubernetes in Azure, Google Cloud, and AWS.
Tools: Python, Keras, PyTorch, scikit-learn, pandas, gensim, spaCy, flair, fastText, MongoDB, MySQL, Git, Docker, Kubernetes, AWS Lambda, and Linux.
Management: Azure DevOps, Jira.
Machine Learning Developer, Altyn Consulting, Vancouver, 10/2016 – 5/2018
Trained models to predict ship itineraries in Vancouver Port waters. Preprocessed time series into Markov Chain samples. Developed CNN models for predicting rail cross-closures. Implemented a service to count tracks and cars from Vancouver Port using web cameras. Architected a project to analyze operation logs from the server cluster to detect anomalies and security breaches.
Tools: Python, Keras, TensorFlow, PyTorch, scikit-learn, XGBoost, LightGBM, CatBoost, and NLTK.
Integration Consultant on multiple projects, 2005 – 10/2016
Various roles in Software Development, Integration Architecture, and Systems Integration. For example, a project for Los Angeles Superior Court, a project for Servus Credit Union, and projects for Port Metro Vancouver. For more details, see my LinkedIn profile and my Microsoft MSDN profile.
Projects in industries: IT, Aerospace, Job Market, Travel, Communication, Manufacturing, Healthcare, Financial, Real Estate, Advertising, and Justice.
Microsoft Most Valuable Professional [MVP] awardee for 10 years (2007–2016), most recently in Microsoft Azure. Microsoft recognized me as an independent expert in integration technologies.
Development stack: Microsoft .NET, C#, BizTalk Server, EDI, SQL, XML, XSD, WSDL, SOAP, XSLT, and REST.
SKILLS
● Machine Learning frameworks: PyTorch, TensorFlow, Keras, and MXNet
● Machine Learning packages: transformers, spaCy, scikit-learn, XGBoost, gensim, flair, CatBoost, and LangChain
● Languages: Python, C#, and C
● Neural Networks: Convolutional, Recurrent, Autoencoders, LSTM, ELMo, fastText, Transformers, and LLMs
● Machine Learning areas: NLP, image recognition, anomaly detection, and prompt engineering
● Data Processing: SQL, BigQuery, scikit-learn, Pandas, NumPy, feature-engine, Faiss, and Spark
● Development Tools: Jupyter Lab, Azure ML Studio, and PyCharm
● Cloud ML: AWS SageMaker, GCP Vertex AI, Azure MLOps, Azure OpenAI, and Hugging Face
● DevOps, CI/CD: Docker, Kubernetes, Git, GitHub, MLflow, Azure DevOps, and Poetry
● Integration: BizTalk Server, FastAPI, Azure EventHub, Azure ServiceBus, MSMQ, RabbitMQ, SOAP, and REST
EDUCATION
Samara State Aerospace University, Russia,
Bachelor's and Master's degrees in Electronic Engineering (Signal Processing), diploma with honours
ACCOMPLISHMENTS
Microsoft Most Valuable Professional [MVP] Award 2016 in Microsoft Azure
Microsoft Most Valuable Professional [MVP] Awards 2013–2015 in Microsoft Integration
Microsoft Most Valuable Professional [MVP] Awards 2007–2012 in BizTalk Server
PUBLICATIONS
See publications in InfoQ and Microsoft TechNet.