We are looking for a Machine Learning Engineer to join our team and help build systems that accelerate the development and deployment of machine learning models, especially large language models (LLMs). You will partner closely with machine learning researchers and internal users to understand requirements, and apply your own domain expertise to build high-performance, reusable APIs.
The ideal candidate has strong ML fundamentals and can apply them in real production settings. In particular, this role has a core focus on optimizing inference and fine-tuning for LLMs. The candidate should also be comfortable with infrastructure and large-scale system design, as well as with diagnosing both model performance issues and system failures.
Responsibilities
You will:
Architect and enable distributed compute, aligning workloads to small, mid-range, and high-end GPUs.
Leverage appropriate storage hardware and data formats to improve read/re-read efficiency.
Identify and remediate latency contributors, especially I/O bottlenecks, inefficient data shuffling, and under- or over-utilized compute.
Scale models through distributed training using data and model parallelism techniques. Parallelize inference processing to reduce prediction latency.
Provide subject matter expertise in graph and vector databases for a variety of use cases, including knowledge graphs and retrieval-augmented generation (RAG).
Implement LLM observability and monitoring solutions.
Required Education and Experience
Degree in Computer Science or Engineering
Prior experience with:
Docker, Kubernetes, and containerization.
Distributed systems.
Databricks ML
Machine Learning Engineering
Cloud (Azure preferred)
Expert-level Python and SQL
Preference will be given to candidates who, in addition to the required experience, have:
Experience or expertise with LLM fine-tuning, LLMOps, model evaluation, and prompt engineering.
Experience with (or knowledge of) MosaicML or the Ray framework.
Experience with LangChain or LlamaIndex
Experience with any vector database.
Job Specifications:
Authorities, Impact, Risk
Influence the bank's data, AI, and cloud journey, as well as its sustainability roadmap.
Impact
Revenue generation through new business for alternative data
Innovation
6 years of AI, Big Data and cloud expertise
3-4 years of alternative data experience
Risk
Mitigate reputational risk through AI-driven data quality, ensuring that the highest-quality data and services are offered to clients
Mandatory Skillsets:
2+ years of experience building machine learning training pipelines or inference services in a production setting.
Experience with LLM deployment, fine-tuning, training, prompt engineering, etc.
Experience with LLM inference latency optimization techniques, e.g. kernel fusion, quantization, and dynamic batching.
Experience with CUDA, model compilers, and other model-specific optimizations.
Preferred
Experience working with a cloud technology stack (e.g. Azure or AWS).
Experience building, deploying, and monitoring complex microservice architectures.
Experience with Python, Docker, Kubernetes, and infrastructure as code (e.g. Terraform).
Experience with LLMs and MLOps
Experience with distributed notebook environments like Databricks
Experience building AI-driven data quality frameworks and other data governance tools and capabilities
Experience building metadata-driven AI and statistical models for repeatable insight generation
Experience building front-to-back data pipelines comprising data ingestion, enrichment, data quality, analytics, and reporting
Experience with Agile development methodology
Experience with company KPIs and backtesting of alternative data factors against company KPIs.
Experience with NLP techniques and transfer learning using pretrained models like BERT
Experience using Hugging Face model artifacts