Description
We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.
As a Lead Software Engineer at JPMorganChase within the Consumer & Community Banking Digital Cloud team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
Job responsibilities
Executes creative software solutions, design, development, and technical troubleshooting with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
Develops secure high-quality production code, and reviews and debugs code written by others
Identifies opportunities to eliminate or automate remediation of recurring issues to improve overall operational stability of software applications and systems
Leads evaluation sessions with external vendors, startups, and internal teams to drive outcomes-oriented probing of architectural designs, technical credentials, and applicability for use within existing systems and information architecture
Designs and develops a scalable ML platform to support model training, deployment, and monitoring
Builds and maintains infrastructure for automated ML pipelines, ensuring reliability and reproducibility across different model frameworks and architectures
Sets up monitoring and reliability tooling for both infrastructure and models using Prometheus and Grafana
Codes infrastructure with Terraform and uses Python for automation
Performs DevOps work with Kubernetes (K8s), Docker, Helm, GitOps, and CI/CD pipelines (Jenkins, GitLab CI)
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 5+ years applied experience
8+ years of hands-on software/platform engineering experience, including leading cloud-native delivery for business-critical systems.
Expert-level Infrastructure as Code skills with Terraform (modules, state backends, workspaces, CI integration, policy controls).
Expert proficiency in Python for platform automation, tooling, and systems scripting; familiarity with Bash/YAML/Helm.
Deep experience with CI/CD (e.g., Jenkins, Spinnaker/Argo), artifact management, and automated testing strategies.
Strong AWS/public cloud knowledge (VPC, ALB/NLB, ECR/EKS, IAM, KMS, CloudWatch/CloudTrail) and cloud networking fundamentals.
Experience with MLOps tools and platforms (e.g., MLflow, Amazon SageMaker, Google Vertex AI, Databricks, BentoML, KServe, Kubeflow)
Understanding of data versioning and ML model lifecycle management
Practical experience applying agentic AI/LLM capabilities to DevSecOps use cases (e.g., assisted troubleshooting, code/IaC generation with review, runbook automation) with attention to accuracy, guardrails, and auditability.
Containerization & DevOps: Expert skills in Kubernetes (K8s), Docker, Helm, GitOps, and CI/CD pipelines (Jenkins, GitLab CI).
Monitoring & Reliability: Experience setting up monitoring for both infrastructure and models (drift detection, model accuracy) using Prometheus/Grafana.
Preferred qualifications, capabilities, and skills
Experience deploying models using Canary, Blue/Green, or Shadow deployment strategies
Previous experience deploying & managing ML models is beneficial
Experience working in a highly regulated environment or industry
Strong knowledge of AWS, Azure, or GCP, including serverless architectures, storage solutions, and network configuration.
Postgres experience