Senior Platform Engineer • Cloud & Databricks Architect

Location: Chennai, Tamil Nadu, India
Posted: April 29, 2026

Resume:

KUMARESAN PALANISAMY

Platform Engineer • Cloud Infrastructure & DevOps Architect

*************@*****.*** • +1-754-***-**** • 20+ Years Experience

PROFESSIONAL SUMMARY

Versatile Senior Platform Engineer with 20+ years of experience spanning Azure Databricks platform administration, cloud infrastructure engineering, software development, support, and enterprise DevOps. As adept at building and governing enterprise data platforms as at designing and automating large-scale cloud infrastructure across Azure and AWS. Proven ability to deliver secure, scalable, and cost-optimized platforms for AI/ML, data engineering, and enterprise operations teams across complex multi-account environments.

CORE COMPETENCIES

DATABRICKS & DATA PLATFORM

• Databricks Workspace & Cluster Admin

• Unity Catalog: RBAC/ABAC, Lineage, Masking

• Delta Lake, Medallion Architecture (Bronze/Silver/Gold)

• Databricks Asset Bundles (DABs), Repos & CI/CD

• MLflow, Azure ML, Model Monitoring & Retraining

• PySpark, Spark SQL, ADF, Structured Streaming

• Delta Sharing, Service Principals, SQL Warehouses

• Databricks Observability & Guardrails

• Privacera, Collibra: Governance & Compliance

• ADF Orchestration, REST API Integration

• AI/LLM: Azure OpenAI, RAG, Agentic Workflows

CLOUD INFRASTRUCTURE & DEVOPS

• Azure Compute: VMs, VM Scale Sets, AKS, App Services, Functions

• Azure Networking: VNet, Private Endpoints, NSG, Load Balancer, ExpressRoute

• Azure Storage & Data: ADLS Gen2, Blob Storage, Azure SQL, Cosmos DB

• Azure Identity & Security: AAD, Managed Identities, Key Vault, Azure Policies, RBAC

• Azure Monitoring: Azure Monitor, Log Analytics, App Insights, Alerts

• Azure DevOps, GitLab CI/CD, GitHub Actions, Jenkins

• AWS: Organizations, IAM, Lambda, CodeBuild, CloudFormation, EKS, EC2, SSM

• IaC: Terraform, CloudFormation, Ansible, Puppet, Chef, BMC BladeLogic

• Kubernetes & AKS: Cluster Provisioning, Node Pools, Auto-scaling, Namespace Isolation, Deployments, Services, Ingress, Helm, IRSA

• Observability: Prometheus, Grafana, Dynatrace, Splunk, Datadog, CloudWatch, Nagios, Zabbix, TrueSight

• MLOps & AI Governance: MLflow, Model Lifecycle, AI Observability, Enterprise AI Guardrails

CERTIFICATIONS

Microsoft Azure AZ-900 (2025) • Microsoft Azure AI-900 (2025) • Microsoft Azure DP-900 (2025) • AWS Solutions Architect Associate (Dec 2025)

PROFESSIONAL EXPERIENCE

CGI Senior Consultant — Platform Engineer (Azure Databricks & Cloud Infrastructure)

Aug 2019 – Jan 2026

Tech Stack: Azure Databricks, Unity Catalog, ADF, ADLS Gen2, AKS, Azure OpenAI, MLflow, Azure ML, Azure DevOps, Terraform, AWS Organizations, Lambda, CodeBuild, CloudFormation StackSets, SSM, EventBridge, EKS, Python, PowerShell, Ansible, Puppet, Log Analytics, TrueSight, VMware, Nutanix, GitLab, Jenkins

Databricks Platform Engineering

• Provisioned and managed enterprise Databricks workspaces on Azure; enforced cluster policies, pool management, spot instance usage, auto-termination, and resource quotas to optimize performance and prevent cost overruns.

• Automated admin tasks via Python notebooks and the Databricks REST API, including workspace provisioning, cluster health checks, asset pruning, privilege reviews, workflow tagging, and naming convention enforcement.

• Defined Unity Catalog security boundaries with catalog, schema, table, column, and row-level access controls; implemented RBAC/ABAC patterns, Service Principals, SQL Warehouses, and Delta Sharing for internal and external teams.

• Applied data masking and SQL/Python UDFs for PII protection; integrated Privacera and Collibra for metadata management, lineage tracking, and compliance; managed service principal tokens and conducted regular access reviews.

• Architected Lakehouse pipelines using Medallion Architecture (Bronze/Silver/Gold) with batch and streaming ingestion from ADLS Gen2; built ADF and Databricks Jobs workflows for orchestration and REST API-driven data integration.

• Operationalized ML pipelines using MLflow tracking, model registry, and Azure ML; implemented model monitoring, drift detection, automated retraining triggers, and MLOps CI/CD governance with unit tests and deployment approvals.

• Managed workspace resources, clusters, and permissions as code using Terraform and the Databricks Terraform Provider; deployed notebooks, libraries, and jobs via Azure DevOps and GitLab CI/CD using Databricks Asset Bundles (DABs).

• Integrated Databricks Repos with Azure DevOps for Git-based version control, enforcing policy-based branching, peer reviews, and deployment approvals across all notebook and workflow changes.

• Implemented operational guardrails, including job retries, SLA alerting, cost threshold alerts, lineage integration, and cluster policy enforcement; built Databricks SQL dashboards and Azure Monitor alerts for workspace health and pipeline SLOs.

• Enforced data quality guardrails within pipelines using expectation checks, reconciliation jobs, and automated quarantine of bad records before promotion to Silver/Gold layers.

AI Platform Engineering

• Designed and implemented LLM platform architecture on Azure using Azure OpenAI Service; configured model deployments and managed API access, token quotas, and security controls aligned with enterprise governance standards.

• Built retrieval-augmented generation (RAG) pipelines integrating Azure OpenAI with enterprise data sources via vector search and semantic retrieval, enabling context-aware AI responses grounded in organizational knowledge.

• Developed agentic AI workflows orchestrating multi-step reasoning, tool use, and decision logic using AI orchestration frameworks; deployed pipelines on Azure with scalable, fault-tolerant infrastructure.

• Implemented model lifecycle governance using MLflow: tracking experiments, versioning models, managing promotion workflows across dev/staging/prod, and enforcing champion/challenger evaluation patterns.

• Established enterprise AI guardrails including content filtering, prompt injection controls, output validation, and compliance policies; configured AI observability dashboards to monitor model performance, drift, and usage patterns in production.

Cloud Infrastructure & DevOps

• Automated Azure IaaS/PaaS infrastructure including VM scale sets, AKS, ADLS Gen2, Key Vault, Private Endpoints, and Managed Identities using Terraform and Ansible; secured workloads with Azure Policies, RBAC, and network isolation controls.

• Drove observability using Prometheus, Grafana, Dynatrace, Splunk, Azure Monitor, Log Analytics, and TrueSight; delivered rapid PoCs for new Azure capabilities and mentored junior engineers on platform best practices.

• Provisioned and managed AKS and EKS clusters using Terraform and Ansible; configured node pools, auto-scaling, HA, and secure cluster networking with Private Endpoints and VNet integration for enterprise-grade workloads.

• Managed core Kubernetes objects (Deployments, StatefulSets, Services, Ingress, ConfigMaps, Secrets, and Namespaces), applying compliance-driven controls, resource quotas, and limit ranges across teams and environments.

• Implemented IAM Roles for Service Accounts (IRSA) for fine-grained pod-level AWS permissions; deployed and managed applications using Helm charts with environment-specific value overrides and GitOps-based release workflows.

• Provisioned and governed 40+ AWS accounts via AWS Organizations with dev/test/prod isolation; engineered CrowdStrike CSPM and Falcon sensor automation using Lambda, EventBridge, CodeBuild, CloudFormation StackSets, and SSM.

• Implemented least-privilege IAM governance with cross-account access patterns; integrated CloudWatch, CloudTrail, and SNS for centralized logging, auditing, and proactive alerting across all organizational accounts.

• Designed end-to-end CI/CD pipelines using Azure DevOps, GitLab, Jenkins, and GitHub Actions; architected IaC via Terraform and CloudFormation for repeatable, version-controlled Azure and AWS infrastructure provisioning.

• Developed serverless automation using AWS Lambda for event-driven provisioning and operational workflows; scripted platform automation in Python, PowerShell, Shell, and Groovy; enforced Git-based branching strategies and deployment governance.

EARLIER EXPERIENCE

Accenture Ltd, UK Technical Architecture Delivery Specialist

Mar 2019 – Jun 2019

• Built CI/CD pipelines using Chef and Jenkins DSL (Groovy); managed infrastructure-as-code and supported Agile developer workflows on Azure with SonarQube quality gates and ServiceNow change management.

DXC Technologies, India Technology Consultant

Jul 2018 – Mar 2019

• Led a DevOps team supporting a healthcare Azure platform integrated with Ansible, ELK, and Jenkins; authored operational agreements, runbooks, and environment management documentation.

NICE Systems, India Specialist DevOps Engineer

Aug 2016 – Jul 2018

• Delivered continuous deployment on AWS using Terraform, Docker, and Puppet; managed customer infrastructure onboarding for managed SaaS; authored custom monitoring plugins using Python and Shell.

CGI Information Systems, India Senior Systems Engineer

Aug 2012 – Aug 2016

• Implemented CI/CD with Jenkins and Puppet; administered BMC monitoring and Control-M job scheduling; automated VPC provisioning and AWS scaling; deployed knowledge modules for MSSQL and Oracle databases.

DELL International, India Software Development Senior Analyst

Aug 2010 – Aug 2012

• Performed full-cycle software development (requirements analysis, design, coding, unit testing, and release documentation) using Perl, Shell, Oracle PL/SQL, JavaScript, and ASP on Unix/Windows.

EDUCATION

Master of Science (2002–2004) Bharathidasan University, Tamil Nadu, India

Bachelor of Science (1999–2002) Bharathidasan University, Tamil Nadu, India


