
Senior Software Engineer

Location:
Palo Alto, CA
Salary:
120k-130k
Posted:
September 15, 2025


Resume:

VRUNDA PATEL

**********@*****.*** 316-***-**** LinkedIn

Professional Summary:

Senior Software Engineer with 6+ years of experience in Python development, AWS cloud-native applications, and containerized microservices. Skilled in full-stack development using React and TypeScript, with hands-on expertise in Kubernetes deployments, CI/CD automation, and infrastructure as code. Active TS/SCI Clearance. Experienced in building secure, scalable systems with a strong focus on automation, observability, and mission-critical reliability.

Built RESTful and event-driven services using FastAPI and Flask, integrating SQLAlchemy, Redis, and Celery to support async-safe ingestion, background task execution, and data transformation. Refactored monolithic services into modular APIs with scoped routing, validation layers, and retry-safe workflows for real-time and batch systems.

Provisioned and managed cloud infrastructure using Terraform across ECS, Lambda, and EventBridge. Designed S3-triggered pipelines, used X-Ray and CloudWatch for tracing, and implemented IAM-scoped roles for secure service communication. Automated backup, deployment, and failover using boto3, S3 versioning, and region-specific fallback logic.

Proficient in Kubernetes-based deployments, managing workloads through manifests, Helm charts, and kubectl. Experienced in configuring GPU-backed nodes for ML inference jobs, tuning pod scheduling, monitoring with Prometheus, and validating cluster rollouts across staging and production environments.

Built Angular and React dashboards for operations and analytics use cases, integrating Redux, WebSocket, and GraphQL for state management and real-time updates. Used Tailwind CSS and Material UI to create dynamic UI components with filter preservation, form validation, and chart-based status monitoring.

Extensive experience with data validation and schema management, applying Pydantic, Marshmallow, Alembic, and Great Expectations for schema enforcement, type safety, and migration planning. Managed PostgreSQL, Redshift, and DynamoDB schemas with versioning, clustering, and advisory locking for safe schema evolution.

Hands-on experience with Apache Airflow for building and scheduling ETL pipelines, integrating SLA sensors, failure handling, and conditional routing of high-priority datasets. Developed pipelines for compliance datasets, telemetry ingestion, and event scheduling with integrated Slack and CloudWatch alerting for operational monitoring.

Worked with PostgreSQL, DynamoDB, and Redshift, implementing schema partitioning, timestamp-based indexing, and transactional workflows. Managed migrations using Alembic and Terraform-backed DDL changes. Applied advisory locks and schema diffs to control concurrent updates and prevent data integrity issues.
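One common way to apply advisory locks to concurrent migrations, as described above, is to derive a stable integer key from the migration name for `pg_advisory_lock` (the key-folding scheme and migration name below are illustrative, not the actual tooling):

```python
import zlib

def advisory_lock_key(migration_name: str) -> int:
    """Derive a stable signed 32-bit key for pg_advisory_lock from a name."""
    crc = zlib.crc32(migration_name.encode("utf-8"))  # unsigned 32-bit
    # Fold into Postgres int4 range; advisory locks accept int4 pairs or int8.
    return crc - 2**32 if crc >= 2**31 else crc

# Would be used as: SELECT pg_advisory_lock(%s) before running the DDL.
key = advisory_lock_key("2024_add_orders_index")
```

Because the key is derived deterministically, every worker attempting the same migration contends on the same lock.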

Built Airflow and Glue pipelines to support both batch and streaming ingestion. Applied conditional branching, SLA monitoring, and schema enforcement using Great Expectations. Coordinated ingestion from S3, DynamoDB, and external APIs into warehouse layers with transformation logic and downstream trigger integration.

Created CI/CD pipelines with GitHub Actions and Jenkins to run test suites, enforce linting, verify schema contracts, and deploy container builds. Scheduled cron and Lambda jobs for data syncing, batch exports, and retry logic. Automated snapshot creation, tagging, and archival with boto3-based workflows.

Implemented unit and integration testing using Pytest, Freezegun, and Factory Boy. Added contract testing using Pact, and schema validation using jsonschema and Pydantic. Integrated test coverage tracking and failure logging into CI workflows, with Slack and X-Ray alerting on failure regressions.

Participated in sprint planning, architecture reviews, and post-release retrospectives. Tracked deployment outcomes, version tags, and rollback procedures across QA, staging, and production environments. Coordinated cross-team documentation and managed release artifacts aligned with OpenAPI standards.

Professional Experience:

Python Developer

Rocket Lab Oct 2023 to Present

Built FastAPI ingestion services to process telemetry from launch systems, using async-safe routing, Redis queues for concurrency control, and Pydantic for validation. Added retry logic and advisory locking to prevent data duplication and race conditions in real-time ingestion scenarios.
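The duplication guard described above can be sketched as content-hash deduplication (a minimal in-memory version; the production path used Redis, and the record shape is hypothetical):

```python
import hashlib
import json

class Deduplicator:
    """Drop telemetry records whose content hash was already ingested."""

    def __init__(self):
        self._seen = set()

    def key(self, record: dict) -> str:
        # Canonical JSON so field ordering doesn't change the hash.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def accept(self, record: dict) -> bool:
        """Return True the first time a record is seen, False on duplicates."""
        k = self.key(record)
        if k in self._seen:
            return False
        self._seen.add(k)
        return True
```

In a multi-worker deployment the seen-set would live in Redis (e.g. `SETNX` with a TTL) rather than process memory.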

Designed and deployed Kubernetes manifests (Deployments, Services, ConfigMaps) for ingestion and telemetry processing workloads, tuning pod resource allocation for GPU-backed compute nodes and ensuring stability of long-running ML inference jobs.

Developed Airflow pipelines for sensor data analysis and launch event scheduling, integrating SLA-based alerting, task-level failure tracking, and conditional logic to route high-priority telemetry into separate validation queues monitored via Slack and CloudWatch.

Partnered with platform engineers to extend Helm charts and define environment-scoped values for mission-critical services. Validated upgrades across staging and production clusters, troubleshooting rollout failures using kubectl, logs, and Prometheus metrics.

Automated snapshot workflows using boto3 for nightly backups of launch metadata stored in S3. Implemented region-specific redundancy, version tagging, and archive lifecycle policies to ensure recovery capabilities and traceability during incident review or audit events.

Managed infrastructure using Terraform to provision ECS workloads and Lambda functions for ingestion and alerting services. Applied environment-based overrides, scoped IAM roles with least privilege, and included rollback plans for high-risk deployments across QA and production.

Implemented system-wide observability with AWS X-Ray and CloudWatch metrics. Instrumented ingestion pipelines to detect throughput drops, high-latency events, or downstream unavailability, enabling faster root cause analysis and improving recovery times during failures.

Created JSON-based structured audit logs for all telemetry ingestion events. Logs included timestamp offsets, source identifiers, and status codes, allowing analysts to replay failed sequences and track state transitions across ingest-to-warehouse workflows.
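A JSON audit-log formatter of the kind described can be sketched with the stdlib `logging` module (field names like `source_id` and `status_code` follow the text; the exact schema is assumed):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line with ingest-audit fields."""

    def format(self, record):
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "source_id": getattr(record, "source_id", None),
            "status_code": getattr(record, "status_code", None),
            "message": record.getMessage(),
        }, sort_keys=True)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("ingest.audit")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Extra fields land on the LogRecord and are picked up by the formatter.
logger.info("telemetry batch stored",
            extra={"source_id": "pad-39A", "status_code": 200})
```

One-object-per-line output keeps the logs directly queryable in CloudWatch Logs Insights.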

Designed Redshift and DynamoDB schemas for mission control datasets. Added time-partitioned clustering, hash-based record keys, and schema migration tooling using Alembic. Enabled backward-compatible schema changes and diff checks during every release cycle.
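The hash-based partition key plus timestamp sort key combination described above can be sketched as follows (the key format, hash-prefix length, and mission name are illustrative assumptions):

```python
import hashlib
from datetime import datetime, timezone

def dynamo_keys(mission_id: str, ts: datetime) -> dict:
    """Build a hash partition key and a timestamp-ordered sort key."""
    # Short hash prefix spreads hot missions across partitions while
    # keeping the natural id readable in the key.
    prefix = hashlib.md5(mission_id.encode("utf-8")).hexdigest()[:4]
    return {
        "pk": f"{prefix}#{mission_id}",
        "sk": ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

keys = dynamo_keys("artemis-2",
                   datetime(2025, 3, 1, 12, 30, tzinfo=timezone.utc))
```

An ISO-8601 UTC sort key makes range queries over time windows a simple `BETWEEN` on the sort key.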

Built React dashboards with Tailwind CSS and Redux to visualize mission progress, ingest pipeline status, and system health indicators. Implemented WebSocket channels for real-time status refresh, filterable mission logs, and drill-down views for telemetry anomalies.

Automated cluster-level observability with CloudWatch + Prometheus exporters for node health, pod scheduling delays, and GPU utilization. Configured alert thresholds to support proactive triage during mission event workloads.

Integrated S3 event triggers and EventBridge rules to decouple ingest services from downstream consumers. Enabled parallelism and resilience across processing units by implementing task deduplication, error routing, and queue prioritization.
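The queue prioritization and error routing above can be sketched with a stdlib priority queue and a dead-letter list (the real decoupling ran over EventBridge/SQS; class and event names here are hypothetical):

```python
import heapq

class PriorityRouter:
    """Deliver events lowest-priority-number first; failures go to a DLQ."""

    def __init__(self):
        self._heap = []
        self._seq = 0          # tie-breaker keeps FIFO order within a priority
        self.dead_letter = []

    def publish(self, event: dict, priority: int = 10):
        heapq.heappush(self._heap, (priority, self._seq, event))
        self._seq += 1

    def consume(self, handler):
        """Drain the queue; events whose handler raises are dead-lettered."""
        while self._heap:
            _, _, event = heapq.heappop(self._heap)
            try:
                handler(event)
            except Exception:
                self.dead_letter.append(event)
```

Routing failures aside rather than crashing the consumer is what keeps one poison message from blocking the rest of the queue.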

Python AWS Developer

Deloitte Oct 2021 to Sep 2023

Built Flask and FastAPI services to ingest compliance-related datasets from external partners. Applied schema validation using Pydantic and jsonschema, integrated Redis caching for lookup data, and routed records into Redshift and DynamoDB with transformation steps tied to client-specific contracts.

Designed and managed Airflow DAGs for multi-region ETL workflows involving S3 ingestion, conditional validation, and batch inserts into Redshift. Configured SLA sensors, failure routing to dead-letter queues, and alerting integrations with Slack and CloudWatch dashboards for engineering triage.

Used Terraform to provision and manage AWS services including ECS clusters, Lambda functions, Parameter Store entries, and SQS queues. Applied scoped IAM policies, environment isolation, and automatic rollback strategies based on health checks and post-deploy verifications.

Integrated S3 event triggers with EventBridge and Lambda to launch ingestion pipelines on file arrival. Applied bucket-level tagging logic, routed events by file type, and coordinated downstream dependencies across validation, transformation, and storage services.

Implemented data validation frameworks using Great Expectations and Pandas. Enforced rules for null safety, type consistency, and column-level constraints before ingesting into warehouses. Built Slack bots to report failed validation results with row-level diffs and schema mismatch summaries.

Built Angular dashboards to help business teams monitor ingestion status, validate schema mismatches, and request manual reprocessing. Added custom filters, chart-based insights, and real-time log previews using WebSocket channels and REST APIs from backend services.

Developed Jenkins pipelines to run infrastructure validations, schema contract checks, and container image scans before deployment. Integrated Terraform plan/apply stages into gated workflows with approval-based promotion to production.

Developed GitHub Actions pipelines for CI workflows. Integrated Pytest with coverage tracking, OpenAPI schema contract checks, lint enforcement, and container build validations. Enforced branch protection rules and required build status checks for production merges.

Managed PostgreSQL schema evolution using Alembic with advisory locking and concurrent migration planning. Created pre-deploy hooks for schema diff detection and automated tagging of versioned release points across staging and production environments.
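The pre-deploy schema diff detection mentioned above can be sketched as a comparison of column-to-type mappings (the column names and types below are made up for illustration):

```python
def schema_diff(current: dict, target: dict) -> dict:
    """Compare column->type mappings; report added/removed/changed columns."""
    return {
        "added": sorted(set(target) - set(current)),
        "removed": sorted(set(current) - set(target)),
        "changed": sorted(c for c in set(current) & set(target)
                          if current[c] != target[c]),
    }

diff = schema_diff(
    {"id": "bigint", "email": "varchar(120)", "legacy_flag": "boolean"},
    {"id": "bigint", "email": "varchar(255)", "created_at": "timestamptz"},
)
```

A pre-deploy hook can fail the pipeline whenever `removed` or `changed` is non-empty, since those are the backward-incompatible cases.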

Logged ingestion activity and transformation errors using structured JSON logging and pushed logs into AWS CloudWatch. Instrumented services with request IDs, user context, and data payload markers to support detailed observability during incident reviews.

Created archival workflows using boto3 to snapshot Redshift tables and DynamoDB records into S3 with lifecycle policies. Applied metadata tagging for traceability, enabled periodic Glacier transitions, and scheduled compliance-oriented backups with audit retention requirements.

Supported event-driven architectures by wiring S3 event triggers and EventBridge rules into downstream consumers. Designed fault-tolerant retries and replay-safe pipelines with IaC-based deployments to ensure reproducibility.

Software Developer

Freshworks Apr 2019 to Mar 2021

Built RESTful APIs in Flask to manage customer account creation, ticket lifecycle handling, and agent assignment logic. Used SQLAlchemy for DB access, enforced data validation using Marshmallow, and implemented JWT-based auth for scoped access across customer, agent, and admin roles.

Created Pandas-based batch jobs scheduled via cron to generate daily reports from support interaction logs. Transformed raw activity data into summarized Excel reports, added SLA breach flags, and automated email delivery using AWS SES and IAM-based SMTP credentials.
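The SLA breach flag in those reports reduces to a timestamp comparison; a minimal sketch (the four-hour window and field names are assumptions, not the actual policy):

```python
from datetime import datetime, timedelta
from typing import Optional

SLA = timedelta(hours=4)  # hypothetical first-response SLA window

def sla_breached(opened_at: datetime,
                 first_response_at: Optional[datetime],
                 now: datetime) -> bool:
    """True if the first response missed (or is still missing) the SLA."""
    # Unanswered tickets are measured against the current time.
    responded = first_response_at or now
    return responded - opened_at > SLA
```

Treating an unanswered ticket as "responded at `now`" lets the same flag cover both closed and still-open breaches in the daily report.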

Developed Angular-based support dashboards for internal agents to track ticket status, customer profiles, and SLA compliance. Integrated Redux for state management, applied role-based routing, and implemented filtering, sorting, and audit-log overlays with visual status indicators.

Introduced schema validation using Pydantic and Marshmallow in ingestion pipelines to enforce type safety, required fields, and regex-based checks before persisting data. Logged validation errors to S3 with request context for future analysis and automated ingestion replays.
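The required-field and regex-based checks above can be sketched with a plain rules table (the production pipelines used Pydantic/Marshmallow; these field names and patterns are hypothetical):

```python
import re

RULES = {
    # field: (required, compiled pattern) -- illustrative ticket-ingest rules
    "ticket_id": (True, re.compile(r"^TKT-\d{6}$")),
    "email": (True, re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    "priority": (False, re.compile(r"^(low|medium|high)$")),
}

def validate(record: dict) -> list:
    """Return a list of field-level validation errors (empty == valid)."""
    errors = []
    for field, (required, pattern) in RULES.items():
        value = record.get(field)
        if value is None:
            if required:
                errors.append(f"{field}: missing required field")
            continue
        if not pattern.match(str(value)):
            errors.append(f"{field}: value {value!r} fails pattern check")
    return errors
```

Collecting all errors per record, instead of raising on the first, is what makes the logged context useful for ingestion replays.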

Maintained PostgreSQL schema evolution using Alembic, creating safe DDL migrations for new tables, indexes, and constraints. Added version control for migrations, rollback capability during failed deploys, and staging checks for concurrent schema change scenarios.

Deployed services to ECS Fargate using Terraform, setting up task definitions, ALB health checks, and autoscaling policies. Isolated staging, QA, and production environments using tagged modules with scoped resource policies and parameterized configurations.

Implemented archival workflows for completed support tickets and audit logs using boto3 scripts. Applied bucket-level tagging and lifecycle policies for auto-deletion or Glacier transition after 90 days, meeting organizational data retention and compliance standards.
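The 90-day Glacier/auto-delete lifecycle policies described above boil down to a rules payload in the shape boto3's `put_bucket_lifecycle_configuration` expects (the prefixes and rule IDs here are invented for illustration):

```python
RETENTION_DAYS = 90  # retention window from the compliance standard

def lifecycle_rule(prefix: str, mode: str) -> dict:
    """Build one S3 lifecycle rule: Glacier transition or auto-delete."""
    rule = {
        "ID": f"{prefix.rstrip('/')}-retention",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
    }
    if mode == "glacier":
        rule["Transitions"] = [{"Days": RETENTION_DAYS,
                                "StorageClass": "GLACIER"}]
    else:  # auto-delete after the retention window
        rule["Expiration"] = {"Days": RETENTION_DAYS}
    return rule

config = {"Rules": [lifecycle_rule("tickets/closed/", "glacier"),
                    lifecycle_rule("audit-logs/", "delete")]}
# Applied with: s3.put_bucket_lifecycle_configuration(
#     Bucket=..., LifecycleConfiguration=config)
```

Building the payload in code keeps the retention number in one constant instead of scattered across console-edited rules.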

Added Pytest-based unit and integration tests with test factories, mocking, and DB setup/teardown fixtures. Integrated tests into GitHub Actions workflows to trigger on pull requests, enforce coverage thresholds, and prevent merges on failed builds or test regressions.

Instrumented structured logging across ingestion, transformation, and API response layers using JSON formatters. Included trace IDs, user context, and status codes to improve observability and support downstream log aggregation and filtering in CloudWatch Logs.

Participated in OpenAPI schema design for customer- and agent-facing endpoints. Helped define resource models, response contracts, and versioning strategies. Aligned API documentation with frontend contracts to prevent mismatches during frontend/backend integration.

Led efforts to migrate legacy support data from MongoDB to PostgreSQL by flattening nested documents, transforming schemas, and applying row-level validation. Used batch ETL scripts with rollback capabilities and metrics reporting to ensure clean, consistent loads.
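The document-flattening step of that migration can be sketched as a recursive key-join (the separator and sample document are illustrative; the real ETL also handled arrays and type coercion):

```python
def flatten(doc: dict, parent: str = "", sep: str = "_") -> dict:
    """Flatten a nested Mongo-style document into column-friendly keys."""
    row = {}
    for key, value in doc.items():
        col = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            row.update(flatten(value, col, sep))
        else:
            row[col] = value  # scalars (and lists) pass through unchanged
    return row

row = flatten({"ticket": {"id": 7, "meta": {"channel": "email"}},
               "status": "closed"})
```

The flattened keys map directly onto relational column names, which is what makes row-level validation against the PostgreSQL schema straightforward.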

Technical Skills:

Programming & Frameworks: Python (FastAPI, Flask, SQLAlchemy, Pandas, boto3, Alembic, Marshmallow, Pydantic), JavaScript/TypeScript (React, Angular, Redux, WebSockets, Tailwind CSS), Shell scripting, Cron jobs

Cloud & AWS Services: ECS (Fargate, EC2-backed), Lambda, S3 (versioning, lifecycle, Glacier transitions, event triggers), DynamoDB, Redshift, PostgreSQL, EventBridge, SQS, SNS, Parameter Store, IAM (scoped roles, auth), X-Ray, CloudWatch (logs, metrics, alarms, dashboards)

Containerization & Orchestration: Kubernetes (manifests, Helm charts, ConfigMaps, rollout validation, kubectl), Docker, Prometheus/Grafana monitoring

Infrastructure as Code & Automation: Terraform (IaC, multi-environment provisioning, rollback strategies), Jenkins (infrastructure validations, container scans, schema checks), GitHub Actions (CI/CD workflows, coverage enforcement, schema validation, container build validations)

Data Engineering & ETL: Apache Airflow (ETL DAGs, SLA monitoring, conditional logic, retries, alert routing), Pandas (batch data jobs, reporting, transformations), Great Expectations (schema & data validation), JSON/structured logging for ingestion pipelines

Databases & Schema Management: PostgreSQL (migrations, DDL, indexing, constraints), Redshift (time-partitioned clustering, table design), DynamoDB (hash keys, schema definition), Alembic (schema evolution, advisory locking, diff checks), MongoDB (legacy migration)

Testing & Validation: Pytest (unit, integration, mocking, fixtures, DB setup/teardown), OpenAPI contract testing, GitHub Actions test automation, Jenkins validation pipelines

Observability & Logging: JSON-based structured logs (trace IDs, audit logs), CloudWatch dashboards & alarms, Prometheus exporters (node health, GPU utilization, scheduling delays), Slack bots for alerts and ingestion failures


