Job Description
We’re looking for a Data Platform Engineer to take end-to-end ownership of large-scale data pipelines and distributed systems powering a next-generation Graph Analytics platform.
What You’ll Own
Data Pipeline & Platform Engineering
Design and build end-to-end data pipelines (batch + streaming)
Own large-scale Apache Spark workloads and distributed data processing
Implement data ingestion, transformation, and serving layers
Manage schema evolution, data contracts, and pipeline reliability
Distributed Systems & Scale
Work on systems handling high-volume graph datasets (entities + relationships)
Optimize for latency, throughput, and fault tolerance
Design scalable architectures using Kafka / Spark / Flink / Beam
Cloud & Infrastructure
Deploy and operate systems on GCP / AWS (GKE, Dataproc, Cloud Run, etc.)
Build and maintain CI/CD pipelines for data and microservices
Use Docker, Kubernetes, and Terraform for infrastructure automation
Data Reliability & Observability
Implement data quality checks, monitoring, and alerting
Ensure data integrity across pipelines and services
Build systems to detect drift, inconsistencies, and failures in production
APIs & System Integration
Work with GraphQL / REST / gRPC APIs for data access layers
Ensure seamless integration between data systems and application layers
What We’re Looking For
You have at least 4 years of experience in Data Engineering / Platform Engineering / Distributed Systems
Strong hands-on experience with: Apache Spark / Distributed data processing, Cloud platforms (GCP or AWS), Streaming systems (Kafka / Flink / Beam)
Solid programming skills in Python / Java / Scala / Node.js
Experience building and owning production data pipelines end-to-end
Understanding of: Microservices architecture, Data modeling & large-scale system design
Ability to debug and optimize systems in real production environments
Why This Role is Different
You own systems, not just components
You work at real scale (millions to billions of data points)
You solve distributed systems + graph + real-time problems
You operate close to production impact, not isolated dev work
You influence architecture from day one in an early-stage environment
What You’ll Get
High-ownership, low-bureaucracy environment
Work on cutting-edge graph + AI-driven data systems
Exposure to complex, real-world data problems (fraud, risk, intelligence)
Fast growth with direct impact on core platform architecture