
Data Engineer - Cloud Data Architect & ETL Specialist

Location:
Long Beach, CA
Posted:
April 06, 2026


Resume:

Mayank Thacker

562-***-**** Long Beach, CA ******.*******.****@*****.*** LinkedIn Portfolio GitHub

SUMMARY

Data-focused Software Engineer with 5+ years of experience designing database architectures, building high-throughput data pipelines, and deploying cloud-based data infrastructure on AWS, Azure, and GCP. Skilled in converting raw, unstructured data into reliable analytical assets for data scientists and business analysts. Hands-on experience implementing AI/ML-driven data quality frameworks, automating ETL workflows, and building monitoring dashboards. M.S. in Computer Science with a strong foundation in data engineering, analytics, and cross-functional collaboration.

WORK EXPERIENCE

Software Engineer Kentagious Kollective LLC (Houston, TX) June 2025 – Present

• Designed and maintained relational database architecture for a SaaS platform, structuring schemas and writing optimized SQL query logic to expose clean, queryable data to dashboards and analytics layers serving 10K+ active users.

• Led a large-scale data migration, restructuring legacy schemas and executing transactional SQL scripts to convert and persist 10K+ records with validation checks ensuring data integrity and zero data loss.

• Built admin-facing analytics dashboards transforming raw operational data into actionable insights on system usage, user behavior, and business KPIs for stakeholders and decision-makers.

Software Engineer Savanti LLC (Miami, FL) Sept 2024 – May 2025

• Architected MongoDB schemas and built Python microservices on Azure (Functions, Blob Storage) to collect, manage, and convert raw market data into structured, queryable formats for data scientists and business analysts.

• Implemented data quality frameworks with AI-driven anomaly detection to enhance data reliability, flagging schema violations and inconsistencies before ingestion into downstream analytics and ML training pipelines.

• Engineered high-throughput data pipelines ingesting 1M+ records/minute from external APIs with optimized batching and concurrency, feeding preprocessed data into ML model workflows via automated n8n ETL orchestration.

Software Engineer Warble Solutions (Ahmedabad, India) Aug 2021 – Aug 2023

• Led a team of five engineers to scale Node.js APIs for e-commerce systems serving 100K+ daily users, integrating ML model outputs into production to power personalized search and recommendation features.

• Built resilient ETL workflows using Apache NiFi to convert unstructured NoSQL datasets into structured analytical assets on AWS S3, simplifying downstream data processing and reporting pipelines.

• Architected automated data migrations from on-premise databases to AWS (S3, DataSync), designing cloud database architecture and building data quality monitoring alerts with Prometheus and Grafana.

Software Development Intern Rhad Agency (Ahmedabad, India) Jan 2019 – July 2021

• Built full-stack e-commerce features (React, Node.js) with real-time dashboards converting raw transactional data into visual operational insights, and wrote advanced SQL queries exposed through FastAPI endpoints.

• Developed modular PHP APIs for backend data operations including authentication, cart management, and checkout, increasing code reuse and reducing data duplication across services.

TECHNICAL SKILLS

Languages: Python, Java, C++, JavaScript, TypeScript, SQL
Databases & Big Data: PostgreSQL, MongoDB, MySQL, Redis, DynamoDB, Apache Spark, Airflow, Splunk
Cloud & DevOps: AWS (S3, DataSync, Lambda), Azure (Functions, Blob Storage), GCP, Docker, Kubernetes, Terraform, CI/CD
Data Engineering: ETL/ELT Pipelines, Data Quality Frameworks, Apache NiFi, n8n, Pandas, Data Modeling, Schema Design
AI/ML: PyTorch, TensorFlow, Hugging Face, MLflow, OpenAI, Llama3, Anomaly Detection, LLM Prompt Engineering
Web & Tools: React.js, Node.js, FastAPI, Flask, REST, GraphQL, Microservices, Git, Tableau, Prometheus, Grafana

PROJECTS & PUBLICATIONS

Cloud Data Platform for Real-Time Market Analytics

• Designed and deployed an end-to-end cloud data platform on Azure, architecting the database layer (MongoDB), ingestion pipelines, and data transformation services to convert raw high-frequency data into clean analytical datasets with automated AI-driven quality checks.

Client Operations SaaS — Data Architecture & Analytics Dashboard

• Architected the relational database schema and data processing workflows for a full-stack SaaS application (React, Next.js, Node.js, SQL), powering analytics dashboards surfacing KPIs, usage trends, and operational metrics from raw transactional data.

Society Sync — AI-Driven System Architecture (Elsevier Publication, March 2022)

• Published research on AI-driven data architecture for smart residential communities, covering data-driven automation, sensor data processing, and intelligent emergency response systems in Elsevier’s journal on sustainable computing.

EDUCATION

California State University, Long Beach M.S. in Computer Science (Long Beach, CA) Aug 2023 – May 2025

Gujarat Technological University B.E. in Information Technology (Ahmedabad, India) Jul 2017 – Jul 2021


