
Data Engineer with AI/ML Platform Expertise

Location: Frisco, TX
Salary: 150000
Posted: January 12, 2026


Resume:

DIVYA SANKOJU

Data Engineer with AI/ML

+1-513-***-**** **************@*****.*** LinkedIn

PROFESSIONAL SUMMARY

Data Engineer and AI/ML practitioner with 5+ years of experience architecting cloud-native data solutions and machine learning pipelines. Combines expertise in AWS, Azure, and Snowflake with advanced skills in distributed data processing and model operationalization to deliver scalable infrastructure supporting analytics and AI initiatives. Proven ability to design and implement end-to-end data platforms that transform raw data into actionable insights, driving business intelligence and predictive capabilities. Adept at collaborating with cross-functional teams to translate complex requirements into efficient, maintainable, and secure data architectures. Consistently improves system performance, reduces operational costs, and accelerates time-to-insight through automation and optimization.

TECHNICAL SKILLS

●Cloud Platforms: AWS (S3, Glue, Lambda, Redshift, EMR, SageMaker, Step Functions, CloudFormation), Azure (Data Factory, Databricks, Synapse, Blob Storage, DevOps), Snowflake, Google Cloud Platform (BigQuery, Composer)

●Data Processing & ETL: Apache Spark (PySpark, Spark Streaming, Spark MLlib), Apache Airflow, Apache Kafka, AWS Kinesis, Hadoop (HDFS, Hive, YARN), dbt, Fivetran, Stitch

●Databases & Warehousing: PostgreSQL, MySQL, MongoDB, Amazon Redshift, Snowflake, DynamoDB, SQL Server, Cassandra, Oracle, Data Vault 2.0 modeling

●AI/ML Operations: Scikit-learn, TensorFlow, PyTorch, SageMaker, MLflow, feature engineering, model deployment, model monitoring, Kubeflow, ONNX

●Infrastructure & DevOps: Docker, Kubernetes, Terraform, CloudFormation, Git, GitHub Actions, Jenkins, GitLab CI, ArgoCD, Prometheus, Grafana, Helm

●Programming & Scripting: Python, SQL, Scala, R, Bash/Shell Scripting, PowerShell, JavaScript, Java

●Data Visualization & BI: Tableau, Power BI, AWS QuickSight, Looker, Metabase, Superset

●Data Governance & Quality: Great Expectations, Apache Atlas, Collibra, Alation, Data Catalog, lineage tracking

●Methodologies: Agile/Scrum, DataOps, MLOps, CI/CD, Test-Driven Development, Data Mesh principles

PROFESSIONAL EXPERIENCE

Atlassian | Data Engineer with AI/ML | Austin, TX | Feb 2024 – Present
Enterprise SaaS – Collaboration & Productivity Software

●Automated ETL pipelines using AWS Glue and Step Functions, consolidating user telemetry data from 5+ product lines into centralized S3 data lakes, reducing manual processing time by 70%

●Engineered feature stores in SageMaker that accelerated model development cycles by 30% for user behavior prediction initiatives

●Implemented real-time streaming applications with Apache Kafka and AWS Kinesis, enabling live monitoring dashboards that decreased incident response time from hours to minutes

●Optimized Redshift clusters through query restructuring and distribution strategies, lowering monthly infrastructure costs by 25% while maintaining query performance

●Operationalized 8+ machine learning models by creating reusable deployment templates with MLflow, standardizing A/B testing frameworks across data science teams

●Established automated data quality validation using Great Expectations, raising the accuracy of critical business datasets from 92% to 99.5%

●Designed and implemented data access control frameworks using AWS Lake Formation and IAM policies, ensuring compliance with GDPR and CCPA regulations

●Developed automated monitoring alerts and dashboards using CloudWatch and Grafana, reducing mean time to detection (MTTD) for data pipeline failures by 65%

●Created comprehensive data documentation and cataloging systems that improved data discoverability for 150+ analysts and data scientists

●Mentored 3 junior data engineers on best practices for pipeline development, code review processes, and cloud architecture patterns

Environment: AWS (S3, Redshift, Glue, Lambda, SageMaker, Kinesis, Step Functions, CloudFormation), Apache Kafka, Python, PySpark, SQL, Snowflake, MLflow, Docker, Kubernetes, GitLab CI, Tableau, Great Expectations, Grafana

DXC Technology | Data Engineer | Hyderabad, India | May 2019 – Jul 2023
IT Services & Consulting – Multiple Client Engagements

●Migrated 3 legacy on-premise data warehouses to Azure cloud infrastructure, utilizing Data Factory and Databricks to improve data processing speed by 40%

●Designed dimensional data models in Azure Synapse Analytics supporting 15+ enterprise business intelligence reports used by 200+ stakeholders

●Developed PySpark transformation scripts processing 5+ terabytes of daily data, achieving 99.8% data accuracy through automated validation frameworks

●Scheduled 50+ automated data pipelines using Apache Airflow, ensuring 100% on-time delivery of daily financial and operational reports

●Translated business requirements into technical specifications for 10+ Power BI dashboards, enabling data-driven decision-making for client leadership teams

●Identified and resolved performance bottlenecks in Spark jobs and SQL queries, improving overall data platform efficiency by 35%

●Implemented incremental data loading strategies that reduced nightly batch processing windows from 8 hours to 3 hours

●Built data reconciliation frameworks that automatically identified and flagged discrepancies between source systems and data warehouses

●Developed REST APIs using FastAPI to expose curated datasets to downstream applications, reducing dependency on direct database access

●Conducted technical workshops and training sessions for client teams on data platform usage, best practices, and self-service analytics capabilities

●Created disaster recovery plans and implemented automated backup strategies for critical data assets across multiple Azure regions

Environment: Microsoft Azure (Data Factory, Databricks, Synapse, Blob Storage, DevOps), Apache Spark (PySpark, Spark MLlib), Apache Airflow, SQL Server, PostgreSQL, PowerShell, Python, Power BI, Git, Agile/Scrum, FastAPI

CERTIFICATIONS

●AWS Certified Cloud Practitioner

●Microsoft Azure Fundamentals (AZ-900)

●Snowflake Data Engineer (In Progress)

EDUCATION

●Master of Science in Information Technology, University of Cincinnati, Cincinnati, OH

●Bachelor of Technology in Electronics & Communication Engineering, CVR College of Engineering, India


