Ansh Patel
+1-469-***-**** - *************.****@*****.*** - linkedin.com/in/patelansh110 - github.com/ANSH15007
EDUCATION
The University of Texas at Dallas, Richardson, TX, USA Aug 2022 - May 2024
Master of Science in Computer Science, GPA: 3.70/4
Gujarat Technological University, Ahmedabad, India Aug 2018 - May 2022
Bachelor of Engineering in Information and Communication Technology, GPA: 3.80/4
TECHNICAL SKILLS
Languages: Python, Java, SQL, NoSQL, Linux
Cloud: AWS (EC2, S3, Lambda, Glue, Redshift, Kinesis, CloudFormation, CloudWatch), Databricks
Tools: Apache Spark, Apache Airflow, Kafka, Hadoop, dbt, Informatica
DevOps: Docker, Kubernetes, Jenkins, Terraform, GitHub Actions
Databases: MySQL, PostgreSQL, DynamoDB, MongoDB, Snowflake
Data Visualization: Tableau, Power BI, Amazon QuickSight
WORK EXPERIENCE
Data Engineer
AKS Infotech Inc, NJ, US Remote Jun 2024 - Present
• Engineered data pipelines using Apache Airflow and Databricks for a retail client, processing daily e-commerce transactional data across 20+ microservices and achieving 99.95% uptime.
• Implemented ETL processes using AWS Glue and Databricks Auto Loader for customer purchase history and inventory management data, reducing processing time by 40% and improving data quality.
• Optimized Spark jobs on Databricks, resulting in a 30% reduction in processing time for large-scale data transformations.
• Developed Python frameworks for data integration, implementing Infrastructure as Code using Terraform for AWS resource management.
Data Engineer
Digital Sky 360, Ahmedabad, India On-site May 2020 - Jun 2022
• Engineered a real-time IoT data streaming architecture using Apache Kafka and AWS Kinesis, processing 5K+ events per second from industrial sensors and reducing latency by 28% for critical business metrics, including equipment performance monitoring.
• Implemented comprehensive data quality checks and governance policies for IoT data, including validation, metadata management, and access controls, improving overall data accuracy by 15% across all pipelines.
• Containerized data processing applications using Docker and Kubernetes, enabling seamless deployment across development and production environments, reducing deployment failures by 65% and MTTR by 43%.
PROJECTS
ShopSense - Real-Time Fashion Analytics Pipeline
• Developed a scalable data pipeline using AWS Kinesis, Lambda, Databricks Delta Lake, and S3 to analyze the ASOS public dataset (15GB) containing 50K+ daily clickstream events, enabling real-time customer behavior analysis and personalized recommendations.
• Orchestrated the ETL workflow using Apache Airflow with automated data transformations and quality checks using AWS Glue, Databricks, and dbt, reducing manual interventions by 40%.
SentiTrack - Social Media Brand Perception Analyzer
• Built an end-to-end pipeline to collect and analyze public Twitter data on Nike and Adidas (25GB dataset) using the Twitter API, AWS Glue, and Amazon Redshift, processing 50K tweets daily for brand sentiment analysis.
• Designed a star schema in Redshift with automated alerting via Amazon SNS for negative sentiment spikes, reducing average query time by 50% and enabling faster response to potential PR issues.
GroceryLens - Retail Analytics Platform
• Designed a star schema data model in Snowflake using Instacart's public dataset (8GB), integrating purchase history, product inventory, and customer loyalty information from multiple retail channels.
• Built ETL workflows using Databricks and PySpark, processing 100K+ daily transactions with Delta Lake partitioning for optimized storage and enhanced query performance for sales trend visualization.
COURSES AND CERTIFICATIONS
• AWS Cloud Practitioner Essentials offered by AWS Training in May 2024
• DevOps Foundations offered by SKILLup IT/DevOps Institute in March 2023