PRAVASHISH MANDIRAM
Austin, TX 512-***-**** ***********@*****.***
Summary
Data Engineer with 5+ years of experience specializing in building scalable, cloud-native data pipelines and analytics platforms for the e-commerce and fintech sectors. Expert in designing and implementing end-to-end ETL processes using PySpark, SQL, and AWS/GCP/Azure services to drive data-driven decision-making. Proven ability to ensure data integrity, optimize data warehousing solutions, and collaborate with cross-functional teams to deliver actionable insights that reduce costs and improve operational efficiency.
Skills
• Programming Languages: Python, R, Scala, SQL, PySpark, Java
• Web Development: HTML5, CSS, JavaScript
• Data Warehousing: AWS Redshift, Azure SQL Data Warehouse
• ETL & Data Modeling: ETL pipeline design, dimensional data modeling, Apache NiFi, Informatica PowerCenter, Apache Flink, Apache Druid, Apache Beam, Medallion Architecture
• Project Methodologies: Agile, Waterfall
• Visualization & Reporting: Tableau, Power BI, Excel, SAS, SQL Playground
• Machine Learning: Logistic Regression, Decision Trees, Random Forests, PyTorch, AWS SageMaker
• Statistical Analysis: Linear Regression, ANOVA, Chi-Square
• Python Libraries: NumPy, Pandas, Matplotlib, SciPy, scikit-learn, Seaborn, TensorFlow
• Databases: MySQL, PostgreSQL, SQL Server, AzureDB, MongoDB, Cassandra
• Big Data Technologies: Apache Spark, Apache Hadoop, Apache Kafka
• Cloud Platforms & Services: AWS (S3, Redshift, Glue, EMR, EC2, Lambda), Azure (Data Factory, Databricks, Data Lake Storage (ADLS), Blob Storage, Cosmos DB, Synapse Studio), GCP (BigQuery, Dataproc, Cloud Functions)
• Orchestration & Workflow: Apache Airflow, Azure Logic Apps
• Containerization & IaC: Docker, Kubernetes, Terraform
• CI/CD & Version Control: Jenkins, Git
Experience
Data Analytics Engineer Tessolve Semiconductor Inc. Austin, TX 05/2025 – Current
• Architected and deployed a scalable cloud data warehouse on AWS Redshift, consolidating data from 15+ test equipment sources to create a single source of truth for engineering analytics.
• Developed automated ETL pipelines using PySpark and AWS Glue to process and validate over 2TB of daily semiconductor test data, improving data availability for analysis by 95%.
• Engineered a suite of Tableau dashboards for yield analysis and failure mode detection, enabling engineers to identify root causes 40% faster and reducing test cycle time.
• Implemented data quality frameworks and automated anomaly detection alerts, decreasing data integrity issues by 30% and increasing trust in analytical reporting.
• Collaborated with validation engineers to define key performance indicators (KPIs) and translate business requirements into technical specifications for data models.
• Orchestrated complex data workflows using Apache Airflow, ensuring reliable and timely daily batch processing for downstream reporting and machine learning applications.
Data Analytics Engineer Virtue Serve Texas, USA (Remote) 12/2023 – 05/2025
• Delivered data engineering solutions for clients in the e-commerce and retail sectors, focusing on marketing and customer analytics.
• Migrated an on-premises client database to Google BigQuery, optimizing query performance and reducing monthly infrastructure costs by 22%.
• Built real-time data pipelines using SQL and Python to integrate Google Analytics 4 data with CRM platforms, enabling a unified view of the customer journey and attribution.
• Automated the generation and distribution of weekly performance marketing reports to stakeholders, saving 15+ person-hours per week and accelerating insight delivery.
• Designed and implemented dimensional data models in BigQuery to support complex analytical queries for customer segmentation and lifetime value (LTV) analysis.
• Partnered with data scientists to productionize a recommendation engine model by building a feature store and serving layer using Dataproc and Cloud Functions.
Data Engineer Mindtree Limited Bangalore, India 10/2021 – 07/2022
• Developed and optimized PySpark scripts for processing large-scale financial transaction data, improving the efficiency of a critical daily ETL job by 35%.
• Contributed to the design of a star-schema data warehouse on Azure Synapse Analytics to support business intelligence and regulatory reporting needs.
• Wrote complex SQL queries and stored procedures to transform raw banking data into actionable insights for fraud detection and risk management teams.
• Implemented data validation checks within Azure Data Factory pipelines, ensuring 99.8% accuracy in daily financial reconciliations.
Data Engineer OLX Remote, India 09/2020 – 09/2021
• Pioneered the migration of core batch processing jobs from legacy systems to a distributed Spark framework on AWS EMR, reducing data processing latency for ad listing data by 40%.
• Designed and implemented a real-time event tracking pipeline using Kafka and AWS Kinesis to capture 5M+ daily user interactions, enabling the product team to analyze user behavior and personalize the homepage.
• Developed automated data quality frameworks using Great Expectations, identifying and resolving data discrepancies at ingestion and reducing them by 15%, which significantly improved the reliability of business-critical metrics.
• Optimized performance and cost of Hive and Presto queries by refining table partitioning and bucketing strategies, resulting in a 25% reduction in cloud compute spending for the analytics team.
• Collaborated with data scientists to productionize a machine learning model for ad fraud detection by building a feature engineering pipeline that processed terabytes of historical transaction data.
• Authored technical documentation and runbooks for key data pipelines, standardizing best practices and reducing the onboarding time for new team members by 50%.
Database Administrator One Card Remote, India 06/2019 – 09/2020
• Spearheaded the database design and implementation for a new customer loyalty program, creating schemas and writing optimized stored procedures that handled a 50% increase in transaction volume without performance degradation.
• Achieved 99.99% database availability for core PostgreSQL clusters through proactive monitoring, performance tuning, and implementing a robust disaster recovery strategy using WAL archiving and point-in-time recovery.
• Enhanced database security and compliance with PCI-DSS standards by automating vulnerability scans, encrypting sensitive customer PII at rest, and rigorously auditing user access privileges.
• Slashed report generation times for the finance team by 60% by optimizing complex SQL queries and creating materialized views for recurring analytical requests on transaction data.
• Automated routine maintenance tasks such as vacuuming, indexing, and backups using Python scripts, reclaiming 10 hours of manual work per week for the DevOps team.
Education
Master of Engineering: Computer Science
University of Cincinnati, OH, USA
2024