
Data Engineer Power BI

Location:
Arlington, TX
Posted:
September 10, 2025


Resume:

YASH PARMAR

682-***-**** ******.********@*****.*** Dallas, Texas linkedin.com/in/yashparmar1416 GitHub

EDUCATION

The University of Texas at Arlington Arlington, Texas

Master of Science in Information Systems Aug 2023 - May 2025

AWS Certified Cloud Practitioner (Feb 2025), Agile Foundations (May 2024), Python Data Analysis (Jan 2024)

Savitribai Phule Pune University Pune, India

Bachelor of Engineering in Electrical Engineering Aug 2017 - Jun 2021

EXPERIENCE

Data Engineer Arlington, Texas

PolarisIQ Sep 2023 - May 2025

• Spearheaded development of an AI-native BI platform using LangChain agents, FastAPI, Spark, and Kafka pipelines, enabling real-time processing of 10M+ daily events.

• Boosted decision speed by 35% and user engagement by 28% via ML-driven A/B testing, user segmentation, and feature optimization using Pandas, Scikit-learn, and Spark MLlib.

• Engineered scalable ETL pipelines with anomaly detection using Airflow and Spark Streaming, processing 5M+ Kafka events daily into BigQuery for predictive analytics and live reporting.

• Enabled real-time strategic insights by embedding LLM-driven summaries and AI-based anomaly surfacing into Power BI and AWS QuickSight dashboards visualizing 20+ KPIs for 50+ stakeholders.

• Skills: LangChain, FastAPI, Apache Spark, Kafka, Airflow, Spark MLlib, BigQuery, Power BI, AWS QuickSight.

Data Engineer Pune, India

Tata Consultancy Services Aug 2021 - Aug 2023

• Developed an anomaly detection framework using Isolation Forest and autoencoders in Python with AWS S3/ServiceNow integration, cutting downtime by 75% and averting $50K in monthly revenue loss.

• Automated ETL workflows processing 500GB daily with ML-based data validation using Python classification models, ensuring 100% data integrity and reducing manual effort by 85%.

• Designed AI-augmented Tableau dashboards serving 200+ users with K-Means clustering and Prophet forecasting, increasing data-driven decisions by 20% and improving customer satisfaction scores.

• Improved SLA compliance by 35% by deploying ServiceNow-integrated spaCy NLP classifiers for auto-tagging incidents, enabling real-time escalation prioritization across 5 operational departments.

• Skills: Isolation Forest, Autoencoders, spaCy, AWS S3, ServiceNow, Tableau, Power BI, Prophet, SQL, ETL Pipelines.

TECHNICAL SKILLS

Programming Languages: Python (NumPy, Pandas, Scikit-learn), R, SQL (Joins, CTEs, Subqueries), C, HTML.

Frameworks and Databases: Flask, Scikit-learn, NLTK, Kafka Streams, Spark MLlib, Google BigQuery, Cloud SQL, PostgreSQL.

Cloud and Infrastructure: EC2, S3, Glue, Lambda, SQS, RDS, Terraform, Docker, Kubernetes, ETL, Apache Airflow, Data Pipelines.

Problem Analysis: Product Metrics, A/B Testing, Data-driven Insights, Scrum, Kanban, Agile, Jira, EDA, Hypothesis Testing, Six Sigma, ServiceNow.

Visualization: Microsoft Power BI, Tableau, Microsoft Excel, Alteryx, AWS QuickSight.

PROJECTS

Real-Time Log Analytics & Threat Detection Pipeline AWS, Kafka, Spark, Elasticsearch, TensorFlow, Grafana Jan 2025

• Engineered an end-to-end data pipeline ingesting 1M+ log events daily using Kafka and Spark, with scalable storage in Elasticsearch.

• Integrated TensorFlow-based ML models with Grafana dashboards, achieving 95% detection accuracy, 20% faster incident response, and 99.9% pipeline uptime.

End-to-End Data Pipeline for EV Factory Sensor Data Kafka, Airflow, Spark, Python, FastAPI, PostgreSQL Aug 2024

• Built streaming pipelines for 2M+ IoT sensor events with Kafka and Spark, orchestrated using Airflow and persisted into PostgreSQL.

• Exposed processed insights through FastAPI services and real-time dashboards, reducing latency by 40% and ensuring SLA compliance across $25M operations.

Customer Segmentation & Churn Prediction Pipeline SQL, PySpark, AWS EMR, Tableau, Flask Feb 2024

• Developed PySpark churn models and automated AWS EMR workflows, delivering API-based predictions through Flask to integrate with business systems.

• Designed Tableau dashboards for campaign optimization, reducing churn by 12%, boosting CTR by 15%, and accelerating reporting speed by 40%.

NLP-Driven Sentiment & Theme Analysis on Amazon Reviews AWS Glue, Redshift, SQL, spaCy, LDA, Power BI Aug 2023

• Implemented a full-stack NLP pipeline with AWS Glue ETL, Redshift storage, and spaCy and LDA models to detect sentiment and customer themes.

• Delivered Power BI dashboards with a governance framework, improving satisfaction by 10% and ensuring full compliance.


