
Data Engineer Power BI

Location:
Seattle, WA
Salary:
75000
Posted:
October 15, 2025


Resume:

Nidhi Trivedi

Seattle, WA Open to Relocation

Phone: 206-***-**** | Email: ******************@*****.*** | LinkedIn: https://www.linkedin.com/in/nidhitrivedi24/ | GitHub: https://github.com/NidhiTrivedi24

Summary

Data Engineer with 7 years of experience building and scaling data pipelines, ETL workflows, and BI solutions across the financial services and technology domains. Proven track record of processing multi-billion-record datasets, integrating 50+ data sources, and deploying on AWS (S3, EMR, Redshift, Lambda). Skilled in Python, SQL, PySpark, Kafka, Hive, and Snowflake, with expertise in data quality, governance, and real-time streaming for business-critical reporting. Strong collaborator with cross-functional teams, shortening data delivery timelines by 30% and driving measurable business impact through automation, dashboards, and cloud-native solutions.

Technical Skills

Languages: Python, Java, SQL, R, HTML, C++, C, Shell Scripting, Swift
Concepts: SDLC, OOP, OLTP, OLAP
Cloud (AWS): S3, EMR, EC2, DynamoDB, Redshift, Lambda, CloudWatch
DevOps & Workflow: Docker, Kubernetes, Jenkins, Airflow, CI/CD, Git, JIRA, Agile, Scrum
Big Data: Hadoop, Hive, Spark, HDFS, Sqoop, Kafka
Databases & Warehouses: Snowflake, Oracle, MySQL, MongoDB
BI & Visualization: Tableau, Power BI, Amazon QuickSight, Excel (Advanced), Splunk, Grafana
Python Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, TensorFlow, Keras
Analytics & Governance: Exploratory Data Analysis (EDA), Statistical Analysis, Hypothesis Testing, Data Quality Management, Data Governance, KPI Definition, Metric Design, Root Cause Analysis, Reporting Automation, Business Process Improvement

Experience

Senior Data Engineer Mar 2022 - Sept 2023

TransUnion Pune, MH

Tech Stack: PySpark, Apache Kafka, Hive, AWS (S3, EMR, Redshift), SQL, Power BI, Jenkins, Python

● Developed and maintained ETL pipelines using PySpark, Spark, and Hive scripts, processing 2B+ consumer and commercial records monthly and reducing data extraction time by 27% (a simplified PySpark sketch follows this role's bullets)

● Built API-based ingestion workflows from third-party financial sources, consolidating 55+ feeds into a unified model that improved data availability by 40%

● Automated data quality checks for completeness, duplicity, and accuracy across 65+ client datasets, leading to a 67% improvement in validation coverage

● Leveraged AWS S3, EMR, and Redshift to deploy scalable workflows, reducing storage cost by 20% and improving processing speed by 3 times

● Used Apache Kafka to stream real-time data, enabling faster reporting and improving customer insights by 45%

● Built 8+ Power BI dashboards tracking file-level and business KPIs, leading to a 32% increase in stakeholder visibility and adoption

● Collaborated with data scientists and business analysts on 7+ cross-functional projects, improving data delivery timelines by 30%
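
Below is a minimal, hypothetical PySpark sketch of the batch ETL and data-quality pattern described in these bullets: read raw records from S3, enforce completeness and de-duplication, and write curated, partitioned output back to S3 for loading into Redshift. The bucket names, column names, and thresholds are illustrative assumptions, not actual TransUnion assets.

```python
# Hypothetical PySpark ETL sketch: S3 -> quality checks -> curated S3 output.
# Bucket names, columns, and thresholds are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("consumer-records-etl").getOrCreate()

raw = spark.read.parquet("s3://example-raw-bucket/consumer_records/")

# Completeness check: drop rows missing required keys and log how many were dropped.
required = ["record_id", "account_id", "reported_at"]
complete = raw.dropna(subset=required)
print(f"dropped {raw.count() - complete.count()} incomplete rows")

# Duplicate check: keep only the most recent row per record_id.
latest = Window.partitionBy("record_id").orderBy(F.col("reported_at").desc())
deduped = (
    complete.withColumn("rn", F.row_number().over(latest))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Write curated output partitioned by reporting month, ready for a Redshift COPY.
(
    deduped.withColumn("report_month", F.date_format("reported_at", "yyyy-MM"))
    .write.mode("overwrite")
    .partitionBy("report_month")
    .parquet("s3://example-curated-bucket/consumer_records/")
)
```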

Data Engineer Oct 2016 - Mar 2022

LTI (Larsen & Toubro Infotech) Pune, MH

Tech Stack: Oracle, HiveQL, Hadoop, AWS EMR, Redshift, SQL, Python, Power BI, Jenkins, Docker

● Built ETL workflows using Oracle, Python, and SQL to process 7M+ financial records, increasing data migration speed by 20% with 97% accuracy

● Developed Hadoop streaming jobs on AWS EMR for text analysis of logs, improving search data quality and indexing relevance by 35%

● Created HiveQL scripts for complex aggregations and joins across multi-TB datasets, reducing reporting latency by 50%

● Integrated data from 10+ APIs, enhancing downstream reporting accuracy by 25% and reducing manual reconciliation efforts by 87%

● Created 22+ SQL stored procedures and views in Redshift, improving query performance by 37% and reducing ad-hoc reporting requests (an illustrative Redshift view sketch follows these bullets)

● Built 15+ Power BI dashboards, enabling leadership to monitor financial KPIs in real-time and improving decision-making speed by 33%

● Created CI/CD pipelines using Jenkins and Docker, cutting manual deployment efforts by 40% and improving release frequency by 60%
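
As a hedged illustration of the Redshift reporting objects mentioned above, the Python sketch below uses psycopg2 to create a summary view and time a query against it. The cluster endpoint, credentials, schema, and table names are placeholders, not the actual environment.

```python
# Hypothetical sketch: create a Redshift reporting view with psycopg2 and query it.
# Host, credentials, schema, and table names are placeholder assumptions.
import time
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="etl_user",
    password="***",
)

create_view = """
CREATE OR REPLACE VIEW reporting.monthly_txn_summary AS
SELECT account_id,
       DATE_TRUNC('month', txn_date) AS txn_month,
       SUM(amount)                   AS total_amount,
       COUNT(*)                      AS txn_count
FROM   finance.transactions
GROUP  BY account_id, DATE_TRUNC('month', txn_date);
"""

with conn, conn.cursor() as cur:
    cur.execute(create_view)

    # Time a typical ad-hoc question now answered by the view.
    start = time.time()
    cur.execute(
        "SELECT txn_month, SUM(total_amount) FROM reporting.monthly_txn_summary "
        "GROUP BY txn_month ORDER BY txn_month;"
    )
    rows = cur.fetchall()
    print(f"{len(rows)} months summarized in {time.time() - start:.2f}s")

conn.close()
```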

Projects

Outside Sales Staff Calculator, PACCAR (Capstone Project)

● Summary: Designed a decision-support system to optimize outside sales staff allocation using dealership sales, truck service, geolocation, and part replacement data (280M+ records).

● Core Responsibilities: Engineered ETL pipelines in Snowflake SQL and Python, developed a Quasi-Newton optimization framework for staffing decisions (sketched below), and built interactive dashboards in Streamlit for scenario exploration.

● Tools / Languages: Python (Pandas, NumPy, SciPy, Streamlit), Snowflake SQL, Git, Matplotlib/Seaborn
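
A minimal sketch of what a Quasi-Newton staffing optimization can look like with SciPy's L-BFGS-B solver, under a made-up objective that trades unmet service demand against travel cost and a soft headcount budget. The regions, demand figures, and weights are hypothetical, not PACCAR data.

```python
# Hypothetical sketch: allocate outside-sales headcount across regions with a
# Quasi-Newton (L-BFGS-B) solver. Demand figures and weights are made up.
import numpy as np
from scipy.optimize import minimize

demand = np.array([120.0, 80.0, 45.0, 95.0])   # assumed service demand per region
travel_cost = np.array([1.0, 1.4, 2.1, 1.2])   # assumed relative travel cost per rep
total_headcount = 20

def objective(staff):
    # Penalize unmet demand (each rep covers ~10 units), travel cost,
    # and a soft penalty for exceeding the total headcount budget.
    unmet = np.maximum(demand - 10.0 * staff, 0.0)
    over_budget = (np.sum(staff) - total_headcount) ** 2
    return float(np.sum(unmet ** 2) + np.sum(travel_cost * staff) + 5.0 * over_budget)

x0 = np.full(len(demand), total_headcount / len(demand))  # even split to start
result = minimize(
    objective,
    x0,
    method="L-BFGS-B",
    bounds=[(0, total_headcount)] * len(demand),
)

print("suggested allocation per region:", np.round(result.x, 1))
```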

Bird Species Classification using Deep Learning (GitHub)

● Summary: Built audio-based deep learning models to classify bird species from their calls using spectrogram analysis

● Core Responsibilities: Developed CNN models, implemented NLP techniques, and optimized multi-class classification accuracy (see the spectrogram/CNN sketch below)

● Tools / Languages: Python, TensorFlow/Keras, Librosa, NumPy, Matplotlib
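
A hedged sketch of the spectrogram-based approach: librosa converts a call recording into a log-mel spectrogram and a small Keras CNN classifies it. The clip paths, number of species, and layer sizes are illustrative assumptions, not the project's actual configuration.

```python
# Hypothetical sketch: log-mel spectrogram features + small CNN for bird-call
# classification. Paths, class count, and architecture are assumptions.
import numpy as np
import librosa
import tensorflow as tf

NUM_SPECIES = 10  # assumed number of target species

def call_to_spectrogram(path, sr=22050, n_mels=128, frames=256):
    """Load an audio clip and return a fixed-size log-mel spectrogram."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    # Pad or trim the time axis so every clip has the same shape.
    log_mel = librosa.util.fix_length(log_mel, size=frames, axis=1)
    return log_mel[..., np.newaxis]  # add a channel dimension for Conv2D

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 256, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_SPECIES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Training would follow once clips are converted, e.g.:
# X = np.stack([call_to_spectrogram(p) for p in clip_paths]); model.fit(X, y, ...)
```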

Large-Scale Data Processing with Hadoop (GitHub)

● Summary: Processed large log datasets using Hadoop streaming and MapReduce to generate per-minute analytics summaries

● Core Responsibilities: Wrote MapReduce scripts (see the mapper/reducer sketch below), automated job execution with Shell scripts, and reduced processing time by 30%

● Tools / Languages: Hadoop, MapReduce, Shell scripting, Python
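
A minimal pair of Hadoop-streaming-style Python functions illustrating the per-minute log summaries described above: the mapper emits a minute key for each log line and the reducer sums counts per key. The log timestamp layout is an assumed example; in a real run the mapper and reducer would be separate scripts passed to the hadoop-streaming jar.

```python
#!/usr/bin/env python3
# Hypothetical Hadoop streaming sketch: per-minute event counts from log lines.
# Mapper and reducer are shown together; the timestamp layout is assumed.
import sys

def mapper():
    # Emit "YYYY-MM-DD HH:MM\t1" for each log line whose first two fields are an
    # ISO-style date and time (e.g. "2016-10-03 14:07:41 GET /index").
    for line in sys.stdin:
        parts = line.split()
        if len(parts) >= 2:
            minute = f"{parts[0]} {parts[1][:5]}"
            print(f"{minute}\t1")

def reducer():
    # Hadoop streaming sorts mapper output by key, so equal minutes arrive adjacent.
    current, count = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = key, 0
        count += int(value or 0)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    # Invoke as: python3 streaming_sketch.py map   (mapper)
    #        or: python3 streaming_sketch.py reduce (reducer)
    mapper() if sys.argv[1:] == ["map"] else reducer()
```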

Education

MS in Data Science, Seattle University, Seattle, WA, USA
BE in Computer Science, Rajiv Gandhi Technical University, India


