Post Job Free

Power BI Data Engineer

Location:
United States
Salary:
90000
Posted:
September 10, 2025

Contact this candidate

Resume:

MAMATHA REDDY

+1-203-***-**** ***************@*****.***

PROFESSIONAL SUMMARY

Data Engineer with 4+ years of experience building scalable data solutions and optimizing ETL pipelines, proficient in SQL, Power BI, and Python. Skilled in training end users and enhancing query performance through advanced indexing, partitioning, and DAX query development. Experienced with cloud platforms such as AWS and Azure, with a strong background in data modeling and BI tool integration.

TECHNICAL SKILLS

• Programming Languages: Python, SQL, Scala, Java, Bash, R

• Big Data Technologies: Hadoop, Apache Spark, Hive, HBase, MapReduce, PySpark, Kafka

• Data Engineering & ETL: ETL, ELT, Data Pipelines, Data Integration, Data Ingestion & Transformation, DBT, Airflow, SFTP, SSIS

• Cloud Platforms & Services: AWS (Lambda, S3, Glue, EC2, CloudWatch, SQS, SNS), Azure (ADF, Synapse, Data Lake Storage Gen2, Databricks), GCP (BigQuery)

• Databases: PostgreSQL, MySQL, SQL Server, Oracle, MongoDB, DynamoDB

• Data Warehousing: Snowflake, Redshift, Data Modeling, Data Migration, Data Normalization, Data Management, Data Governance

• BI & Visualization: Power BI, Tableau, DAX Queries, QlikView

• DevOps & Workflow: Git, GitHub, Jenkins, CI/CD, Agile, JIRA, Slack

• Advanced Analytics: Data Modeling, Statistical Analysis, Feature Engineering, Predictive Modeling, Time Series Analysis

• Machine Learning Basics & AI: Random Forest, Gradient Boosting, Linear/Logistic Regression, Clustering, PCA, Time Series Forecasting

• Domain Expertise & Certifications: Pensions Benefit Management, Data Analyst Certification

WORK EXPERIENCE

Comerica Loans Jan 2024 - Present

Data Engineer

• Designed and implemented ETL pipelines in Azure Data Factory using PySpark in Azure Databricks to process over 10M loan and transaction records, demonstrating expertise in Azure and Python.

• Ingested daily data dumps from SQL Server into Azure Data Lake Storage Gen2, organizing storage into time-partitioned structures to support efficient query performance.

• Architected optimized Azure Synapse Analytics schemas using effective distribution and indexing strategies, aligning with best practices in data modelling and schema design for MS SQL Server environments.

• Crafted advanced SQL logic in Synapse SQL Pools to generate fraud detection KPIs and customer risk scores, enhancing query performance and reducing execution time by 35%.

• Developed incremental load strategies using watermarking and change data capture (CDC) in ADF pipelines, cutting pipeline latency by 60%.

• Automated data orchestration through Azure Data Factory pipelines and triggers, ensuring comprehensive error handling and end-to-end process traceability.

• Integrated Azure Monitor and Log Analytics for real-time pipeline performance tracking, aiding in prompt resolution of query performance issues.

• Built executive dashboards in Power BI that reduced reporting cycles from 3 days to 1 day, providing critical visual insights and training resources for end-users.

• Applied column-level encryption and real-time data masking in Synapse to ensure compliance with SOX and data privacy standards.

• Created data validation scripts in Python and SQL to verify record integrity, enforce schema constraints, and validate mapping logic.

• Tuned SQL queries and ETL jobs using caching, partitioning, and parallelism, improving execution time by up to 45% and demonstrating strong SQL and indexing capabilities.

• Participated in Agile sprint planning and backlog grooming sessions, effectively estimating effort for complex ingestion and transformation tasks.

• Documented the complete data lifecycle in Confluence, ensuring detailed tracking of data lineage and schema version control.

• Conducted regular peer reviews for ADF pipelines, Databricks notebooks, and Synapse queries to maintain high standards in code quality and scalability.
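The watermark-based incremental load described above can be sketched roughly as follows. This is a minimal illustration only, using an in-memory SQLite table as a stand-in for the SQL Server source, with hypothetical table and column names (`loans`, `modified_at`); the actual implementation ran as ADF pipeline activities, not Python.

```python
import sqlite3

def incremental_extract(conn, table, ts_col, last_watermark):
    """Return rows modified since last_watermark, plus the advanced watermark."""
    rows = conn.execute(
        f"SELECT id, amount, {ts_col} FROM {table} "
        f"WHERE {ts_col} > ? ORDER BY {ts_col}",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark to the newest timestamp seen in this batch;
    # if nothing changed, keep the previous watermark.
    new_watermark = max((r[2] for r in rows), default=last_watermark)
    return rows, new_watermark

# Demo with an in-memory table standing in for the source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER, amount REAL, modified_at TEXT)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [(1, 100.0, "2024-01-01"), (2, 250.0, "2024-01-02"), (3, 75.0, "2024-01-03")],
)

batch, wm = incremental_extract(conn, "loans", "modified_at", "2024-01-01")
# Only the two rows modified after the stored watermark are re-extracted,
# and the watermark advances to "2024-01-03" for the next run.
```

Persisting the watermark between runs (in a control table or pipeline variable) is what keeps each load incremental rather than a full reload.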

Coforge Nov 2019 - Jul 2022

Data Engineer

• Built AWS Glue jobs to ingest web activity, campaign interactions, and CRM leads from MongoDB Atlas collections, showcasing proficiency in AWS and data integration.

• Transformed nested JSON into flattened, tabular formats using AWS Glue DynamicFrames and PySpark scripts, increasing data processing speed by 40%.

• Established structured zone storage in Amazon S3 buckets across raw, staging, and curated layers, optimizing data organization and retrieval.

• Modeled over 15 dimensional data structures using Star and Snowflake schemas, which contributed to a 20% improvement in query performance and aligned with MS SQL Server data modeling practices.

• Wrote and refined SQL scripts in Redshift to calculate conversion rates, session-to-lead attribution, and bounce trends, emphasizing performance review of SQL queries.

• Developed Change Data Capture (CDC) logic for MongoDB using custom timestamp fields to facilitate daily incremental loads.

• Validated data pipelines using Python scripts in AWS Lambda, ensuring data quality through rigorous checks for nulls, duplicates, schema drift, and timestamp anomalies.

• Deployed Glue workflows with rule-based partitioning, reducing ingestion time by 30% and demonstrating expertise in query partitioning.

• Collaborated with the DevOps team to implement CI/CD pipelines using AWS CodePipeline for deploying Glue and Redshift artifacts.

• Secured data access using AWS Secrets Manager and IAM roles, effectively eliminating risks associated with hardcoded credentials.

• Documented datasets and data flows in the AWS Glue Data Catalog, maintaining clear data lineage in Confluence.

• Assisted Tableau developers by creating query-optimized Redshift views, thereby enhancing dashboard performance and illustrating capability with BI tools.

• Streamlined dashboard refreshes by integrating AWS CloudWatch Events with Tableau Server extract schedules, facilitating improved end-user support.

• Collaborated with multiple cross-functional teams in Agile sprints, completing 95% of sprint deliverables on time and contributing to improved data delivery SLAs.
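The nested-JSON flattening step mentioned above can be illustrated with a small sketch. This uses plain Python rather than Glue DynamicFrames, and the document fields (`lead_id`, `contact`, `region`) are hypothetical stand-ins for the MongoDB CRM documents.

```python
def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts into a single-level, tabular row."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, prefixing keys with the parent path.
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

doc = {
    "lead_id": 42,
    "contact": {"email": "user@example.com", "region": {"country": "US"}},
    "score": 0.87,
}
row = flatten(doc)
# row: {"lead_id": 42, "contact_email": "user@example.com",
#       "contact_region_country": "US", "score": 0.87}
```

In Glue, `Relationalize` or `DynamicFrame.unnest` perform the equivalent transformation at scale; the prefixed-key convention shown here mirrors their output column naming.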

EDUCATION

University of New Haven

Master of Science, Business Analytics

Malla Reddy College of Engineering and Technology

Bachelor of Technology

PROJECTS

Fraud Detection & Risk Analytics Modernization

• This project modernized fraud detection by building scalable ETL pipelines on AWS to process large volumes of financial transactions.

• Data was ingested, cleansed, and stored in Amazon S3, transformed in Redshift, and monitored using AWS Step Functions.

• Automated SQL-based workflows ensured accuracy and compliance, while Power BI dashboards provided real-time insights to fraud analysts.

• The solution reduced detection latency, improved reporting precision, and enabled secure, compliance-ready governance.

Real-Time Retail Customer Analytics

• Developed a real-time analytics platform for a retail client, capturing clickstream and transaction data from online stores.

• Data was streamed through Kafka into Azure Data Lake Storage, processed with PySpark, and modeled in Synapse Analytics.

• Designed optimized schemas for churn prediction and sales attribution, enabling actionable insights for marketing campaigns.

• Tableau dashboards visualized customer behavior trends, helping improve ROI, retention, and inventory planning.

• The system handled over 200K events per minute with high reliability.

CERTIFICATIONS

• Databricks Certified Data Engineer Associate Cert Prep: 5 Data Governance

• Fundamentals of Data Transformation for Data Engineering

• Microsoft Azure Data Engineer Associate (DP-203) Cert Prep by Microsoft Press

• Snowflake SnowPro Core Cert Prep


