Manisha Nampally
Email: *****************@*****.*** Phone: 514-***-****
LinkedIn: www.linkedin.com/in/manisha-nampally-92ab213a4
Professional Summary
Data Engineer with 6+ years of experience building scalable big data infrastructure and analytics platforms using Apache Spark (PySpark, Spark SQL) and distributed data processing frameworks. Proven expertise in designing batch and near real-time pipelines processing high-volume datasets (20M–500M+ daily records) to support compliance monitoring, revenue analytics, KPI reporting, and product performance insights. Experienced in designing high-performance data structures and semantic data layers that support business intelligence analytics and multi-dimensional reporting.
Strong background in dimensional data modeling, data warehousing, and performance optimization to ensure high availability, resiliency, and SLA adherence. Experienced in partnering with Product, Finance, Risk, and Business stakeholders to translate complex requirements into reliable data solutions, dashboards, and actionable insights.
Hands-on expertise in Python, SQL, workflow orchestration (Airflow), CI/CD automation, and BI reporting tools, with a strong focus on scalable system design, data quality, observability, and long-term platform reliability.
Tools & Technologies
Programming & Query Languages
Python
SQL (Advanced)
PySpark
Spark SQL
Big Data & Distributed Processing
Apache Spark
Distributed Data Processing
Batch & Near Real-Time Processing
Data Partitioning & Performance Optimization
Delta Lake
Data Warehousing & Modeling
Dimensional Modeling (Star Schema, SCD Type 2)
Data Warehousing Concepts
Snowflake
Azure Synapse
Workflow Orchestration
Apache Airflow
Azure Data Factory
Cloud Platforms
Microsoft Azure (Databricks, ADLS Gen2)
AWS (S3, Glue; working knowledge)
BI & Analytics
Power BI
KPI & Funnel Analytics
Data Mining & Ad-hoc Analysis
DevOps & CI/CD
Azure DevOps
Git
CI/CD Pipelines
Infrastructure Automation
Monitoring & Data Quality
Data Validation Frameworks
Logging & Observability
SLA Monitoring
Educational Background
- Bachelor's in Information Technology, JNTUH College of Engineering, Hyderabad, Telangana, India.
Professional Experience
TELUS Health – ON
Azure Data Engineer
April 2024 – Present
Project: Telecom Compliance & Revenue Analytics Platform
Description:
Designed and scaled distributed Spark-based data pipelines powering telecom subscriber analytics, revenue compliance monitoring, and churn analysis across 500M+ daily billing and usage records.
- Architected scalable Spark (PySpark + Spark SQL) pipelines processing 20M–500M+ telecom billing and subscriber records daily, improving data reliability and reducing failure rates by 35%.
- Designed dimensional models (Star Schema, SCD Type 2) to support compliance reporting, revenue assurance, and subscriber lifecycle analytics, improving query performance by 40%.
- Built batch and near real-time data workflows to support telecom revenue leakage detection and campaign performance tracking, enabling business teams to reduce revenue discrepancies by $1M+ annually.
- Partnered with Product Managers and Finance stakeholders to define KPIs including ARPU, churn rate, subscriber growth, and conversion metrics, translating business needs into scalable data architecture.
- Automated CI/CD deployment pipelines using Git and Azure DevOps, reducing release deployment time by 50% and improving production stability to 99.9% SLA adherence.
- Implemented data quality, monitoring, and observability frameworks, reducing data incidents by 30% across distributed billing systems.
- Developed interactive Power BI dashboards enabling stakeholders to analyse subscriber funnel metrics, segmentation, campaign performance, and revenue growth trends.
- Conducted ad hoc data mining and analysis using Spark SQL and notebooks, identifying churn-driving patterns that improved targeted retention campaigns by 15%.
- Optimized partitioning and file compaction strategies in Delta Lake, improving large-scale query performance by 45%.
- Engineered high-performance, multi-dimensional data models (Star Schema, SCD Type 2) for telecom and banking analytics, supporting 500M+ daily records and improving complex BI query performance by 45–60%.
- Designed denormalized and aggregated data structures optimized for OLAP-style reporting, reducing dashboard load times from 20s to under 5s.
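In simplified form, the SCD Type 2 handling described above can be sketched as follows. This is a minimal pure-Python illustration only (table shape, column names such as subscriber_id and plan, and the merge helper are all hypothetical, not the production Spark implementation): a changed attribute closes the current dimension row and opens a new versioned one.

```python
from datetime import date

# Hypothetical simplified SCD Type 2 merge: `dim` is the dimension table
# (list of row dicts), `incoming` is today's snapshot keyed by subscriber_id.
def scd2_merge(dim, incoming, today):
    by_key = {r["subscriber_id"]: r for r in dim if r["is_current"]}
    for key, new_attrs in incoming.items():
        current = by_key.get(key)
        if current and current["plan"] == new_attrs["plan"]:
            continue  # no change: keep the current row open
        if current:
            current["is_current"] = False  # close the previous version
            current["end_date"] = today
        dim.append({                       # open the new current version
            "subscriber_id": key,
            "plan": new_attrs["plan"],
            "start_date": today,
            "end_date": None,
            "is_current": True,
        })
    return dim

dim = [{"subscriber_id": 1, "plan": "basic",
        "start_date": date(2024, 1, 1), "end_date": None, "is_current": True}]
dim = scd2_merge(dim, {1: {"plan": "premium"}}, date(2024, 6, 1))
```

In a Spark/Delta Lake setting the same pattern is typically expressed as a MERGE over the dimension table rather than in-memory row updates.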
Environment:
Big Data & Distributed Systems: Apache Spark, PySpark, Spark SQL, Delta Lake, Azure Databricks
Data Warehousing & Analytics: Snowflake, Azure Synapse Analytics, Star Schema/SCD Type 2 modeling
Workflow Orchestration & ETL: Apache Airflow, Azure Data Factory
Cloud Platforms: Microsoft Azure (Databricks, ADLS Gen2), AWS (S3, Glue; working knowledge)
Programming & Scripting: Python, SQL, Scala (working knowledge)
BI & Reporting Tools: Power BI, Tableau
CI/CD & Version Control: Azure DevOps, Git, Automated Pipeline Integration
Monitoring & Quality Frameworks: Data validation & reconciliation frameworks, SLA monitoring, logging, root cause analysis
---
RBC Bank – ON
BI Analyst/Power BI Developer
October 2021 – May 2024
Description: Designed, developed, and optimized enterprise-scale financial data pipelines and analytical solutions, processing 500M+ banking records daily across accounts, loans, and credit portfolios. Implemented cloud-first data orchestration, automated reconciliation workflows, and robust governance frameworks to ensure accuracy, regulatory compliance, and operational efficiency.
- Developed automated reconciliation and anomaly detection pipelines in Python and SQL for daily transaction feeds, reducing manual validation effort by 65% and preventing $1M+ potential reconciliation discrepancies.
- Implemented dynamic risk assessment dashboards in Power BI, integrating credit exposure, delinquency, and portfolio metrics, improving decision-making speed for senior management by 40%.
- Built automated ETL monitoring and alerting using Azure Data Factory and Airflow, proactively detecting failed loads and ensuring 99.9% reporting SLA compliance.
- Optimized high-volume SQL transformations and aggregations for month-end close, cutting report generation time from 8 hours to 3 hours (63% faster).
- Developed metadata-driven ETL frameworks enabling consistent handling of new financial data sources, reducing onboarding time from 2 weeks to 4 days.
- Designed role-based data access and PII masking policies for 2000+ internal users, ensuring compliance with privacy standards and audit requirements.
- Conducted lineage tracking and impact analysis across ETL pipelines, enabling faster root cause resolution for data discrepancies and reducing downtime by 30%.
- Implemented historical data archival and partitioning strategies, improving query performance by 50% while reducing storage costs by 20%.
- Collaborated with finance, risk, and audit teams to develop automated reporting templates aligned with Basel III and IFRS, improving compliance review efficiency by 35%.
- Developed self-service analytics datasets for branch and corporate finance teams, cutting dependency on IT for ad hoc queries by 30% and enabling faster data-driven decisions.
- Led knowledge-sharing sessions on SQL optimization, ETL automation, and dashboard design, improving team skill adoption and reporting quality.
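The reconciliation and anomaly detection work above follows a common compare-and-flag pattern, which can be sketched in a minimal pure-Python form. All names here (account IDs, the reconcile helper, the tolerance value) are illustrative assumptions, not the production pipeline, which would read from databases rather than dicts.

```python
# Hypothetical simplified reconciliation check: compare per-account totals
# from a source feed against the warehouse and flag any discrepancy beyond
# a tolerance, plus accounts that appear on only one side.
def reconcile(source_totals, warehouse_totals, tolerance=0.01):
    discrepancies = []
    for account, src_amount in source_totals.items():
        wh_amount = warehouse_totals.get(account, 0.0)
        if abs(src_amount - wh_amount) > tolerance:
            discrepancies.append((account, src_amount, wh_amount))
    # accounts present only in the warehouse are also suspect
    for account in warehouse_totals.keys() - source_totals.keys():
        discrepancies.append((account, 0.0, warehouse_totals[account]))
    return discrepancies

issues = reconcile(
    {"A-100": 1250.00, "A-200": 980.50},
    {"A-100": 1250.00, "A-200": 975.50, "A-300": 40.00},
)
```

In practice, flagged rows like these feed the alerting and manual-review queue rather than being corrected automatically.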
Environment: Azure Data Factory, Azure SQL Data Warehouse, SQL Server, Oracle, Databricks, Python, Power BI, Airflow, Azure DevOps, Git
---
Maxso Technologies – India
Client: DHL
ETL Analyst/Developer
June 2019 – August 2021
Description: Worked as an ETL Analyst/Developer for DHL, responsible for the end-to-end design, development, and optimization of high-volume ETL pipelines and data warehouse solutions for operational and financial logistics data. Processed 10M+ shipment and billing records daily, ensuring accurate reporting, KPI tracking, and SLA compliance, with a focus on automation, reconciliation, and analytics to improve operational decision-making and reduce manual intervention.
- Developed batch ETL workflows processing 10M+ shipment and billing records daily into SQL Server and Teradata data warehouses, supporting operational and financial reporting.
- Implemented automated Python/SQL reconciliation frameworks for shipment, delivery, and billing data, reducing manual verification by 60% and preventing $500K+ potential billing errors.
- Developed real-time exception monitoring and alerting for ETL pipelines, improving SLA adherence from 95% to 99% for daily reporting workflows.
- Designed metadata-driven ETL frameworks to handle schema changes and source evolution automatically, reducing manual updates by 50% and improving team productivity.
- Built dynamic Tableau dashboards integrating logistics, billing, and delivery KPIs, enabling operations teams to identify delays and route inefficiencies, improving delivery performance by 25%.
- Optimized batch and incremental ETL strategies, reducing pipeline runtime by 50% for daily shipment and billing processing.
- Developed cost-efficient data storage strategies, including partitioning and delta loading, reducing storage costs by 15% while maintaining performance.
- Conducted data lineage and audit tracking for all ETL processes, enabling end-to-end traceability and SLA compliance for operational reporting.
- Collaborated with cross-functional finance, logistics, and operations teams to standardize KPIs and deliver actionable insights for invoicing, route efficiency, and hub throughput metrics.
- Implemented data quality and anomaly detection rules in PySpark and SQL, reducing reporting errors by 35% and ensuring trusted operational metrics.
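The incremental/delta loading mentioned above usually rests on a watermark: only records newer than the last processed timestamp are loaded, and the watermark then advances. A minimal pure-Python sketch of that idea (record shape, field names like shipment_id and updated_at, and the helper are hypothetical, not DHL's actual schema):

```python
# Hypothetical watermark-based incremental (delta) load: select only records
# newer than the last watermark, then advance the watermark to the maximum
# timestamp seen in this batch.
def incremental_load(records, watermark):
    new_records = [r for r in records if r["updated_at"] > watermark]
    new_watermark = max(
        (r["updated_at"] for r in new_records), default=watermark
    )
    return new_records, new_watermark

feed = [
    {"shipment_id": "S1", "updated_at": 100},
    {"shipment_id": "S2", "updated_at": 205},
    {"shipment_id": "S3", "updated_at": 310},
]
batch, wm = incremental_load(feed, watermark=200)
```

Re-running with the advanced watermark yields an empty batch, which is what makes the load safely repeatable.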
Environment: SQL Server, Teradata, Oracle, SSRS, Tableau, Python, PySpark, ETL workflows, Data Warehousing, Batch Processing, Airflow, Delta Lake