Shivani B
Mobile: 078******** Email: *************@*****.*** Location: Coventry
Summary: Experienced Data Engineer and Analyst with deep expertise across Microsoft Azure and the wider Microsoft data platform, as well as metadata-driven architectures. Skilled in designing resilient Lakehouse solutions using Python, PySpark, and SQL, and delivering both batch and streaming pipelines with Azure Event Hubs. Proficient in dimensional modelling and Power BI, with hands-on experience in ADLS, Delta Lake, and Snowflake. Adept at translating business requirements into reliable, maintainable data products and collaborating effectively with cross-functional teams.
Career Summary
Strong grounding in DWH methods, including star/snowflake design and medallion layering, while maintaining transparency, auditability, and strong operational controls throughout delivery.
Built metadata-driven pipelines, dataflows, and notebooks in SSIS, ADF, Synapse, Databricks, Python, PySpark, and SQL with automated deployments, clear documentation, and best practices.
Advanced Python and PySpark skills complemented by sound engineering practices around packaging, linting, testing, and CI/CD, aligned to enterprise standards, security policies, and GDPR.
Delivered real-time ingestion with Databricks Structured Streaming and Event Hubs for compliance and operations use cases, so teams can support, evolve, and troubleshoot outcomes.
Optimized pipelines and Spark jobs with tuning, caching, broadcast joins, and partitioning to improve throughput and cost, providing measurable value to stakeholders and sustaining green SLAs.
Hands-on with ADLS, Delta Lake, and OneLake, and confident writing and tuning performant SQL across large datasets, enabling governed self-service analytics and trusted insights for data-driven decisions.
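The metadata-driven pattern mentioned above can be sketched in plain Python; all names here (sources, formats, target paths) are hypothetical and for illustration only, not a real implementation:

```python
# Minimal sketch of a metadata-driven ingestion loop: each entry describes
# one source-to-target movement, and a single generic routine executes them.
# Source names, formats, and paths are illustrative only.

PIPELINE_METADATA = [
    {"source": "sales_csv", "format": "csv", "target": "bronze/sales"},
    {"source": "orders_api", "format": "json", "target": "bronze/orders"},
]

def run_ingestion(metadata):
    """Iterate the metadata table and return the planned ingestion steps."""
    plan = []
    for entry in metadata:
        # In a real ADF/Databricks setup this would invoke a copy activity
        # or notebook; here we only record the intended movement.
        plan.append(f"load {entry['source']} ({entry['format']}) -> {entry['target']}")
    return plan

for step in run_ingestion(PIPELINE_METADATA):
    print(step)
```

Adding a new source then means adding one metadata row rather than writing a new pipeline.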
Technical Skills Summary
Cloud: Azure, Snowflake, AWS
BI & Data Analytics: Excel, SSRS, Tableau, Power BI, DOMO
Cloud Databases: Azure SQL DB, RedShift, Snowflake, Synapse Pools, Cosmos DB
ETL/ELT: SSIS, ADF, Databricks, Fabric, Synapse
Scripting Languages: SQL, KQL, Python, PySpark, BigQuery
Data Governance: RBAC, RLS, OLS, Azure Purview
Automation: DevOps, CI/CD, Git, YAML, Azure Functions, Logic Apps, Power Automate
Others: Agile, Scrum, Waterfall, Jira, Office 365, Documentation
Professional Work Experience
Worked for Nationwide Building Society (Contract), Duration: November’23 – Present
Created Application Interface Documents for downstream processes to establish new interfaces for file transfer and reception through Azure Data Share, including building pipelines and complex data transformations using ADF, Python, PySpark, SQL, Databricks, ADLS, and Delta Lake.
Developed batch processing solutions using ADF, Fabric, and Databricks; worked with JSON, Avro, and Parquet files, converting between formats and parsing semi-structured JSON data.
Ingested data in mini-batches and performed RDD transformations on those mini-batches using Spark Streaming to deliver streaming analytics in Databricks and Synapse.
Used custom connectors in Databricks to integrate with third-party APIs, NoSQL databases, and web pages, extending ETL capabilities and enabling seamless data ingestion.
Utilized Azure Synapse, Databricks, Event Hub and PolyBase for seamless data transfer, enabling efficient movement of data between different systems and achieving streamlined data integration.
Used Azure Key Vault as a central repository for secrets and referenced those secrets from pipelines.
Implemented monitoring and logging mechanisms within Azure Logic Apps to track workflow execution, troubleshoot issues, and ensure the reliability and performance of data integration.
Integrated applications and services with Dataverse, Data Lake, Azure SQL DB, managing data and creating custom connectors to meet ETL/ELT project requirements for source integrations.
Developed interactive Power BI dashboards and reports, providing actionable insights and facilitating data-driven decision-making, and published into Power BI Services.
Designed and deployed Power Apps solutions, including Canvas and Model-Driven Apps, Power Automations to address specific business needs and improve operational efficiency.
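As an illustration of the semi-structured JSON parsing described in this role, a minimal flattening routine in plain Python; the record shape and field names are invented for the example:

```python
import json

def flatten(record, prefix=""):
    """Recursively flatten nested JSON objects into dot-delimited columns,
    the usual first step before landing semi-structured data in a table."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

# Hypothetical payload, not a real feed
raw = '{"id": 1, "customer": {"name": "A. Smith", "address": {"city": "Coventry"}}}'
row = flatten(json.loads(raw))
print(row)  # {'id': 1, 'customer.name': 'A. Smith', 'customer.address.city': 'Coventry'}
```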
Worked for Latrobe Health services, Duration: January’22 – September’22
Partnered with business leaders to shape data-led solutions and improve project outcomes, while maintaining transparency, auditability, and strong operational controls throughout delivery.
Designed and integrated Azure ETL/ELT services to move data reliably from source to target, with clear documentation, structured reviews, and automated checks embedded in the lifecycle.
Built real-time pipelines with Event Hubs and Databricks Structured Streaming, cutting processing latency and enabling timely ingestion on Azure, aligned to enterprise standards, security, and GDPR.
Developed ingestion patterns for Avro, Parquet, JSON, Hive, and ORC using Unity Catalog, ADLS, and Delta Lake for scalable governance, so teams can support, evolve, and troubleshoot outcomes.
Implemented API-based integrations for batch and streaming via Event Hubs, Databricks, Python, Stream Analytics, and Logic Apps, providing measurable value to stakeholders across releases.
Configured Event Hubs triggers and tuned Databricks clusters to handle high-volume, low-latency streaming workloads, enabling governed self-service reports and trusted insights for users.
Used Spark SQL, RDD transformations, DataFrames, and Pandas where appropriate to build robust pipelines and checks while maintaining transparency, auditability, and strong operational controls.
Established lineage and metadata with Purview across ADLS and Delta to deliver end-to-end visibility over pipelines, with clear documentation, structured reviews, and automated checks.
Leveraged Python libraries (Pandas, NumPy, PyTest, scikit-learn) for analysis, data quality, feature prep, and validation, aligned to enterprise standards, security policies, and pragmatic cost control.
Built secure, compliant pipelines on Fabric and Azure to ingest, land, and transform data into OneLake products, providing measurable value to stakeholders and sustaining green SLAs across releases.
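The streaming work in this role rests on windowed aggregation over event streams; the tumbling-window semantics can be sketched without a cluster in plain Python (timestamps and event names are made up):

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (epoch_seconds, key) events into fixed tumbling windows,
    mimicking a streaming groupBy-window count aggregation."""
    counts = defaultdict(int)
    for ts, key in events:
        # Assign each event to the window containing its timestamp
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

# Hypothetical event stream: (seconds, event type)
events = [(0, "login"), (5, "login"), (12, "logout"), (13, "login")]
print(tumbling_window_counts(events, 10))
# {(0, 'login'): 2, (10, 'logout'): 1, (10, 'login'): 1}
```

In Structured Streaming the same grouping happens incrementally per micro-batch rather than over a complete list.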
Worked for Rail Cargo Australia, Duration: March’21 - November’21
Aligned with users on data requirements and delivered scalable, reliable solutions across multiple domains, enabling governed self-service reports and trusted insights for business decision makers.
Performed large-scale transformations in Databricks and Spark, optimizing PySpark with caching, partitioning, and fit-for-purpose cluster sizing while maintaining transparency and auditability.
Normalized complex semi-structured files (JSON, XML, Parquet) into Azure SQL using Databricks, Synapse, T-SQL, and Python, with clear documentation, structured reviews, and automated checks.
Built Logic Apps for notifications and alerts across ADF, Synapse, and SQL Database for proactive monitoring, aligned to enterprise standards, security policies, and pragmatic cost control.
Used YAML to define stage-specific ETL/ELT behaviour supporting medallion layers, with reusable configs for mapping and quality that make outcomes easier to evolve and troubleshoot in production.
Contributed to both greenfield builds and mature tier-1 applications that enhanced financial and data solutions, providing measurable value to stakeholders and sustaining green SLAs across releases.
Integrated Dataverse customer data into a unified source of truth for segmentation, targeting, and personalization, enabling governed self-service BI and trusted insights for decision makers.
Automated data processes to improve accuracy and availability, and supported an Azure-based DWH and lake alongside partners while ensuring operational controls throughout delivery.
Produced functional, technical, and non-technical specifications from business requirements, with clear documentation, structured reviews, and automated checks embedded in the lifecycle.
Designed and maintained Power BI datasets and reports exposing key financial and operational insights to end users, aligned to enterprise standards, security policies, and RLS.
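The stage-specific, YAML-driven behaviour described for the medallion layers amounts to a config lookup per layer. A minimal sketch in Python (stage names follow the common bronze/silver/gold convention; the fields are illustrative, and in practice would be loaded from a YAML file rather than hard-coded):

```python
# Illustrative per-stage ETL behaviour, mirroring what a YAML config might hold
STAGE_CONFIG = {
    "bronze": {"format": "parquet", "quality_checks": False, "dedupe": False},
    "silver": {"format": "delta", "quality_checks": True, "dedupe": True},
    "gold": {"format": "delta", "quality_checks": True, "dedupe": False},
}

def behaviour_for(stage):
    """Return the ETL behaviour for a medallion stage, failing loudly on typos."""
    if stage not in STAGE_CONFIG:
        raise KeyError(f"unknown stage: {stage}")
    return STAGE_CONFIG[stage]

print(behaviour_for("silver"))  # {'format': 'delta', 'quality_checks': True, 'dedupe': True}
```

Keeping the behaviour in config rather than code lets the same pipeline serve every layer.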
Worked for Gateway Bank, Duration: June’20 – February’21
Planned work with clients and end users, applying Agile practices to turn requirements into working deliverables that teams can support, evolve, and troubleshoot confidently in production environments.
Created parameterized Databricks notebooks and ADF pipelines triggered by Logic Apps, schedules, and blob file-arrival events, providing measurable value to users and sustaining green SLAs across releases.
Optimized data models, SQL queries, and storage choices for both performance and cost in cloud environments, enabling governed data analytics and trusted insights for business decision makers.
Hardened data engineering processes using DevOps and built resilient solutions on-prem and in Azure with ADF and Databricks, leveraging Python, PySpark, ADLS, and the medallion architecture.
Converted unstructured data into interactive reports with ADF, Power BI, Power Query, and Power Pivot for visibility, with clear documentation and structured reviews embedded in the lifecycle.
Deployed Azure repositories and analytics models to support decision making, aligned to enterprise standards, security policies, and pragmatic cost governance practices.
Gathered BI/reporting needs and translated questions into data solutions with clear, actionable insights, providing measurable value to stakeholders and sustaining green SLAs across releases.
Applied Lakehouse and DWH principles, modelling, and schema design across various source datasets, enabling data and business insights that supported stakeholders in their decision making.
Implemented error and exception handling to surface data quality issues to DQ analysts and data architects while maintaining strong operational controls throughout project delivery.
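The error and exception handling above typically follows a quarantine pattern: invalid rows are routed aside with a reason for DQ analysts rather than failing the whole load. A minimal sketch, with the validation rules and field names invented for illustration:

```python
def validate_rows(rows):
    """Split rows into (valid, quarantined); quarantined rows carry a reason
    so DQ analysts can investigate without blocking the pipeline."""
    valid, quarantined = [], []
    for row in rows:
        if row.get("id") is None:
            quarantined.append((row, "missing id"))
        elif not isinstance(row.get("amount"), (int, float)):
            quarantined.append((row, "non-numeric amount"))
        else:
            valid.append(row)
    return valid, quarantined

# Hypothetical batch with two bad rows
rows = [{"id": 1, "amount": 9.5}, {"id": None, "amount": 3}, {"id": 2, "amount": "x"}]
ok, bad = validate_rows(rows)
print(len(ok), [reason for _, reason in bad])  # 1 ['missing id', 'non-numeric amount']
```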
Worked for Palmerbet, Duration: August’19 - April’20
Worked with SMEs to capture business and technical needs and convert them into practical designs with clear documentation, structured reviews, and automated checks embedded in the lifecycle.
Developed and supported data collection, integration, and ETL with SSIS and ADF to deliver data to key interfaces by aligning to enterprise standards, best practices, and automation.
Maintained operational reporting layers that met requirements for near real-time use cases, so teams can support, evolve, and troubleshoot outcomes confidently in production environments.
Ensured accuracy and integrity across Power BI reports and the underlying models for consistent decision making, providing measurable value to stakeholders and sustaining SLAs across releases.
Built and supported intuitive dashboards in Power BI/Tableau and SSRS paginated reports surfacing KPIs, enabling governed self-service analytics and trusted insights for business decision makers.
Advised on Power Platform best practices, including CoE Starter Kit adoption and common admin tasks, while maintaining source systems and the SSRS, Power BI, and Tableau servers.
Applied Power Query with query folding and reusable functions to improve overall data model performance, aligned to stakeholder needs and short- and long-term business goals.
Used Power BI Service and Report Builder to publish reports and datasets across secured workspaces, so teams can support, evolve, and troubleshoot outcomes confidently in production environments.
Implemented dataset security with row-level security (RLS) and configured gateways and refresh schedules, providing measurable value to stakeholders and sustaining green SLAs across releases.
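Row-level security as applied above amounts to filtering each dataset row by the viewing user's entitlements (in Power BI this is a role with a DAX filter). The idea can be sketched in plain Python; the user-to-region mapping is invented for illustration:

```python
# Hypothetical mapping of report viewers to the regions they may see;
# in Power BI this would live in a role definition, not application code.
USER_REGIONS = {
    "analyst@example.com": {"North"},
    "manager@example.com": {"North", "South"},
}

def apply_rls(rows, user):
    """Keep only the rows whose region the given user is entitled to view."""
    allowed = USER_REGIONS.get(user, set())  # unknown users see nothing
    return [row for row in rows if row["region"] in allowed]

sales = [{"region": "North", "amount": 10}, {"region": "South", "amount": 20}]
print(apply_rls(sales, "analyst@example.com"))  # [{'region': 'North', 'amount': 10}]
```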
Educational Qualifications
Master's in Business and Management
Strathclyde Business School, Glasgow, UK (September 2022 – September 2023)
Bachelor's in Electronic and Communications Engineering
CVR College of Engineering, Hyderabad, India (August 2015 – April 2019)
Professional Certifications
Microsoft: Azure Fundamentals (AZ-900)
Microsoft: Azure Data Fundamentals (DP-900)
Microsoft: Power Platform Fundamentals (PL-900)
Other Details
Visa: Graduate Visa
Sponsorship: Required
Availability: Immediate
References: Available