Purna Bhuvaneswari Devi Akkala
USA +1-205-***-**** ***********.****@*****.*** LinkedIn
SUMMARY
Data Engineer with nearly 4 years of experience building production-grade data pipelines using Python, SQL, and PySpark. Skilled in ETL orchestration (Azure Data Factory), Azure cloud services (Azure Data Lake, Azure Synapse), and data warehousing (Azure Synapse, SQL Data Warehouse). Delivered secure, scalable data solutions across healthcare, finance, and manufacturing, enabling analytics teams to drive decisions from 20M+ row datasets. Strong track record in pipeline optimization, data quality automation, and cross-team collaboration to support reporting, compliance, and business performance goals.
EDUCATION
Master of Science in Computer Science Dec 2024
University of Alabama at Birmingham, Birmingham, AL GPA: 3.90/4.0
Bachelor of Technology in Computer Science May 2021
KL University, Vijayawada, India GPA: 8.50/10.0
TECHNICAL SKILLS
Programming & Query Languages: Python, PySpark, SQL, Scala, Java, R
Data Engineering & ETL: Azure Data Factory (ADF), Azure Synapse Analytics, Apache Airflow, Azure Data Lake Storage Gen2, Data Ingestion, Data Transformation, Workflow Orchestration
Data Warehousing & Modeling: Snowflake, Azure Analysis Services (AAS), Tabular Editor, Star & Snowflake Schema Design, Semantic Layer Modeling, Cube Optimization
Data Visualization & Reporting: Power BI (DAX, Calculation Groups, Time Intelligence), Tableau, Google Looker Studio (Data Studio), Microsoft Excel
Cloud Platforms: Microsoft Azure (ADF, Synapse, AAS, ADLS), Google Cloud Platform (GCP – Compute Engine)
AI & Machine Learning Exposure: Conversational AI, Prompt Engineering, LLM Optimization, Cursor AI, CNN Model for Image Classification (Python)
Tools & Development Platforms: Git, Jupyter Notebook, Anaconda, Visual Studio Code, Eclipse IDE, Selenium Automation
Data Governance & Security: Role-Based Access Control (RBAC), Data Confidentiality, Compliance in Enterprise Environments
Collaboration & Agile Delivery: Cross-functional Collaboration, Requirement Gathering, Stakeholder Engagement, Agile Delivery, Release Validation, UAT Support
PROFESSIONAL EXPERIENCE
Dynetics, Inc., USA Aug 2024 - Current
Data Engineer
• Developed distributed Spark jobs using Azure Synapse Spark Pools to process large-scale sensor data (5–10TB/week), supporting advanced analytics that improved manufacturing defect detection by over 30%.
• Built streaming data pipelines using Azure Event Hubs and Azure Stream Analytics, cutting SCADA event ingestion latency from 12 minutes to 90 seconds, improving real-time visibility into production operations.
• Automated KPI reporting using Power BI and Azure Synapse with DAX and stored procedures, reducing Excel-based workflows by 85% and time-to-insight by over 50% across plant operations.
• Integrated MES and IoT telemetry sources under a unified schema within Azure Data Lake, improving stream and batch compatibility and reducing null value issues by 52%.
• Applied Azure RBAC and Purview for fine-grained access control over 100M+ machine records, ensuring HIPAA-compliant data governance and secure sharing across teams.
• Created monitoring scripts using Python and Azure Monitor Logs (KQL) to track data pipeline health and alert on anomalies via Azure Notifications, reducing ETL job failures by 60%.
• Collaborated with embedded systems and firmware teams to optimize timestamp alignment in streaming data flows, improving synchronization across edge devices and data lake tables by 34%.
Infosys Limited, India Aug 2021 - Dec 2022
Data Analyst / Data Engineer
• Developed modular PySpark scripts to parse and normalize 835/837 EDI files, enhancing claims data accuracy and reducing manual data cleansing efforts by 42% across end-to-end ingestion workflows.
• Delivered Power BI dashboards using parameterized DAX and row-level security, ensuring HIPAA-compliant access and enabling role-specific, interactive analytics for five business units with zero data exposure violations.
• Built automated SQL validations for Synapse staging layers, catching anomalies pre-load and reducing reprocessing by 30%, which improved reporting availability and cut downstream QA revision time.
• Applied Great Expectations with Python to flag over 600 ingestion anomalies, preventing downstream model disruptions and enhancing data trust for operational, clinical, and regulatory reporting processes.
• Collaborated with QA and product teams to validate data models, increasing semantic accuracy and reducing dashboard query rewrites by 28%, improving usability across business and clinical functions.
• Optimized partitioning and indexing strategy for Synapse in collaboration with Azure architects, improving dashboard refresh times and API response speeds by 40%, enhancing reporting performance significantly.
Deloitte, India Mar 2020 - Jul 2021
Data Engineer - Associate
• Built batch ETL pipelines using Talend, Python, and Azure SQL to process semi-structured sales data, improving refresh latency by 60% for retail performance reports.
• Migrated legacy on-prem Hadoop jobs to Azure HDInsight, cutting infrastructure overhead by 27% and improving SLA adherence on overnight batch pipelines.
• Designed audit-compliant data marts in Azure Synapse with dynamic data masking and RBAC, eliminating all compliance violations flagged in internal Q2 audit.
• Created interactive dashboards in Tableau using live connections to Azure SQL Database, enabling real-time visibility into shipping delays and reducing escalations by 22%.
• Built and maintained CI/CD pipelines using Azure DevOps, integrating data workflows with Git and automated deployment validations to reduce release rollbacks and downtime.
PROJECTS
Healthcare Claims Analysis
• Analyzed over 2 million healthcare claim records using SQL and Excel to detect anomalies, streamline data pipelines, and reduce inconsistencies impacting insurance payouts and operational efficiency.
• Built Power BI dashboards highlighting fraud-prone providers, claim frequency trends, and regional inefficiencies, delivering actionable insights for senior stakeholders across compliance, risk, and operations teams.
• Collaborated with analysts and data teams to translate insights into business strategies, contributing to $50,000 in measurable cost optimization and enhancing claims auditing processes through automated data quality checks.
Sales Performance Dashboard
• Developed an interactive Power BI dashboard to visualize sales metrics across geographies, implementing real-time filters and drill-downs for actionable insights across executive and regional management teams.
• Engineered advanced DAX calculations and dynamic measures to track conversion rates, YoY revenue growth, and lead pipeline effectiveness, aligning business goals with real-time sales performance indicators.
• Integrated Excel-based data sources into Power BI models, automating refresh schedules and reducing manual reporting efforts by 60%, significantly improving data accessibility and stakeholder response time.
CERTIFICATIONS
Data Science with Python - Simplilearn
Architecting with Google Compute Engine - Coursera
ServiceNow Certified System Administrator
Automation Anywhere University Certified Advanced RPA Professional
ACHIEVEMENTS
Certificate of Excellence - Awarded by Infosys Limited
Recognized for exceptional performance and contributions during tenure at Infosys Limited.
Certificate of Appreciation for Outstanding Contribution - Awarded by Infosys Limited
Honored for significant contributions and commitment to excellence while working at Infosys Limited.