Data Engineer Governance

Location:

Legacy Town Center North, TX, 75024

Salary:

750000

Posted:

October 15, 2025


Resume:

Sadhvika Nelavelli Data Engineer

Location: TX, USA

Phone: +1-806-***-****

Email: *******************@*****.***

LinkedIn

SUMMARY

Data Engineer with 3+ years of hands-on experience designing and optimizing enterprise-scale data pipelines across cloud platforms. Specialized in building ETL/ELT solutions using Snowflake, Databricks, and Azure, with proven ability to reduce query latency by 60% and automate workflows processing 5TB+ daily. Demonstrated expertise in predictive modeling, data governance, and delivering compliance-ready solutions (HIPAA, GDPR) that drive measurable business impact in insurance and HR analytics domains.

TECHNICAL SKILLS

Programming & Scripting: Python, SQL, T-SQL, R, Bash, Scala, Java

Cloud & Data Platforms: Snowflake, Google BigQuery, Azure Synapse, SQL Server, MySQL, Oracle, MS Access, Google Cloud Storage, Azure Blob Storage, AWS Redshift, AWS S3, Databricks

ETL & Data Integration: DBT (Data Build Tool), Apache Airflow, PySpark, SSIS, Google Cloud Dataflow, Cloud Composer, Informatica, Talend, AWS Glue, MuleSoft Anypoint Platform, Mule ESB, REST APIs, SOAP APIs, Batch & Streaming Pipelines

Data Modeling & Warehousing: Star Schema, Snowflake Schema, Dimensional Modeling, Data Vault, Normalized Modeling, Data Mart Design, OLTP & OLAP Systems, ER Modeling

Business Intelligence & Reporting: Power BI, Tableau, Cognos, SSRS, Excel (Advanced), Google Data Studio, Looker, QlikView

Libraries & Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn, TensorFlow, Keras, LangChain, NLTK, PyTorch

Data Governance & Compliance: Data Lineage, Metadata Management, RBAC, IAM, HIPAA, GDPR, CCPA, Data Masking, Data Validation Frameworks, Master Data Management (MDM)

Machine Learning & Analytics: Predictive Modeling, Regression, A/B Testing, Hypothesis Testing, Statistical Analysis, Time Series Forecasting, NLP, Clustering, Feature Engineering, Model Deployment (MLOps)

Big Data Ecosystem: Hadoop, Hive, Spark (Core, SQL, Streaming), Kafka, HDFS, Flink

API & Integration Platforms: MuleSoft Anypoint Studio, MuleSoft API Designer, MuleSoft Exchange, API Management, Postman, Swagger (OpenAPI), Apigee, Dell Boomi

Other Tools & Concepts: API Integration, Event-Driven Architecture, Data Security & Privacy, Agile/Scrum, Data Orchestration, Metadata Catalogs, Data Observability

PROFESSIONAL EXPERIENCE

Data Engineer Allstate, TX, USA Jan 2025 – Present

• Architected PySpark and Airflow ETL pipelines processing 5TB+ of claims data daily, reducing pipeline failures by 40% and improving data quality scores from 82% to 96% across enterprise analytics platforms.

• Optimized Snowflake warehouse configurations and restructured dimensional models, reducing claims reporting query latency from 8 minutes to 3 minutes and enabling real-time underwriting decisions for 2,000+ agents.

• Automated AWS S3-to-Redshift data synchronization workflows, achieving <15-minute latency for fraud detection systems and improving incident detection rates by 28% across 12 business units.

• Deployed policy lapse prediction model (Random Forest, Scikit-Learn) with 87% accuracy, identifying 15,000+ at-risk policies monthly and supporting $2.3M in retained premiums through proactive outreach campaigns.

• Built 8 interactive Power BI dashboards tracking claims settlement velocity and policy renewal metrics, eliminating 120+ manual analyst hours monthly and reducing executive reporting cycle from 5 days to same-day delivery.

• Implemented column-level data masking and RBAC policies in Snowflake, ensuring 100% HIPAA compliance during Q1 2025 audit while maintaining data access for 300+ authorized users across 15 departments.

• Collaborated with product owners in bi-weekly Agile sprints, delivering 12 pipeline enhancements on schedule with zero production incidents over a 3-month period.

Data Engineer Hexaware Technologies, India May 2021 – June 2023

• Designed Azure Data Factory ETL pipelines consolidating 250K+ employee records from 8 disparate HR systems (Workday, SAP, ADP) into centralized Databricks lakehouse, reducing data silos by 85%.

• Automated DBT transformation workflows and Databricks notebooks, processing 2M+ payroll transactions monthly with 99.7% accuracy while reducing manual data validation effort from 40 to 6 hours weekly.

• Migrated 3 legacy on-premise Oracle databases (2.5TB total) to Azure Synapse, implementing incremental load strategies that reduced refresh times by 70% (from 6 hours to 1.8 hours) and cut infrastructure costs by $18K annually.

• Created star schema data models in Azure Synapse supporting 15+ executive HR dashboards, improving workforce planning query performance by 65% and enabling drill-down analysis for 50+ HR business partners.

• Developed 6 Power BI dashboards visualizing attrition trends, diversity metrics, and recruitment funnel analytics, driving 22% improvement in executive decision-making speed per stakeholder surveys.

• Built Python-based anomaly detection framework identifying 95% of data quality issues before downstream consumption, increasing stakeholder confidence scores from 71% to 93% over 18 months.

• Deployed attrition prediction model (XGBoost, 83% precision) in Databricks ML, enabling proactive retention interventions for 300+ high-risk employees quarterly and reducing voluntary turnover by 14%.

• Partnered with the InfoSec team to implement RBAC across Azure Synapse and enforce PII data masking for 180K employee records, achieving 100% GDPR audit compliance in 2022 and 2023 assessments.

EDUCATION

M.S. in Computer Science, Texas Tech University, May 2025

Bachelor of Technology in Computer Science, Lovely Professional University, Punjab, India, May 2023

SnowPro Core Certification – Snowflake Inc.

dbt Fundamentals Certification – dbt Labs

Google Cloud Professional Data Engineer – Google
