Vishal Reddy
TN, USA +1-901-***-**** ******.******@*****.*** LinkedIn
SUMMARY
Data Engineer with 4 years of experience in healthcare and financial services, specializing in data governance, metadata management, and master data initiatives. Proven expertise in designing data models, establishing data lineage, creating data dictionaries, and standardizing processes to support enterprise-wide decision-making. Proficient in SQL, Python, Power BI, AWS, and Snowflake, with a passion for data quality, compliance (HIPAA/GDPR), and cross-functional collaboration in Agile environments.
CORE SKILLS
Programming & Analytics: Python, SQL, Scala, R, SAS, SPSS, Alteryx, dbt
Data Modeling & Transformation: T-SQL, DAX, Pandas, NumPy
ETL & Data Orchestration: Apache Airflow, AWS Glue, Azure Data Factory, Databricks, Informatica, Talend, SSIS, SSRS
Data Warehousing & Databases: Snowflake, Amazon Redshift, PostgreSQL, MySQL, Oracle, MongoDB
Big Data & Processing: Apache Spark, PySpark, Hadoop, MapReduce, HDFS, Hive, Kafka
Machine Learning & NLP: Scikit-learn, TensorFlow, MLflow, GPT, LLMs, spaCy
Cloud Platforms: AWS (S3, EC2, Lambda), Azure (Data Lake, Synapse, ADF)
Data Governance & Management: Master Data Management (MDM), Data Lineage, Metadata Repositories, Data Dictionaries, Data Quality Frameworks, Compliance (HIPAA, GDPR), Business Glossary, IAM, KMS
Model Deployment & Serving: Kubeflow, Docker, Kubernetes
MLOps & CI/CD: Jenkins, GitHub Actions, DVC, TensorFlow Extended (TFX)
Monitoring & Data Quality: AWS CloudWatch, Great Expectations, dbt tests
Visualization & Reporting: Power BI, Tableau, Looker, Qlik Sense, QlikView, Excel
Project Management & Collaboration: Agile, SDLC, Jira, Confluence, Git, Jupyter Notebook
Soft Skills: Time Management, Communication, Team & Cross-team Collaboration
CERTIFICATIONS
• Smart Devices and Mobile Emerging Technologies by Yonsei University (Link)
• Data Visualization by University of Illinois (Link)
• Database Management Essentials by University of Colorado (Link)
• Software Development Process and Methodologies by University of Minnesota (Link)
• Object-Oriented Design by University of Alberta (Link)
• Professional Development: Improve yourself, always (Link)
EDUCATION
Master of Science in Computer Science Jan 2023 - Dec 2024
The University of Memphis – USA
Bachelor of Computer Science Jun 2018 - May 2022
Gandhi Institute of Technology & Management – India
WORK EXPERIENCE
Elevance Health, USA Data Engineer Oct 2024 – Present
Developed ETL pipelines using Azure Data Factory and Apache Airflow to integrate claims, member, and financial datasets, supporting real-time dashboards and regulatory reporting.
Enabled real-time KPI tracking for care quality and member retention by integrating LLM-based summarization into Kafka pipelines, reducing manual review time by 60%.
Designed analytical datasets in Snowflake and Redshift to support self-service reporting for operations and finance teams, reducing ad-hoc data requests by 40%.
Automated model retraining workflows with Airflow and Jenkins, supporting continuous learning and improving model accuracy by 15%.
Established lineage and mapping for key healthcare data sources used in Qlik Sense, Power BI, and Snowflake, improving traceability and audit readiness.
Collaborated with analysts to define data requirements and build accurate, actionable data marts.
Built HIPAA- and GDPR-compliant data workflows supporting analyst access and reporting using IAM, KMS, and policy-based access control.
Partnered with product and actuarial teams to deliver insights from structured/unstructured data, including cost drivers and plan utilization trends.
Integrated ML models into production by deploying via Docker, Kubernetes, and MLflow, improving inference speed and scalability.
Optimized stored procedures and reporting queries in Redshift to reduce dashboard load times by 25%, improving executive report usability.
Led bi-weekly data quality assessment sessions using automated checks in Alteryx and dbt.
Hexaware Technologies, India Data Engineer Jan 2020 – Jan 2023
Drove metadata management and lineage mapping for over 100 financial data elements, improving transparency across investment risk reporting systems.
Engineered scalable ETL pipelines to prepare structured and unstructured datasets for machine learning models, reducing data preparation time by 40%.
Developed data quality dashboards and exception reports in Power BI for financial control teams, reducing validation cycles by 30%.
Created centralized data dictionaries and mapped source-to-target lineage across SSIS and Talend-based pipelines to meet audit and compliance standards.
Conducted cross-functional data quality assessments with front-office and operations teams; defined remediation workflows for inconsistent or missing master data.
Led Agile ceremonies and maintained Jira boards to track data issues, quality initiatives, and metadata curation efforts.
Authored complex DAX and T-SQL queries, leveraging advanced techniques including joins, subqueries, CTEs, and indexing.
RELEVANT PROJECTS
Claims Data Pipeline & Analytics
Designed and built an end-to-end ETL pipeline using Apache Airflow and Python to ingest, transform, and load synthetic healthcare claims data into Snowflake.
Implemented data quality checks with dbt and automated validation scripts, reducing errors in processed data.
Developed interactive dashboards in Power BI to track KPIs such as claim approval rates, cost trends, and patient retention.
Ensured data compliance by applying role-based access controls and encryption policies simulating HIPAA/GDPR standards.