MANASWINI NARA
Data Engineer
MO, US +1-317-***-**** *************@*****.*** LinkedIn
SUMMARY
Data Engineer with 3+ years of experience improving how data is collected, processed, and delivered for reporting and decision-making. Focused on building reliable systems that reduce delays, prevent issues, and support day-to-day operations. Experienced in handling large volumes of data, ensuring accuracy, and working closely with cross-functional teams. Strong track record of improving data reliability, minimizing manual work, and helping business teams get the information they need on time.
TECHNICAL SKILLS
Languages: Python, SQL, Bash
ETL & Orchestration: Apache Airflow, dbt, AWS Glue, Cron
Data Processing: Apache Spark, PySpark, Pandas
Databases: PostgreSQL, MySQL, Redshift, BigQuery, Snowflake
Cloud Platforms: AWS (S3, EC2, Glue, Redshift), Azure (Data Factory), GCP (BigQuery)
Data Modeling: Star & Snowflake Schema, Dimensional Modeling
File Formats: Parquet, Avro, JSON, CSV
Version Control & DevOps: Git, GitHub, Docker (basic), CI/CD (GitHub Actions)
Monitoring & Testing: CloudWatch, Great Expectations, Data Pipeline Unit Testing
Security: IAM, Encryption, Data Masking, HIPAA/GDPR Awareness
BI Tools (Basic): Power BI, Tableau
Soft Skills: Agile, Communication, Documentation
PROFESSIONAL EXPERIENCE
Humana, MO, US Jan 2025 - Current
Data Engineer
Built Airflow and AWS Glue pipelines for Redshift, reducing batch failures by 45% and ensuring consistent daily processing of clinical data across patient metrics and claims workflows.
Created modular dbt models for fact tables, reducing SQL duplication by 70% and enabling faster updates to metric logic across multiple stakeholder-facing dashboards.
Refactored ETL jobs using PySpark on EMR, optimizing large Parquet transformations and cutting execution time by 60% on daily data processing.
Configured CloudWatch alarms and integrated SNS alerts, reducing incident response time from 90 minutes to 50 minutes for critical overnight job failures.
Applied IAM role controls and column masking on sensitive fields, maintaining HIPAA compliance across S3 data zones and Redshift production queries.
Developed validation checks using Great Expectations, catching schema mismatches and preventing 30% of data issues before they reached downstream reporting layers.
Hexaware Technologies, India Feb 2021 - Jul 2023
Associate Data Engineer
Automated ETL from MySQL to Snowflake using Python and Cron, improving daily refresh reliability to 98% and removing the need for manual updates.
Managed 40+ Airflow DAGs to ingest JSON and CSV files, reducing end-to-end data load time by over 3 hours across environments.
Tuned PySpark joins and aggregations for structured files, making jobs 3x faster and lowering cluster resource usage across staging workloads.
Built dimensional star schemas for finance data, improving Power BI and Tableau dashboard query speeds by 50% on aggregated metrics.
Wrote SQL and Pandas validation routines to detect nulls, missing fields, and schema mismatches, preventing 30+ downstream data incidents.
Supported Azure Data Factory and BigQuery workflows to process 20TB+ vendor data, managing schema evolution and versioning across ingestion cycles.
EDUCATION
Master of Science – Computer Science May 2025
Missouri University of Science and Technology – Rolla, MO, US
CERTIFICATION & COURSES
Java and Python Training - Spoken Tutorial Project, IIT Bombay
Python Programming - Coursera Certification
Programming with Cloud IoT Platforms - Pohang University of Science and Technology (Coursera)
Interactive Web Design Participant - GIET, Sep 2019