
Data Engineer Processing

Location: India
Salary: 60000
Posted: September 10, 2025


Resume:

MANASWINI NARA

Data Engineer

MO, US | +1-317-***-**** | *************@*****.*** | LinkedIn

SUMMARY

Data Engineer with 3+ years of experience improving how data is collected, processed, and delivered for reporting and decision-making. Focused on building reliable systems that reduce delays, prevent issues, and support day-to-day operations. Experienced in handling large volumes of data, ensuring accuracy, and working closely with cross-functional teams. Strong track record of improving data reliability, minimizing manual work, and helping business teams get the information they need on time.

TECHNICAL SKILLS

Languages: Python, SQL, Bash

ETL & Orchestration: Apache Airflow, dbt, AWS Glue, Cron

Data Processing: Apache Spark, PySpark, Pandas

Databases: PostgreSQL, MySQL, Redshift, BigQuery, Snowflake

Cloud Platforms: AWS (S3, EC2, Glue, Redshift), Azure (Data Factory), GCP (BigQuery)

Data Modeling: Star & Snowflake Schema, Dimensional Modeling

File Formats: Parquet, Avro, JSON, CSV

Version Control & DevOps: Git, GitHub, Docker (basic), CI/CD (GitHub Actions)

Monitoring & Testing: CloudWatch, Great Expectations, Data Pipeline Unit Testing

Security: IAM, Encryption, Data Masking, HIPAA/GDPR Awareness

BI Tools (Basic): Power BI, Tableau

Soft Skills: Agile, Communication, Documentation

PROFESSIONAL EXPERIENCE

Humana, MO, US Jan 2025 - Current

Data Engineer

Built Airflow and AWS Glue pipelines for Redshift, reducing batch failures by 45% and ensuring consistent daily processing of clinical data across patient metrics and claims workflows.

Created modular dbt models for fact tables, reducing SQL duplication by 70% and enabling faster updates to metric logic across multiple stakeholder-facing dashboards.

Refactored ETL jobs using PySpark on EMR, optimizing large Parquet transformations and cutting execution time by 60% on daily data processing.

Configured CloudWatch alarms and integrated SNS alerts, reducing incident response time from 90 to 50 minutes for critical overnight job failures.

Applied IAM role controls and column masking on sensitive fields, maintaining HIPAA compliance across S3 data zones and Redshift production queries.

Developed validation checks using Great Expectations, catching schema mismatches and preventing 30% of data issues from reaching downstream reporting layers.

Hexaware Technologies, India Feb 2021 - Jul 2023

Associate Data Engineer

Automated ETL from MySQL to Snowflake using Python and Cron, improving daily refresh reliability to 98% and removing the need for manual updates.

Managed 40+ Airflow DAGs to ingest JSON and CSV, reducing end-to-end data load time by over 3 hours across environments.

Tuned PySpark joins and aggregations for structured files, reducing runtime by 3x and lowering cluster resource usage across staging workloads.

Built dimensional star schemas for finance data, improving Power BI and Tableau dashboard query speeds by 50% on aggregated metrics.

Wrote SQL and Pandas validation routines to detect nulls, missing fields, and schema mismatches, preventing 30+ downstream data incidents.

Supported Azure Data Factory and BigQuery workflows to process 20TB+ vendor data, managing schema evolution and versioning across ingestion cycles.

EDUCATION

Master of Science – Computer Science May 2025

Missouri University of Science and Technology – Rolla, MO, US

CERTIFICATION & COURSES

Java and Python Training - Spoken Tutorial Project, IIT Bombay

Python Programming - Coursera Certification

Programming with Cloud IoT Platforms - Pohang University of Science and Technology (Coursera)

Interactive Web Design Participant - GIET, Sep 2019


