
Data Engineer Supply Chain

Location:
Hopkinton, MA
Posted:
August 20, 2025

Contact this candidate

Resume:

PRAVEENA

U.S. Permanent Resident (Green Card Holder)

Hopkinton, MA

linkedin.com/in/praveena-krishnakumar

781-***-**** *******@*****.***

Accomplished Data Engineering Lead with 10 years of experience across ETL development, data migration, cloud data platforms, and analytics. Specialized in Snowflake and Informatica (PowerCenter and IDMC), with a strong track record across food supply chain, healthcare, energy, and academia sectors. Skilled at managing high-volume data pipelines and leading complex system migrations to cloud environments. Passionate about enabling strategic business decisions through modern data architectures and governance best practices.

TECHNICAL COMPETENCIES

Databases – Oracle, SQL Server

Data Warehousing Tools – Snowflake, Data Lake

ETL Tools – Pentaho, SSIS, Qlik, IDMC, Informatica, Airbyte, BryteFlow, Power BI, Oracle Reporting Analytics, Tableau, AWS Glue, Azure Data Factory

Cloud – AWS, Azure, GCP, OCI

SCM – Git, Bitbucket, Copilot, SourceTree

Deployment Tool – Liquibase

Scripting Languages – Shell Script, Batch Script, Python

Replication Tools – Oracle GoldenGate, Fivetran, Striim, SnapLogic, dbt

Scheduling/Monitoring – Paessler PRTG Network Monitor

Agile Tools – JIRA, Confluence

Authentication Tools – Auth0, Okta, AWS Cognito, Duo

Containers/Orchestration – Kubernetes

Other Tools – Oracle Enterprise Manager (OEM), Microsoft Visio, TOAD Data Point, AEM, ServiceNow, Office 365, SharePoint

EXPERIENCE

HARVARD UNIVERSITY (IT)

Senior Data Engineer / Data Architect, Cambridge, MA, December 2024 – Present

Led the migration of ETL processes from Informatica PowerCenter to Informatica IDMC Cloud, aligning with enterprise cloud modernization goals.

Assessed and documented existing PowerCenter workflows, mappings, and dependencies for migration planning.

Developed and deployed end-to-end Snowflake pipelines for academic data domains, implementing clustering, data masking, and access control to optimize performance and security.

Built scalable, modular DBT models to transform raw ingestion data into analytics-ready datasets supporting academic reporting and finance dashboards.

Implemented DBT tests (not null, unique, accepted_values) and created exposures to track data lineage and downstream BI dashboard dependencies.
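
The dbt tests named above (not_null, unique, accepted_values) can be sketched in plain Python to show the checks they run against a table; the sample rows and column names below are hypothetical, not the actual academic schema.

```python
# Plain-Python sketch of the checks behind dbt's built-in tests:
# not_null, unique, and accepted_values. Sample data is hypothetical.

def not_null(rows, column):
    """Pass only if no row has a NULL (None) in the column."""
    return all(row[column] is not None for row in rows)

def unique(rows, column):
    """Pass only if no value in the column appears more than once."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))

def accepted_values(rows, column, allowed):
    """Pass only if every value in the column is in the allowed set."""
    return all(row[column] in allowed for row in rows)

# Hypothetical enrollment records standing in for an academic table.
rows = [
    {"student_id": 1, "status": "enrolled"},
    {"student_id": 2, "status": "withdrawn"},
]

assert not_null(rows, "student_id")
assert unique(rows, "student_id")
assert accepted_values(rows, "status", {"enrolled", "withdrawn", "graduated"})
```

In dbt itself these checks are declared in a model's YAML schema file rather than written by hand; the sketch only shows the semantics each test enforces.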

Collaborated with institutional data stewards to classify Snowflake data assets and apply metadata tags, ensuring alignment with Harvard’s data governance policies and enterprise standards.

Partnered with business stakeholders (registrar, finance, research admin) to gather reporting needs and translated them into technical DBT models and semantic data layers.

Created dashboards in Power BI and Tableau from DBT-generated Snowflake views, supporting department-level metrics and decision-making.

Applied Snowflake Row Access Policies and Dynamic Masking Policies to enforce FERPA/HIPAA compliance based on user roles.
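
The role-based masking described above can be illustrated with a small Python sketch of the decision a Snowflake Dynamic Masking Policy makes; the role names and masking rule here are assumptions for illustration, not the actual policy (in Snowflake this logic lives in a CREATE MASKING POLICY statement, not application code).

```python
# Illustrative sketch of dynamic-masking logic: privileged roles see the
# raw value, everyone else sees a redacted form. Role names and the
# masking rule are hypothetical, not the institution's actual policy.

PRIVILEGED_ROLES = {"REGISTRAR_ADMIN", "COMPLIANCE_AUDITOR"}

def mask_ssn(value: str, current_role: str) -> str:
    """Return the raw SSN for privileged roles, a masked form otherwise."""
    if current_role in PRIVILEGED_ROLES:
        return value
    return "***-**-" + value[-4:]

print(mask_ssn("123-45-6789", "REGISTRAR_ADMIN"))  # raw value
print(mask_ssn("123-45-6789", "ANALYST"))          # ***-**-6789
```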

Developed Python-based framework for automated data validation, comparing source-to-target data and sending alerts for inconsistencies.
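
A minimal sketch of such source-to-target validation, assuming the simplest useful checks (row counts plus an order-independent checksum); the in-memory rows stand in for query results, and the function names are illustrative rather than the framework's actual API.

```python
# Minimal sketch of automated source-to-target validation: compare row
# counts and an order-independent checksum, and report discrepancies.
# The literal rows below stand in for real source/target query results.
import hashlib

def table_fingerprint(rows):
    """Return (row count, order-independent checksum) for a row set."""
    digest = 0
    for row in rows:
        h = hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        digest ^= int(h, 16)  # XOR makes the checksum order-independent
    return len(rows), digest

def validate(source_rows, target_rows):
    """Return a list of discrepancy messages; empty means the load matched."""
    alerts = []
    src_count, src_sum = table_fingerprint(source_rows)
    tgt_count, tgt_sum = table_fingerprint(target_rows)
    if src_count != tgt_count:
        alerts.append(f"row count mismatch: {src_count} vs {tgt_count}")
    elif src_sum != tgt_sum:
        alerts.append("checksum mismatch: same row count, different content")
    return alerts

source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
target = [{"id": 2, "amt": 20}, {"id": 1, "amt": 10}]  # same rows, reordered
print(validate(source, target))  # [] -> load verified
```

A real framework would feed the alert list into email or chat notifications rather than printing it.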

Created knowledge-sharing documentation, data dictionaries, and user guides for both technical and non-technical teams.

Acted as a liaison between engineering and institutional research teams, ensuring Snowflake data was trustworthy, explainable, and timely for analysis.

YES ENERGY

Snowflake Cloud DBA, Needham, MA, September 2019 – November 2024

Implemented data integration through various pipelines, managing data migration, warehousing, and ingestion projects.

Optimized performance of Oracle databases by tuning queries and implementing faster data transfer methods, reducing load times by 40% and enhancing analytics capabilities.

Automated data pipelines using Python scripts, significantly reducing manual intervention by 20% and improving team productivity in managing cloud environments.

Implemented data validation and cleansing rules in ingestion pipelines to maintain high data quality and integrity.

Established access control mechanisms and role-based privileges in Snowflake to support compliance and data governance.

Led documentation of data sources, transformation logic, and lineage to improve transparency and support audits.

Supported data cataloging initiatives by tagging business-critical datasets and documenting usage patterns.

Built and maintained modular DBT models to transform raw data into analytics-ready datasets in Snowflake, improving pipeline maintainability and scalability.

Implemented DBT tests to enforce data quality checks on critical business tables, significantly reducing downstream data issues.

Leveraged DBT’s incremental models to optimize Snowflake compute usage, reducing data transformation costs by 20%.
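
The compute savings come from the high-water-mark logic that dbt incremental models implement: each run transforms only rows newer than what the target already holds. A plain-Python sketch of that logic, with an illustrative timestamp column name:

```python
# Sketch of the high-water-mark logic behind a dbt incremental model:
# each run appends only source rows newer than the latest timestamp
# already in the target, instead of rebuilding the whole table.
# The column name "updated_at" is illustrative.

def incremental_load(source_rows, target_rows, ts_col="updated_at"):
    """Append only source rows newer than the target's max timestamp."""
    high_water = max((r[ts_col] for r in target_rows), default=None)
    if high_water is None:
        new_rows = list(source_rows)  # first run: full load
    else:
        new_rows = [r for r in source_rows if r[ts_col] > high_water]
    return target_rows + new_rows

target = [{"id": 1, "updated_at": 1}]
source = [{"id": 1, "updated_at": 1}, {"id": 2, "updated_at": 2}]
print(incremental_load(source, target))  # only id=2 is appended
```

In dbt the same idea is expressed as an `is_incremental()` filter in the model's SQL; the sketch shows why unchanged rows never consume warehouse compute again.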

Collaborated with data analysts to version-control DBT models in Git, ensuring consistency, traceability, and streamlined CI/CD integration.

Documented DBT models and created lineage graphs to enhance data discoverability and compliance for business users and governance teams.

Used DBT exposures to map and monitor critical data assets used by BI dashboards, supporting data governance and impact analysis.

Participated in DBT Cloud setup and job scheduling, ensuring seamless collaboration, job logging, and alerting for production pipelines.

Improved performance by optimizing complex SQL queries in Snowflake, and improved query execution times by 30%, resulting in faster analytics.

Designed and managed scalable cloud solutions using AWS infrastructure, including S3 for data storage and transfer, improving system efficiency and reducing storage costs by 15%.

Developed data pipelines with Qlik, Oracle GoldenGate, Striim, BryteFlow, SnapLogic, SSIS, Fivetran, Airbyte, and Pentaho, ensuring seamless data flow into Snowflake from various custom sources; the resulting 25% improvement in data processing speed gave the business faster access to real-time insights and reduced time-to-decision.

Developed and deployed monitoring solutions using PRTG and Opgenesis, enhancing data visibility, improving system uptime by 15%, and reducing issue resolution times by 30%, enabling quicker troubleshooting.

Optimized data models in Snowflake, reducing query times by 40%, and developed custom SQL scripts to automate database maintenance. Leveraged Snowpipe to automate data ingestion in AWS and Azure and integrated Snowflake with BI tools, improving data visualization and accuracy by 20% through cross-functional collaboration.

Led a cross-functional team spanning Engineering and Product Management on CI/CD deployments using Liquibase pipelines through Bitbucket, with repositories managed in VS/SourceTree, resulting in a 20% reduction in deployment time and improved code integration efficiency.

Managed 350+ cloud client accounts, optimizing workload structure, which resulted in a 30% increase in operational efficiency and reduced cloud costs by 25%. Successfully ensured seamless cloud operations, leading to 99.9% uptime for all clients.

Developed data pipelines using SnapLogic, Qlik, BryteFlow, and other ETL tools that supported the Product and BI teams in creating real-time dashboards, enhancing data-driven decisions.

Worked with the team to create Root Cause Analysis (RCA) for any issues to ensure continuous improvement.

Developed Proof of Concept (POC) for new products, testing and validating new technologies and approaches to enhance data solutions.

HEALTH CARE FINANCIALS

Senior ETL Analyst, Quincy, MA, November 2018 – September 2019

Worked on business transformation initiatives to enhance data warehousing and analytics capabilities, resulting in a 40% improvement in reporting accuracy and 25% faster data access.

Managed and troubleshot end-to-end Pentaho ETL processes, reducing data processing errors by 15% and improving efficiency by 20%.

Created functional documents and prepared system integration and user acceptance testing, reducing project onboarding time by 30% and increasing system reliability.

Analyzed business data, identifying gaps and ensuring alignment with business processes, which contributed to a 20% increase in overall operational performance.

Designed and developed a Test-Driven Development framework that reduced debugging time by 35% and enhanced code quality.

FSE NET

Senior ETL Developer, Waltham, MA, July 2016 – November 2018

Developed Kettle transformations and jobs for data publishing and reporting, which reduced report generation time by 40% and enhanced data accuracy, leading to more informed decision-making.

Created Jasper reports and deployed them in Apache Tomcat, improving report accessibility and reducing server downtime by 20%, enhancing overall system performance.

Designed and maintained file formats and templates for various manufacturers and recipients, streamlining communication processes and reducing data inconsistencies by 25%, leading to improved operational efficiency.

EDUCATION

MASTER IN INFORMATION SYSTEMS AND APPLICATIONS

Bharathidasan University

BACHELOR OF COMMERCE

Madras University

