Machine Learning Data Engineer

Location: Irving, TX

Salary: 85000

Posted: June 30, 2025

Resume:

Narendra

469-***-**** ******************@*****.*** LinkedIn Irving, TX (Open to relocate)

SUMMARY

● Results-driven Data Engineer and Machine Learning Specialist with 4+ years of combined experience across global enterprise environments.

● Expertise in designing scalable ETL pipelines, cloud-native data solutions, and ML model development using Azure Data Factory, Databricks, AWS Glue, Snowflake, and TensorFlow.

● Proven ability to manage end-to-end data engineering lifecycles including data ingestion, processing, modeling, deployment, and visualization across hybrid cloud architectures (Azure, AWS).

● Hands-on experience with Azure Synapse, Azure ML Studio, and Snowflake, enabling the creation of highly available and real-time analytical ecosystems.

● Developed, trained, and deployed machine learning models for demand forecasting, churn prediction, and anomaly detection using PySpark, scikit-learn, and XGBoost.

● Strong background in Data Wrangling, Feature Engineering, Data Lakes, Delta Lake, and SQL-based data modeling with a keen eye for performance tuning.

● Created advanced business dashboards and data stories using Power BI, Tableau, and Excel, supporting executive decision-making.

● Strategic contributor on Agile teams, collaborating cross-functionally with business, data science, and engineering units.

● Recognized for transforming complex business challenges into data-first solutions that improve delivery speed, reduce operational costs, and increase decision confidence.

● Holds certifications in Snowflake, Databricks, Google Analytics, IBM BI, and advanced Machine Learning modules.

SKILLS

Languages: Python, SQL, R, SAS

ML Libraries & Frameworks: Scikit-learn, XGBoost, TensorFlow, Pandas, NumPy, PySpark, Matplotlib, Seaborn, ggplot2

Data Engineering Tools: Azure Data Factory, Azure Synapse, Databricks, AWS Glue, SSIS, Alteryx, Informatica

Databases: Snowflake, Azure SQL, MySQL, SQL Server, PostgreSQL, MongoDB

Cloud & Platforms: Azure (Synapse, ADF, ML Studio, Blob Storage), AWS (Redshift, S3, Lambda, EC2), GCP (BigQuery - basic)

Visualization & Reporting: Power BI, Tableau, Excel (Power Query, Pivot Tables)

Version Control & DevOps: Git, GitHub, Azure DevOps, CI/CD pipelines

Other: Data Modeling, Data Lakes, Delta Lake, EDA, Feature Engineering, Agile, SCRUM, Data Governance

PROFESSIONAL EXPERIENCE

SIMFORM USA May 2024 – Present

Data Engineer

● Architected enterprise-grade ETL pipelines in Azure Data Factory and Databricks, ingesting data from multiple sources into Snowflake and Synapse for centralized analytics.

● Designed and trained supervised ML models using Azure ML Studio and scikit-learn for demand forecasting and customer segmentation, improving forecast accuracy by 18%.

● Built real-time reporting dashboards in Power BI integrated with Azure SQL Database, driving executive-level analytics with near-live insights.

● Developed feature stores using Databricks Delta Lake and managed feature versioning for reuse across multiple ML pipelines.

● Leveraged Azure DevOps to manage code versioning, pipeline automation, and testing workflows across development and production environments.

● Optimized existing PySpark jobs to scale to large data workloads, reducing runtime by 30%.

● Collaborated with BI teams and analysts to deliver reporting data marts that supported visualizations and self-service queries.

● Created CI/CD pipelines for automated model deployment and scoring, improving ML model time-to-market.

● Conducted data profiling and wrangling from Azure Blob Storage and on-prem sources to ensure clean and reliable data flow.

● Led ML lifecycle governance practices by versioning, testing, and logging all model experiments and artifacts.

FULLSTACK LABS USA July 2023 – Dec 2023

Data Engineer (Intern)

● Built and deployed ETL processes using SSIS and AWS Glue to integrate structured and semi-structured data from MongoDB, PostgreSQL, and on-prem ERP systems.

● Collaborated with the data science team to develop and deploy machine learning models for customer churn and fraud detection using XGBoost and TensorFlow on AWS EC2.

● Orchestrated model pipelines for batch inference and deployed model APIs using Flask and Docker.

● Created interactive Tableau dashboards to track model performance, business KPIs, and real-time anomaly detection.

● Used Python, Pandas, and SQL to perform EDA, feature engineering, and outlier treatment on complex datasets.

● Delivered insights to leadership by converting analytical findings into executive presentations.

● Automated data flows using AWS Lambda and S3, integrating with QuickSight for rapid reporting and cost efficiency.

● Migrated traditional Excel-based reports to dynamic dashboards, reducing manual reporting time by over 40%.

● Standardized data quality rules and validations to ensure robust data pipelines.

● Participated in cross-team workshops to align ML workflows with enterprise data strategies.

CYGNET DIGITECH India Oct 2021 – July 2022

Data Analyst

● Led the migration of a legacy enterprise data warehouse to Snowflake and Google BigQuery, reducing report execution time by 50% and storage costs by 30%.

● Developed Airflow DAGs to automate ELT processes and ensure reliable orchestration across Snowflake and BigQuery environments.

● Created high-throughput ingestion pipelines using Google Cloud Storage, Dataflow, and Cloud Functions to load and process structured and unstructured data.

● Used Python and PySpark to perform data transformation, deduplication, and cleansing on raw telecom datasets prior to staging.

● Designed star-schema and snowflake-schema data models in BigQuery and Snowflake for sales and billing analysis, enhancing business query performance.

● Partnered with QA and DevOps teams to ensure CI/CD integration, version control, and regression testing for pipeline deployments.

EDUCATION

University of North Texas Denton, TX

Master of Science, Information Science, May 2024

Anna University Chennai, IN

Bachelor of Technology, Computer Science, Apr 2021
