Narendra
469-***-**** ******************@*****.*** LinkedIn Irving, TX (Open to relocate)
SUMMARY
● Results-driven Data Engineer and Machine Learning Specialist with 4+ years of combined experience across global enterprise environments.
● Expertise in designing scalable ETL pipelines, cloud-native data solutions, and ML model development using Azure Data Factory, Databricks, AWS Glue, Snowflake, and TensorFlow.
● Proven ability to manage end-to-end data engineering lifecycles including data ingestion, processing, modeling, deployment, and visualization across hybrid cloud architectures (Azure, AWS).
● Hands-on experience with Azure Synapse, Azure ML Studio, and Snowflake, enabling the creation of highly available and real-time analytical ecosystems.
● Developed, trained, and deployed machine learning models for demand forecasting, churn prediction, and anomaly detection using PySpark, scikit-learn, and XGBoost.
● Strong background in Data Wrangling, Feature Engineering, Data Lakes, Delta Lake, and SQL-based data modeling with a keen eye for performance tuning.
● Created advanced business dashboards and data stories using Power BI, Tableau, and Excel, supporting executive decision-making.
● Strategic contributor in Agile teams, collaborating cross-functionally with business, data science, and engineering units.
● Recognized for transforming complex business challenges into data-first solutions that improve delivery speed, reduce operational costs, and increase decision confidence.
● Holds certifications in Snowflake, Databricks, Google Analytics, IBM BI, and advanced Machine Learning modules.
SKILLS
Languages: Python, SQL, R, SAS
ML Libraries & Frameworks: Scikit-learn, XGBoost, TensorFlow, Pandas, NumPy, PySpark, Matplotlib, Seaborn, ggplot2
Data Engineering Tools: Azure Data Factory, Azure Synapse, Databricks, AWS Glue, SSIS, Alteryx, Informatica
Databases: Snowflake, Azure SQL, MySQL, SQL Server, PostgreSQL, MongoDB
Cloud & Platforms: Azure (Synapse, ADF, ML Studio, Blob Storage), AWS (Redshift, S3, Lambda, EC2), GCP (BigQuery - basic)
Visualization & Reporting: Power BI, Tableau, Excel (Power Query, Pivot Tables)
Version Control & DevOps: Git, GitHub, Azure DevOps, CI/CD pipelines
Other: Data Modeling, Data Lakes, Delta Lake, EDA, Feature Engineering, Agile, SCRUM, Data Governance
PROFESSIONAL EXPERIENCE
SIMFORM USA May 2024 – Present
Data Engineer
● Architected enterprise-grade ETL pipelines in Azure Data Factory and Databricks, ingesting data from multiple sources into Snowflake and Synapse for centralized analytics.
● Designed and trained supervised ML models using Azure ML Studio and scikit-learn for demand forecasting and customer segmentation, improving forecast accuracy by 18%.
● Built real-time reporting dashboards in Power BI integrated with Azure SQL Database, driving executive-level analytics with near-live insights.
● Developed feature stores using Databricks Delta Lake and managed feature versioning for reuse across multiple ML pipelines.
● Leveraged Azure DevOps to manage code versioning, pipeline automation, and testing workflows across development and production environments.
● Optimized existing PySpark jobs to scale large data workloads, reducing runtime by 30%.
● Collaborated with BI teams and analysts to deliver reporting data marts that supported visualizations and self-service queries.
● Created CI/CD pipelines for automated model deployment and scoring, improving ML model time-to-market.
● Conducted data profiling and wrangling from Azure Blob Storage and on-prem sources to ensure clean and reliable data flow.
● Led ML lifecycle governance practices by versioning, testing, and logging all model experiments and artifacts.
FULLSTACK LABS USA July 2023 – Dec 2023
Data Engineer (Intern)
● Built and deployed ETL processes using SSIS and AWS Glue to integrate structured and semi-structured data from MongoDB, PostgreSQL, and on-prem ERP systems.
● Collaborated with the data science team to develop and deploy machine learning models for customer churn and fraud detection using XGBoost and TensorFlow on AWS EC2.
● Orchestrated model pipelines for batch inference and deployed model APIs using Flask and Docker.
● Created interactive Tableau dashboards to track model performance, business KPIs, and anomalies detected in real time.
● Used Python, Pandas, and SQL to perform EDA, feature engineering, and outlier treatment on complex datasets.
● Delivered insights to leadership by converting analytical findings into executive presentations.
● Automated data flows using AWS Lambda and S3, integrating with QuickSight for rapid reporting and cost efficiency.
● Migrated traditional Excel-based reports to dynamic dashboards, reducing manual reporting time by over 40%.
● Standardized data quality rules and validations to ensure robust data pipelines.
● Participated in cross-team workshops to align ML workflows with enterprise data strategies.
CYGNET DIGITECH India Oct 2021 – July 2022
Data Analyst
● Led the migration of a legacy enterprise data warehouse to Snowflake and Google BigQuery, reducing report execution time by 50% and storage costs by 30%.
● Developed Airflow DAGs to automate ELT processes and ensure reliable orchestration across Snowflake and BigQuery environments.
● Created high-throughput ingestion pipelines using Google Cloud Storage, Dataflow, and Cloud Functions to load and process structured and unstructured data.
● Used Python and PySpark to perform data transformation, deduplication, and cleansing on raw telecom datasets prior to staging.
● Designed star-schema and snowflake-schema data models in BigQuery and Snowflake for sales and billing analysis, enhancing business query performance.
● Partnered with QA and DevOps teams to ensure CI/CD integration, version control, and regression testing for pipeline deployments.
EDUCATION
University of North Texas Denton, TX
Master of Science, Information Science, May 2024
Anna University Chennai, IN
Bachelor of Technology, Computer Science, Apr 2021