MUSKAAN MAHINDRAKAR
551-***-**** Bayonne, New Jersey, United States **********@*****.*** linkedin.com/in/muskaan-mahi/

DATA ENGINEER
Data Engineer with 7 years of experience building cloud-native data platforms on AWS and GCP. Skilled in Python, SQL, Spark, Kafka, Airflow, and Snowflake. Designed and deployed ETL/ELT pipelines for real-time and batch systems. Improved data quality, observability, and governance across large-scale environments. Delivered scalable solutions supporting machine learning workflows and analytics, with a focus on performance, reliability, and automation.

STRENGTHS AND EXPERTISE
Programming Languages: Python, SQL, Java, Scala, C++, Shell Scripting, R
Big Data Technologies: Hadoop, Spark, Kafka, Hive, Snowflake, Airflow, MongoDB, Docker, Git
Cloud Platforms: AWS (S3, EMR, Lambda, Glue, Redshift), Azure (Data Factory, Synapse, Databricks), GCP (BigQuery, Dataflow, Pub/Sub), Vertex AI, Azure AI
Data Analysis Tools: Tableau, Power BI, QuickSight
Machine Learning Tools: Scikit-learn, TensorFlow, SageMaker, MLflow, Keras, PyTorch, LightGBM, XGBoost, OpenCV, NLTK, spaCy, Seaborn, SciPy
Methodologies: Agile, Scrum, CI/CD, SDLC
Gen AI: Fine-tuning LLMs (GPT, BERT, and T5), Hugging Face Transformers, OpenAI API
Leadership: Sprint Planning, Retrospectives, Backlog Grooming, Scrum Master Support

PROFESSIONAL EXPERIENCE
Data Engineer
AT&T, Bedminster, New Jersey
March 2024 - Present
●Architected and deployed ETL pipelines using BigQuery, Dataproc, and Spark, improving query performance by 25% and reducing storage costs by 20%.
●Constructed scalable Kafka-Snowflake pipelines for multi-terabyte financial data, reducing latency by 40% and increasing throughput by 3x.
●Developed fraud analytics platforms on AWS and Azure with Redshift, Glue, and IAM controls, reducing audit risks by 30% under SOX and GDPR.
●Automated infrastructure provisioning with Terraform and CloudFormation, accelerating delivery by 40% and ensuring high availability.
●Deployed RESTful APIs using Flask and FastAPI for real-time predictions; connected MongoDB and DynamoDB to reduce latency by 35%.
●Integrated anomaly detection in ETL workflows, enhancing fraud detection accuracy by 15% and reducing false positives.
●Created real-time dashboards using Tableau and QuickSight, improving KPI visibility and analyst efficiency by 30%.
●Built CI/CD pipelines with GitHub Actions, Airflow, and Snowflake Tasks, reducing deployment errors by 40% and enabling zero-downtime releases.

Data Engineer
North Highland, Atlanta, Georgia
June 2023 - January 2024
● Implemented end-to-end MLOps workflows using MLflow and Docker, improving deployment speed by 35% and ensuring consistent, scalable model delivery.
●Executed distributed data processing using Spark DataFrames and reusable Airflow DAGs, improving pipeline speed by 85%.
●Engineered cloud-native architectures with AWS Redshift, Lambda, and S3, increasing data availability and scalability by 40%.
●Optimized SQL and Athena queries through partitioning and compression strategies, reducing query execution costs by 35%.
●Generated synthetic datasets using Gen AI and Python, increasing model accuracy by 18% and improving robustness across edge cases.

Data Engineer
Bharat Heavy Electricals Limited, Bengaluru, India
December 2018 - June 2022
●Formulated ETL workflows using Apache Airflow, AWS Glue, and BigQuery, reducing data latency by 50% and improving data reliability.
●Deployed ML models (XGBoost, LSTM, BERT) to production using SageMaker and GCP AutoML, cutting inference latency by 40% for real-time scoring.
● Implemented CI/CD pipelines for ML workflows using Terraform, GitHub Actions, and Docker, reducing deployment time by 40% and enabling automated monitoring.
●Created dashboards in Tableau and Power BI for real-time data visibility, reducing manual reporting effort by 35%.
●Managed feature pipelines using Snowflake and Feature Store, accelerating model training by 30% and improving consistency across versions.
●Leveraged GCP AutoML and SageMaker Pipelines to reduce manual effort by 50% and accelerate model lifecycle deployment.
●Engineered multi-region failover systems in AWS, maintaining 99.9% availability during peak loads and reducing downtime risk.
●Facilitated Agile processes, boosting collaboration across data science and engineering teams and increasing sprint velocity by 25%.

KEY PROJECTS: Real-Time Transaction Monitoring for Fraud Detection, Credit Risk Scoring Pipeline - AutoML, Cloud Migration for Regulatory Data Warehousing

CERTIFICATIONS:
● Python: 60-Hour Training Program (ISO-Certified)
● AWS Partner Courses: Completed Cloud Practitioner, Data Lake, and others
● Dataiku Academy Certifications: Core Designer, ML Practitioner, Advanced Designer, Developer, MLOps Practitioner
● Databricks Certifications: ML/Data Science
● Java Certification: ISO Certified Core and Advanced
● Big Data Hadoop: Course Completion Certificate
● Machine Learning: Internship Training (ISO-Certified)

EDUCATION:
Master of Science, Computer Science - Stevens Institute of Technology
Relevant Coursework: Deep Learning, Machine Learning, Natural Language Processing, Knowledge Discovery & Data Mining, Data Acquisition, DBMS

Bachelor of Technology, Computer Science - NIIT University
Relevant Coursework: ML, Cloud, NLP, Data Structures, Big Data, Information Retrieval, Image Processing, Design & Analysis of Algorithms