
Data Engineer Machine Learning

Location:
Pune, Maharashtra, India
Posted:
October 15, 2025


Resume:

RUKSAR RAFIQUE LUKADE

Dallas, TX +1-508-***-**** **************@*****.*** LinkedIn

(Open to Relocate)

SUMMARY

Data Engineer with 3+ years of experience in data science, machine learning, and cybersecurity. Expertise in AML, fraud detection, and cyberattack prediction with strong domain knowledge in banking and financial services.

EXPERIENCE

FEEDZAI Remote, CA

Data Engineer May 2024 - Present

AML and Cyberattack Prediction System (AI + MITRE ATT&CK)

Banks face too many fraud alerts, and most turn out to be false alarms. At the same time, many fraud cases are actually symptoms of larger cyberattacks, but the systems don’t connect the dots.

● Designed and deployed a real-time AML detection pipeline in Azure (Data Factory, Event Hubs, Databricks) to monitor transactions. Engineered fraud features such as transfer velocity, geolocation mismatches, and cross-border anomalies

● Developed and optimized SQL queries and data pipelines in Azure Databricks and Synapse Analytics to process and analyze large-scale healthcare and financial datasets with precision and compliance

● Built an ensemble model (XGBoost + Logistic Regression with SMOTE) that reduced false positives and improved recall, achieving 300 ms scoring latency with Azure Functions

● Extended fraud detection with cybersecurity intelligence by integrating MITRE ATT&CK tactics/techniques. Developed Markov Chain and PyTorch LSTM models to predict attacker behavior sequences and identify patterns behind suspicious activities

● Developed a fraud-cyber fusion engine that correlates flagged transactions with cyber telemetry (failed MFA attempts, IP reputation, device anomalies, WAF/endpoint alerts) to distinguish stand-alone fraud from attack-driven financial activity

● Delivered analyst-friendly interactive Jupyter simulations and Power BI dashboards with SHAP-based explanations, enabling SOC and compliance teams to visualize anomalies, predicted attack paths, and risk scores in real time

● Implemented a feedback loop where investigation results flow back into model retraining, continuously improving precision and reducing false positives

INFOSYS LIMITED Pune, IN

Data Scientist Nov 2021 - June 2023

Client: UBS

1. Net Interest Income (NII) Forecasting Solution - Azure ML & Power BI

Banks rely on Net Interest Income (NII) forecasts to guide lending, deposit pricing, and hedging, but the old process took months and lacked accuracy. The client needed a scalable AI solution to deliver faster, more reliable forecasts and reduce financial risk.

● Assisted in developing an Azure ML-powered Net Interest Income (NII) forecasting solution for the client’s treasury division, which improved forecasting accuracy and supported proactive lending, deposit pricing, and hedging strategies

● Helped consolidate and preprocess loan/deposit portfolio data, customer behavior trends, and macroeconomic indicators from Azure Data Lake Gen2 and Azure Synapse Analytics

● Contributed to feature engineering in Azure Databricks (rate-sensitivity, repricing) and supported the development of a hybrid forecasting model (LightGBM regression + Prophet) to capture both non-linear and seasonal effects

● Supported automation of monthly retraining and stress-scenario simulations using Azure ML Pipelines and assisted in preparing Power BI Embedded dashboards for treasury executives

● Part of the team that delivered measurable impact by enhancing forecast accuracy by 21% and enabling early hedging actions valued in the tens of millions, while reducing reporting time from weeks to days

2. Customer Segmentation & Retention Analytics - Azure ML & Databricks

UBS wanted to target customers better, but its campaigns were too generic: open rates were low, unsubscribes were high, and valuable customers were leaving. The marketing data wasn’t properly tagged or segmented, so there was no way to deliver relevant, personalized offers.

● Assisted in developing an Azure ML-based customer segmentation solution that supported targeted marketing campaigns and retention strategies for retail banking customers

● Helped ingest and combine demographic, product usage, transaction, and engagement data from Azure Synapse Analytics, performing preprocessing tasks such as feature scaling, outlier handling, and data aggregation in Azure Databricks

● Supported dimensionality reduction using PCA and contributed to clustering experiments with K-Means (K=5) in Azure Machine Learning to generate customer personas

● Assisted in automating monthly re-clustering with Azure ML Pipelines and integrating outputs into Power BI dashboards for marketing analytics teams

● Part of the project team that improved targeted offer acceptance rates and reduced churn in high-value customer segments, driving measurable business impact

3. Portfolio Optimization & Risk Analytics - Azure ML & Databricks

Wealth advisors at UBS needed a smarter way to recommend personalized investment portfolios. The existing process was manual and rule-based, leading to suboptimal allocations and missed opportunities for risk-adjusted performance.

● Contributed to developing an ML-powered portfolio optimization framework that forecasted asset returns and risks using LightGBM regression and Monte Carlo simulations

● Assisted in integrating Modern Portfolio Theory (MPT) to construct optimized client portfolios aligned with different risk profiles and investment horizons

● Helped prepare Power BI dashboards for advisors to simulate allocations, compare Sharpe ratios, and assess portfolio rebalancing strategies

● Improved advisor productivity and client satisfaction by delivering more personalized and higher-performing portfolio recommendations while reducing manual portfolio construction efforts

ADOLF SOLUTIONS PRIVATE LIMITED Pune, IN

Intern Oct 2020 - Nov 2020

● Performed Surface Mount Device (SMD) soldering and Surface Mount Technology (SMT) soldering, building hardware development skills

● Carried out Printed Circuit Board (PCB) fabrication, product testing, and error debugging, improving product reliability and performance

RESEARCH WORK

Master’s Thesis in Cybersecurity with AI Domain:

Attack Chain Contraction & Prediction Using Markov Model and LSTM on MITRE ATT&CK Data

● Built an end-to-end hybrid attack chain prediction framework integrating probabilistic modeling (first-order Markov Chains) with deep learning (multi-layer PyTorch LSTM) to forecast adversary techniques from partial sequences

● Engineered data ingestion and preprocessing pipeline to parse and encode MITRE ATT&CK v16.0 datasets (Techniques, Relationships, Groups, Campaigns) into structured, tactic-preserving sequences for simulation and training

● Developed a probabilistic simulation engine to generate realistic attack chains with geometric-mean–based risk scoring (Low/Medium/High) and tactic-order validation

● Trained an LSTM model with embeddings, dropout, and multi-step recursive inference, achieving ~72% next-step prediction accuracy vs. 38% for Markov baseline

● Implemented a Chain Contraction algorithm using redundancy removal, softmax confidence thresholds, and entropy filtering to reduce chain length by up to 40% while preserving strategic context

● Validated model predictions against real-world campaigns (APT29’s Operation Ghost & SolarWinds) using Jaccard similarity, overlap scoring, and group-level matching

● Created an interactive Jupyter Notebook UI with ipywidgets and PyVis for real-time attack path simulation, multi-model comparison, and graph-based visualization

● Performed model comparison and analysis: MSE between Markov & LSTM probabilities, vocabulary overlap, chain likelihood discrimination (AUC), and per-technique error profiling

ACADEMIC PROJECTS

Library Management

● Developed a system using Node.js, ReactJS, and MongoDB with user registration, book catalog management, transactions, search/reporting, and API testing via Postman

● Designed a dynamic ReactJS interface for enhanced user experience and managed scalable data with MongoDB for a responsive application

Predicting 30-Day Hospital Readmissions for Diabetic Patients

● Developed an ML model to predict 30-day hospital readmissions for diabetic patients, utilizing advanced predictive analytics and data-driven insights

● Implemented data preprocessing, feature engineering, and model training, resulting in a robust predictive tool that enhances patient care and healthcare resource allocation

TECHNICAL SKILLS

● Programming & Scripting: Python (NumPy, Pandas, Scikit-learn, Matplotlib, PyTorch, TensorFlow, Keras), SQL, Java, C++, JavaScript, COBOL, Postman

● Machine Learning & AI: Supervised & Unsupervised Learning (Regression, Classification, Clustering), Deep Learning (CNNs, LSTMs, Transfer Learning), Time Series Forecasting (Prophet, ARIMA, LightGBM), Markov Models, Predictive Analytics, NLP (Word2Vec, TF-IDF, Text Classification)

● Data Engineering & MLOps: Azure Machine Learning, Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Functions, AKS, CI/CD for ML models, Feature Engineering, Data Preprocessing, Parallel Computing, Scientific Computation

● Visualization & Analytics: Tableau, Power BI, Excel, PyVis, NetworkX, Jupyter Notebook, Interactive Dashboards

● Domain Expertise: Banking & Financial Services (AML, Treasury Risk & Forecasting), Cybersecurity (MITRE ATT&CK, APT Campaign Analysis, Risk Scoring), Healthcare Analytics, Digital Forensics, Predictive Maintenance

● Databases: MongoDB, PostgreSQL, DB2

● Tools & Frameworks: Git, Docker, Postman, SqBx (Package Tracking), Node.js, ReactJS

EDUCATION

UNIVERSITY OF MASSACHUSETTS, DARTMOUTH Massachusetts, US

Master of Science in Data Science (GPA: 3.9/4.0) Aug 2025

AISSMS INSTITUTE OF INFORMATION TECHNOLOGY Pune, India

Bachelor of Electronics and Telecommunication Engineering (GPA: 8.0/10.0) May 2021

University of Massachusetts Dartmouth (On-Campus Jobs) Sept 2023 - Aug 2025

● Mail & Package Center Clerk May 2025 - Aug 2025

Operated SqBx software to log, track, and manage packages with barcode scanning and chain-of-custody tracking

Maintained accurate delivery records, enabling real-time visibility and automated notifications

Monitored package volumes and workflow metrics to improve efficiency and turnaround time

● Student Assistant Jan 2025 - May 2025

Supported students with concepts in network security, intrusion detection, and incident response, and collaborated with the professor to ensure academic integrity and provide timely feedback

Graded assignments, projects, and exams for CIS 444/602 - Cyber Defense and Operations course

● Graduate Teaching Assistant Aug 2024 - Dec 2024

Assisted in delivering lectures and lab sessions for CIS 542: Digital Forensics course, focusing on evidence preservation, identification, extraction, and documentation in computing environments

Graded assignments and exams, and guided students by answering questions and resolving lab-related issues


