Nabil Abdullah
LinkedIn • Chennai, India • **********@*****.*** • 979*******
Education
Georgia Institute of Technology Atlanta, GA
Master of Science (M.S.) in Computer Science, Artificial Intelligence Specialization Expected April 2026
National Institute of Technology Trichy, India
B.Tech, Instrumentation and Control Engineering August 2018 – August 2022
Experience
Georgia Institute of Technology Atlanta, GA
Research Intern August 2025 – Present
Developed a two-stage fine-grained visual classification (FGVC) pipeline (YOLO detection + Swin Transformer classification) for visually similar small species, achieving strong accuracy gains over end-to-end baselines.
Conducted dataset curation, evaluation, and interpretability analysis to validate model behavior on small-object recognition.
Implemented deployment-side optimizations including ONNX export and quantization for faster inference; paper in preparation.
Blubridge Chennai
AI Research Engineer September 2025 – December 2025
Architected and built a custom deep learning framework in C++/CUDA, enabling graph-based autodiff, forward graph replay, and fully GPU-resident training pipelines.
Designed the autograd system to support higher-order VJPs and correct backpropagation through fused, hand-written CUDA kernels.
Implemented Flash Attention and Flash ALiBi Attention from scratch in CUDA, using block reduction and warp-level primitives to scale across large batch, sequence, and head dimensions.
Authored numerically stable forward and backward CUDA kernels (Softmax, SwiGLU, RMSNorm), eliminating NaNs through careful kernel math and gradient accumulation.
Designed KV Cache–based autoregressive inference, enabling efficient long-context decoding entirely on GPU.
Built and trained a LLaMA-style transformer block (RMSNorm, SwiGLU, multi-head attention, residuals, linear projections) running end-to-end on GPU.
Implemented a production-grade BPE tokenizer in C++, embedding pipeline, and text loaders; achieved end-to-end training with coherent autoregressive text generation.
Explored Python bytecode tracing via pybind11, identified performance and control limitations, and drove a C++-first execution and lowering strategy.
Verified correctness and stability via PyTorch parity checks and comprehensive unit/integration tests across 3D tensor pipelines.
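The numerically stable softmax kernels above rely on the standard max-subtraction trick; a minimal serial C++ sketch of that math (the actual fused CUDA kernels are not reproduced here):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax: subtracting the row maximum before
// exponentiating bounds every exponent at 0, so exp() cannot overflow
// and the output cannot become NaN even for very large logits.
std::vector<double> stable_softmax(const std::vector<double>& logits) {
    double m = *std::max_element(logits.begin(), logits.end());
    std::vector<double> out(logits.size());
    double sum = 0.0;
    for (size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - m);  // exponent <= 0, no overflow
        sum += out[i];
    }
    for (double& v : out) v /= sum;        // normalize to a distribution
    return out;
}
```

On GPU the same max and sum become block-level reductions over warp shuffles, but the stabilization step is identical.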
SelfDirect Research Lab Chennai
Research Engineer March 2025 – August 2025
Designed and trained a 600M-parameter small language model optimized for ultra-long context processing (500K+ tokens) under consumer-GPU constraints.
Implemented custom CUDA kernels for state-space sequence modeling, including fused forward and backward passes.
Integrated sparse attention and sparse Mixture-of-Experts (MoE) layers to reduce memory and compute overhead.
Achieved ~60% training loss reduction on long-context language modeling tasks across multi-epoch training.
Validated training stability across extended contexts and approximate gradient regimes.
Boeing India Private Limited Bangalore
Programmer Analyst L2 August 2022 – November 2023
Spearheaded the modernization of legacy ASPX applications by transitioning to Angular, boosting system scalability by 30% and reducing page load times by 20%.
Engineered dynamic, high-performance web applications by implementing front-end enhancements in Angular and C#.
IIT Kanpur Kanpur
Research Intern June 2021 – August 2021
Deployed Python-based deep learning pipelines using TensorFlow to process and classify LiDAR-scanned terrains, reducing manual labeling time by 40% and enabling automated recognition of trees, buildings, and ancient ruins.
Academic Experience
Georgia Institute of Technology Atlanta
Project Author February 2025 – March 2025
Implemented and optimized Bitonic Sort in CUDA, applying memory-access and data-type optimizations to achieve significant speedups over CPU execution.
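The compare-exchange network underlying the project above can be sketched serially in C++; in the CUDA version each index i of the inner loop maps to a thread (array length assumed to be a power of two):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bitonic sort: a data-independent compare-exchange network, which is what
// makes it map so cleanly onto GPU threads. Here the network runs serially.
void bitonic_sort(std::vector<int>& a) {
    const size_t n = a.size();                    // must be a power of two
    for (size_t k = 2; k <= n; k <<= 1) {         // bitonic sequence size
        for (size_t j = k >> 1; j > 0; j >>= 1) { // compare-exchange stride
            for (size_t i = 0; i < n; ++i) {
                size_t l = i ^ j;                 // partner index
                if (l > i) {
                    bool up = (i & k) == 0;       // direction of this run
                    if ((a[i] > a[l]) == up) std::swap(a[i], a[l]);
                }
            }
        }
    }
}
```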
Georgia Institute of Technology Atlanta
Project Author November 2024 – December 2024
Designed custom reward functions for racing scenarios; trained agents with deep reinforcement learning and evaluated policies in AWS DeepRacer simulations under dynamic constraints.
Georgia Institute of Technology Atlanta
Project Author June 2024 – July 2024
Tuned multimodal models for hate-speech detection; compared performance against large multimodal models (Gemini, GPT-4) using structured prompting and robust API retry logic.
Georgia Institute of Technology Atlanta
Project Author January 2024 – February 2024
Implemented and compared Decision Trees, Boosting, SVMs, KNN, and Neural Networks for supervised learning; analyzed accuracy vs. data size and training dynamics.
Skills
Machine Learning & AI: NLP, Reinforcement Learning, Data Processing, Model Evaluation, Dimensionality Reduction, Model Optimization, Hyperparameter Tuning, Debugging
Deep Learning: Transformers, RNN, CNN, ResNet, LSTM, Multimodal Models, Seq2Seq
GPU & High-Performance Computing: CUDA, Parallel Processing, Tiled Matrix Multiplication, Bitonic Sort, OpenMP
Quantum Computing: BB84 Cryptography, Quantum Logic Gates
Frameworks, Cloud & Software Development: TensorFlow, PyTorch, Keras, AWS, Microsoft Azure, .NET, Angular, Simulink, Git, Linux, Debian, Ubuntu, Langflow
Programming Languages: Python, CUDA C++, C++, C, C#, JavaScript, TypeScript, HTML, SQL, Assembly, Shell, Bash
Development Tools: VS Code, Visual Studio, GitHub, JupyterLab, Nsight Compute, ComfyUI, MLIR, LLVM