Anish Sarkar
B.Tech (AIML), Dept. of CSE, Netaji Subhash Engineering College, Kolkata
Email: ***.********@*****.*** · College: ***********.********@****.**.** · GitHub: ans036 · Phone: +91-987*******
EDUCATION
• B.Tech (AI & ML) — Netaji Subhash Engineering College, Kolkata 2021–2025 CPI: 8.77/10
• WBCHSE (XII) 93.2% (2021)
• WBBSE (X) 92.9% (2019)
PUBLICATIONS & MANUSCRIPTS
• Under Review (Interspeech 2026): “Convex Geometric Equilibrium for Audio-Visual Speaker Verification.” Multimodal face-voice fusion via strictly convex optimization; 6.47% EER, zero-shot 1.3% on VoxCeleb1. [GitHub]
• Under Review (JCON): “Biologically Consistent Universal Differential Equations for Tumor–Immune Dynamics.” [GitHub] https://proceedings.juliacon.org/papers/a0c88982993ebd268baecc1a2a70305a
TECHNICAL SKILLS
Languages: Python, Java, Julia, SQL · Skills: Scientific ML, Deep Learning, NLP, Multimodal Fusion, Computer Vision, OCR, Backend Dev, DSA, Vector Databases · Libraries/Tools: TensorFlow, PyTorch, Scikit-learn, FastAPI, PostgreSQL, pgvector, Docker, OpenCV, Node, React, Flask, NumPy
INTERNSHIP EXPERIENCE
• AI/ML Engineer Intern — RAG-Based Agentic Knowledge System Feb 2026 – Present
JD Jones
Designed and deployed an end-to-end RAG pipeline (LangChain, pgvector, HNSW, hybrid retrieval + RRF) with FastAPI backend, JWT auth, Docker/PostgreSQL; built agentic enquiry-to-quotation workflow; 90%+ test coverage.
Developed an intelligent product selection assistant with semantic search; integrated LangGraph (ReAct), LangFuse (observability), and QDoRA (4-bit quantization) for local SLM optimization.
• System Product Developer Intern — Bongo Shruti (Assistive Reading) [GitHub] Feb 2025 – Dec 2025
Jadavpur University & IIT Kharagpur
Co-developed Bongo Shruti, an end-to-end Bengali assistive reading system (OCR-to-TTS); user studies with 204 visually impaired students.
Refactored Python/OpenCV pipeline, raising character recognition accuracy from 80% to 95%; poster at EMPOWER 2025 (IIT Delhi).
• Software Developer Intern Jul 2024 – Sept 2024
Jadavpur University
Built LLM-based NLP applications; launched an AI word-building game (123 users). 23% efficiency gain via Transformer + N-gram strategy.
KEY PROJECTS
• COSMIC — Multimodal Audio-Visual Speaker Verification [GitHub] Nov 2025 – Present
Submitted to Interspeech 2026. Multimodal face + voice fusion with provably unique identity embeddings.
Formulated fusion as strictly convex energy minimization ($I = H^{-1}b$, $H \succ 0$), guaranteeing a unique global optimum; non-convex baselines (score fusion, concat, cross-attention) collapse to 25% EER vs. our 6.47%. Integrated KA-Spline geodesic projections (57 parameter reduction), VIB context disentanglement, and cross-attention aggregation atop frozen ArcFace + ECAPA-TDNN backbones (639K trainable params).
Zero-shot cross-dataset: 1.3% EER on VoxCeleb1, 4.8% on CREMA-D (trained on MAV-Celeb only). Emergent spoofing detection AUC=0.903 without supervised spoofing labels. Production-ready: 158 tests, 100% coverage, GPU-accelerated Docker deployment.
• Schizophrenia Classification using EEG [GitHub] Jul 2024 – Oct 2024
Under: Mr. Saikat Biswas (Faculty, CSE, NSEC)
Extracted PSD/DE/SE features; 50k scalp heatmaps; ChronoNet CNN achieved 99% accuracy.
CONFERENCE PRESENTATIONS
• Presented: “Modeling Tumor-Immune Dynamics for Optimized Cancer Treatments,” JuliaCon 2025 (Lightning Talk), Carnegie Mellon University, Pittsburgh, July 23, 2025. [GitHub]
• Presented Poster: “Bongo Shruti: Bengali OCR-to-TTS Assistive Reading System,” EMPOWER 2025, IIT Delhi. [GitHub]
ACHIEVEMENTS & INTERESTS
• IBM “Machine Learning Specialist — Associate” badge · Author of poetry book Tales of Grey (verify link)