
AI/ML Engineer - Multimodal & NLP Specialist

Location:
Kolkata, West Bengal, India
Posted:
March 31, 2026


Resume:

Anish Sarkar

B.Tech (AIML), Dept. of CSE, Netaji Subhash Engineering College, Kolkata

Email: ***.********@*****.*** · College: ***********.********@****.**.** · GitHub: ans036 · Phone: +91-987*******

EDUCATION

• B.Tech (AI & ML) — Netaji Subhash Engineering College, Kolkata 2021–2025 CPI: 8.77/10

• WBCHSE (XII) 93.2% (2021)

• WBBSE (X) 92.9% (2019)

PUBLICATIONS & MANUSCRIPTS

• Under Review (Interspeech 2026): “Convex Geometric Equilibrium for Audio-Visual Speaker Verification.” Multimodal face-voice fusion via strictly convex optimization; 6.47% EER, zero-shot 1.3% on VoxCeleb1. [GitHub]

• Under Review (JCON): “Biologically Consistent Universal Differential Equations for Tumor–Immune Dynamics.” [GitHub] https://proceedings.juliacon.org/papers/a0c88982993ebd268baecc1a2a70305a

TECHNICAL SKILLS

Languages: Python, Java, Julia, SQL · Skills: Scientific ML, Deep Learning, NLP, Multimodal Fusion, Computer Vision, OCR, Backend Dev, DSA, Vector Databases · Libraries/Tools: TensorFlow, PyTorch, Scikit-learn, FastAPI, PostgreSQL, pgvector, Docker, OpenCV, Node, React, Flask, NumPy

INTERNSHIP EXPERIENCE

• AI/ML Engineer Intern — RAG-Based Agentic Knowledge System Feb 2026 – Present JD Jones

Designed and deployed an end-to-end RAG pipeline (LangChain, pgvector, HNSW, hybrid retrieval + RRF) with FastAPI backend, JWT auth, Docker/PostgreSQL; built agentic enquiry-to-quotation workflow; 90%+ test coverage.

Developed intelligent product selection assistant with semantic search; integrated LangGraph (ReAct), LangFuse (observability), and QDoRA (4-bit quantization) for local SLM optimization.

• System Product Developer Intern — Bongo Shruti (Assistive Reading) [GitHub] Feb 2025 – Dec 2025 Jadavpur University & IIT Kharagpur

Co-developed Bongo Shruti: end-to-end Bengali assistive reading (OCR-to-TTS); user studies with 204 visually impaired students.

Refactored Python/OpenCV pipeline, raising character recognition accuracy from 80% to 95%; poster at EMPOWER 2025 (IIT Delhi).

• Software Developer Intern Jul 2024 – Sept 2024

Jadavpur University

Built LLM-based NLP applications; launched an AI word-building game (123 users). 23% efficiency gain via Transformer + N-gram strategy.

KEY PROJECTS

• COSMIC — Multimodal Audio-Visual Speaker Verification [GitHub] Nov 2025 – Present

Submitted to Interspeech 2026. Multimodal face + voice fusion with provably unique identity embeddings.

Formulated fusion as strictly convex energy minimization ($I = H^{-1}b$, $H \succ 0$), guaranteeing a unique global optimum; non-convex baselines (score fusion, concat, cross-attention) collapse to 25% EER vs. our 6.47%. Integrated KA-Spline geodesic projections (57 parameter reduction), VIB context disentanglement, and cross-attention aggregation atop frozen ArcFace + ECAPA-TDNN backbones (639K trainable params).

Zero-shot cross-dataset: 1.3% EER on VoxCeleb1, 4.8% on CREMA-D (trained on MAV-Celeb only). Emergent spoofing detection AUC=0.903 without supervised spoofing labels. Production-ready: 158 tests, 100% coverage, GPU-accelerated Docker deployment.

• Schizophrenia Classification using EEG [GitHub] Jul 2024 – Oct 2024

Under: Mr. Saikat Biswas (Faculty, CSE, NSEC)

Extracted PSD/DE/SE features; generated 50k scalp heatmaps; ChronoNet CNN achieved 99% accuracy.

CONFERENCE PRESENTATIONS

• Presented: “Modeling Tumor-Immune Dynamics for Optimized Cancer Treatments,” JuliaCon 2025 (Lightning Talk), Carnegie Mellon University, Pittsburgh, July 23, 2025. [GitHub]

• Presented Poster: “Bongo Shruti: Bengali OCR-to-TTS Assistive Reading System,” EMPOWER 2025, IIT Delhi. [GitHub]

ACHIEVEMENTS & INTERESTS

• IBM “Machine Learning Specialist — Associate” badge · Author of poetry book Tales of Grey (verify link)


