Nihal Mallikarjun
LinkedIn ***********.*@************.*** Github +1-857-***-****
EDUCATION
Master of Science in Data Analytics and Engineering June 2025 Northeastern University, Boston, MA
Relevant Courses: Computation and Visualization, Product Development, Neural Networks and Deep Learning, Algorithms
Bachelor of Technology in Computer Science and Engineering May 2023 SRM Institute of Science & Technology, India
Relevant Courses: Artificial Intelligence, Database Management Systems, Compiler Design, Software Development Methodologies
WORK EXPERIENCE
Business Intelligence Analyst – Community Dreams Foundation, USA June 2025 – Present
• Built healthcare benefits reporting workflows supporting insurance plans, eligibility, recommendations, and access insights for 5K+ users.
• Analyzed healthcare operational data to improve data quality, reporting accuracy, and stakeholder decision-making across benefits workflows.
• Translated business requirements into KPI logic, metric definitions, data validation rules, and reporting requirements with product, QA, and engineering teams.
• Partnered with product, QA, and engineering teams in Agile sprints to define requirements, validate AI outputs, and support release readiness through code reviews and pre-deployment checks.
Data Analyst Intern – Stealth, USA Jan 2025 – April 2025
• Built RAG-based AI systems using Python, FastAPI, GPT-4o-mini and vector databases, processing 100K+ document chunks.
• Implemented multimodal retrieval pipelines using CLIP and transformer embeddings for semantic document search.
• Developed data processing pipelines using SQL, Python, and scikit-learn, improving data quality and reporting efficiency by 30%.
Data Engineer – VedaInfo Inc, India July 2022 – July 2023
• Built a large-scale web crawler processing 500K–1M+ URLs/day using AWS Fargate, SQS, Lambda and CloudWatch monitoring.
• Automated metadata extraction and classification using Trafilatura with Pydantic schemas and validation, saving 15 hours/week.
• Implemented duplicate detection and production monitoring to improve reporting reliability and reduce failed metadata records.
• Designed PostgreSQL star/snowflake schemas with Redis caching, improving read API latency from ~500ms to ~200ms.
Business Intelligence Analyst Intern – Tectonas Softsolutions, India Dec 2021 – May 2022
• Built sales forecasting models using Python and scikit-learn to analyze retail demand across 3 states, achieving 85% prediction accuracy.
• Performed feature analysis on qualitative and quantitative datasets to identify demand patterns and improve inventory planning decisions.
• Built Tableau dashboards and automated weekly KPI reports to track retail demand, inventory trends, and business performance.
RESEARCH AND PROJECTS
Multimodal Agentic RAG system
• Built an Agentic RAG chatbot in a 48-hour hackathon with a dual-retrieval architecture (Basic RAG + multi-HyDE) and an intelligent query router that classifies queries across 5 strategies (conversational, simple, complex, image, and tool-call) to optimize retrieval quality while minimizing latency and API costs. Achieved an 80% faithfulness score on grounded responses.
• Built a multimodal pipeline with cross-modal image retrieval and automated table extraction, enabling the system to search, describe, and cite figures, charts, formulas, and tables from documents alongside text, with fallback to text-only RAG on failure.
• Implemented a self-correcting agentic loop with retrieval evaluation, query refinement, selective memory persistence, and conversation history, served end to end via a REST API with a custom frontend and a reproducible single-command evaluation harness.
Automated Visa Application Prediction – Northeastern University, Boston, MA
• Designed an ML workflow to predict visa application outcomes using structured tabular data.
• Built and automated 6 modular pipelines (data ingestion, validation, transformation, model training, evaluation, and deployment) using Scikit-learn, stored them as artifacts and deployed a dockerized FastAPI model service on AWS EC2 with S3 artifact storage.
• Achieved 78% KNN model accuracy and implemented real-time data drift monitoring using Evidently AI.
TECHNICAL SKILLS
• Programming Languages: Python (Advanced), JavaScript, R, SQL, Java
• Libraries/Frameworks: TensorFlow, React.js, Node.js, Express.js, scikit-learn, Tailwind CSS, Matplotlib, Seaborn, Power BI, OpenCV
• Databases/Technologies: MySQL, PostgreSQL, NoSQL (MongoDB, DynamoDB), AWS (S3, EC2, ECR), Docker, JIRA, Tableau, Flask, MS Excel (VLOOKUP, XLOOKUP), GCP, Hugging Face Transformers, Vertex AI, Chroma DB, Pinecone, Snowflake, Pydantic, Trafilatura, Data Quality, KPI Reporting, Data Storytelling, Healthcare Analytics
• Machine Learning: VGG16, Evidently AI, OpenAI API, CLIP, XGBoost, NLP, DVC, MLflow, Keras
ACHIEVEMENTS
• Winner of a hackathon hosted by Community Dreams Foundation for building a Multimodal Agentic RAG system.
• Published "Vehicle Collision Detection and Alert System using YOLO" as an IEEE conference publication.