AI Internship Candidate with Data Science Focus

Posted: May 20, 2026

Resume:

Jasmine Christopher

linkedin.com/in/jasmine-christopher github.com/jasmine6789 portfolio R *******@*****.***

Impact-driven Software and Data Engineer with 3+ years of experience building scalable backend systems, large-scale data pipelines, and AI-powered applications, with a proven track record of delivering production-grade solutions across cloud and distributed environments.

Work Experience

Research Assistant January 2026 – Present

Indiana University under Prof. Vivek Astavek Bloomington, IN

• Built Python-based ETL pipelines (PyArrow, Pandas) to process 6.3K+ Parquet files (1.8TB), with schema validation, column profiling, and cross-file consistency checks, using predicate pushdown and column pruning to optimize I/O.

• Built distributed batch workflows on BigRed200 HPC using SLURM job arrays, enabling parallel file-level processing, scalable random and stratified sampling for NLP tasks, and efficient Parquet-to-CSV conversion for downstream model pipelines.

Research Assistant February 2026 – Present

Eskenazi School of Art, Architecture + Design under Prof. Christopher Reinhart Bloomington, IN

• Designed a scalable PostgreSQL schema for INCEpTION and HathiTrust annotation data, hosted on DigitalOcean, enabling efficient querying across multi-source datasets and integration with Jenks-based visualizations for IU-hosted web deployment.

• Configured and deployed the INCEpTION annotation platform on DigitalOcean with a MariaDB backend, enabling concurrent multi-user annotation and scalable corpus processing workflows.

AI/ML Engineer - Founding Team June 2025 – December 2025

riAI Capital Ltd Remote

• Architected and deployed RAG pipelines on AWS (S3, Lambda) for IPS generation, using LlamaIndex for document ingestion and chunking, Pinecone for vector search, and GPT models via AWS Bedrock for grounded, context-aware generation, reducing preparation time by 80%.

• Built a real-time tax optimization system using Kafka streams and Llama-3, designing structured prompts and ranking logic to generate grounded, auditable recommendations, improving decision speed by 40% for 20+ advisors.

• Designed LLM evaluation workflows using RAGAS and curated “gold” datasets (100 samples), measuring faithfulness and answer relevancy across retrieval and prompt configurations.

• Iterated on retrieval and prompt orchestration (hybrid search, BGE embeddings, re-ranking, chunking), improving grounding and reducing hallucinated outputs in complex multi-step queries.

• Collected and preprocessed 10K structured and unstructured financial data samples to support QLoRA-style adaptation workflows and improve retrieval quality and domain alignment.

• Engineered FastAPI/GraphQL integrations across Wealthbox, RightCapital, Schwab, and Black Diamond, supporting 50K+ monthly API requests via AWS API Gateway.

• Implemented zero-trust data governance (IAM, AES-256, audit logging) aligned with SOC2 and GDPR compliance requirements.

Software Engineer January 2023 – July 2024

Mindsprint Chennai, India

• Developed core services for MESLite, an in-house MES platform built on C#/ASP.NET Core and deployed on AWS (EC2, RDS) to replace third-party MES software, estimated to save roughly $50K in annual licensing costs per facility.

• Built low-latency REST/gRPC services in ASP.NET Core to stream live machine telemetry to Angular dashboards, enabling real-time monitoring and reducing unplanned downtime by 35%.

• Migrated plant telemetry ingestion from batch to event-driven pipelines using C# and AWS Lambda, persisting streams to SQL Server (RDS) with CloudWatch monitoring, reducing end-to-end latency from 15–30 minutes to under 10 seconds.

• Designed a batch ETL pipeline using AWS Lambda with S3 staging to ingest and transform XML-based machine and ERP production data, with optimized indexing on high-frequency columns, improving data freshness and reducing manual reconciliation effort by roughly 60%.

• Implemented RBAC and MFA for operator and admin access, with xUnit and Postman API tests in CI, and delivered SSRS reports along with live API-driven dashboards used by plant managers to track KPIs, reducing manual reporting effort by 10 hours per week.

• Deployed to Vietnam as the sole on-site engineer for a live MES rollout, independently diagnosing and resolving production bottlenecks through data analysis (e.g., tank-level downtime patterns), contributing to a 30% productivity improvement.

Technical Skills

Programming Languages: Python, SQL, C#, TypeScript, Quickscript

Databases: PostgreSQL, SQL Server, MySQL, DynamoDB, Neo4j

Analytics & Visualization: Tableau, Power BI, SSRS

Cloud & Distributed Systems: AWS (API Gateway, EC2, RDS, IAM, Bedrock, SageMaker), Docker, Kubernetes, Terraform, SLURM (HPC)

Data Engineering: PySpark, Apache Kafka, Redis, ETL/ELT Pipelines, Data Modeling, Schema Design, Distributed Data Processing

Backend & APIs: FastAPI, .NET Core, Node.js, REST, GraphQL, gRPC

Frontend: Angular, React, HTML5, CSS/SCSS

AI/ML Systems: RAG, LLMs, Prompt Engineering, LangChain/LangGraph, Pinecone, RAGAS, PyTorch, scikit-learn, TensorFlow, Keras

Certifications

AWS Certified Solutions Architect

AWS Certified ML Engineer Associate

DataCamp SQL Associate

Education

Master of Science in Data Science August 2024 – May 2026 (Expected)

Indiana University - Bloomington, USA GPA: 3.7
