RIYA SONI
Hoboken, NJ 201-***-**** ******@*******.*** LinkedIn GitHub
EDUCATION
Stevens Institute of Technology, Hoboken September 2024 – May 2026
Master of Science, Computer Science (MSCS) GPA: 3.5
Relevant Coursework: Deep Learning, Natural Language Processing, Computer Vision, Information Retrieval, Artificial Intelligence.
Indus University August 2020 – May 2024
Bachelor of Technology in Computer Science Engineering GPA: 3.9
Relevant Coursework: Data Structures and Algorithms, Machine Learning, Java, Data Science, Natural Language Processing, Probability and Statistical Mathematics, Linear Algebra.
SKILLS
Languages: Python, C++, SQL, Java
AI/ML Frameworks: PyTorch, TensorFlow, LangChain, Keras, Scikit-learn
Cloud & Tools: Amazon Redshift, Docker, Git, Pandas, NumPy
Concepts: NLP, RAG (Retrieval-Augmented Generation), Computer Vision, System Design
EXPERIENCE
Giraffe Media Group Jun 2025 – Present
Data Analyst Intern (AI & Data Engineering) West Palm Beach, Florida
Architected a Text-to-SQL pipeline using LangChain and GPT-4, creating a semantic layer over Amazon Redshift to enable non-technical querying of TB-scale datasets.
Implemented dynamic schema mapping and validation logic in Python to handle schema drift, reducing SQL generation errors by 40% and manual data requests by 90%.
Automated data retrieval and visualization workflows, enabling real-time, self-service data exploration for the Business Development team and streamlining business decision-making.
Freshers Booth Jun 2024 – Aug 2024
Machine Learning Engineer Intern Ahmedabad, India
Fine-tuned BERT-based models using PyTorch for custom entity recognition, achieving a 15% increase in F1-score on domain-specific text data.
Designed automated data cleaning scripts in Pandas that removed duplicates and corrected labeling errors, improving training dataset quality by 20%.
Optimized Python backend processes, resolving over 20 critical bugs and improving system performance by 30% during peak usage while maintaining 99.9% uptime for end-users.
Acencore Technology Jan 2024 – May 2024
AI Engineering Intern Ahmedabad, India
Engineered an automated data preprocessing pipeline using Pandas and NumPy to clean and tokenize raw text inputs, reducing data preparation latency by 20% and ensuring consistent input formatting for NLP models.
Developed RESTful API endpoints using Flask/FastAPI to serve model predictions to the frontend, optimizing payload structure to minimize response time for end-users.
Authored comprehensive technical documentation and unit tests for NLP modules, ensuring 100% test coverage for critical paths and facilitating smoother onboarding for future developers.
PROJECTS
Multi-Cuisine Ingredient Classifier (ResNet/CNN) Python, TensorFlow, Google API Dec 2023
Built a custom image classification pipeline using TensorFlow and Transfer Learning (ResNet50). Implemented dimensionality reduction (PCA) to optimize feature extraction, improving categorization accuracy by 20%.
Established a reliable feature-extraction foundation that improved the performance of the downstream TensorFlow and Keras models used in the analysis.
Sourced ingredient data from diverse cuisines around the world and used the Google Translate API to ensure accurate translations and a consistent data representation.
NGO Portal Java, SQL Mar 2023
Developed a networking platform for NGOs and volunteers, facilitating user profile creation, connection requests, and job listings to enhance collaboration and resource sharing within the nonprofit sector.
SkimLit: Medical Abstract Structuring (NLP) TensorFlow, Tokenization, Embedding, NLP Sep 2023
Reproduced the PubMed 200k RCT paper results by building a multimodal model with Bi-LSTMs and Transfer Learning, achieving 85% accuracy in classifying abstract sentences.