Aryan Rahul Verma
***************@*****.*** | linkedin.com/in/aryan-rahul-verma | github.com/aryanrahulverma

EDUCATION
University of Texas at Arlington – Master of Science (Computer Science), GPA: 3.92/4.0 Aug 2022 - May 2024
Relevant Coursework: Software Engineering, Web Data Management, Information Security, Cloud Computing and Big Data
NMIMS University – Bachelor of Technology (Computer Engineering), GPA: 3.77/4.0 Jul 2018 - May 2022
Relevant Coursework: Data Structures, Algorithms, Object Oriented Programming, Operating Systems
IBM – Honors (Artificial Intelligence and Machine Learning) Sep 2019 - May 2022
Relevant Coursework: Natural Language Processing, Neural Networks, Pattern Recognition

SKILLS
Languages: Python, C++, R, JavaScript, HTML, CSS, PHP, SAS, SQL, NoSQL, Scala
Data Engineering: MySQL, MongoDB, Anaconda, Spark, Kafka, Docker, Kubernetes, Airflow, Snowflake, Databricks, AWS
Web Technologies: ReactJS, AngularJS, Laravel, Git, Figma, Bootstrap, Tailwind, Flask, REST API, Streamlit, JSON
Machine Learning Libraries: TensorFlow, scikit-learn, NumPy, Pandas, Matplotlib, OpenCV, NLTK, MTCNN, PyTorch
Tools and Techniques: Excel, Power BI, SAS Visual Analytics, GitHub, Tableau, VS Code, Agile, Scrum, Microservices

EXPERIENCE
AI Tutor – xAI, Dallas Fort Worth Metroplex Nov 2024 - Present
• Delivered high-quality data to improve Grok by annotating data across domains using proprietary software.
• Contributed high-quality annotation data to RLHF (Reinforcement Learning from Human Feedback) projects.
• Aided in developing a criteria-based verifier model to auto-evaluate and score Grok-generated responses in support of RLAIF (Reinforcement Learning from AI Feedback).
• Assisted in the training of Grok 3, which achieved an Elo score of 1402 in the Chatbot Arena.

Software Development Intern – NodeDa, Dallas Fort Worth Metroplex Sep 2024 - Nov 2024
• Developed a full-stack website with ReactJS, Material UI, and Firebase, enhancing user engagement and achieving a 20% improvement in front-end performance
• Enhanced backend data handling by integrating NoSQL databases, resulting in a 15% increase in data retrieval speed
• Increased project stability by 40% by implementing Agile practices, collaborating on Jira, and making over 310 commits on GitHub against 7-day deadlines, enabling faster, more reliable deployments

Data Researcher – University of Texas at Arlington Jul 2024 - Nov 2024
• Streamlined research by summarizing 20+ LLM jailbreak papers, reducing irrelevant content by 50% and improving research focus
• Automated LLM testing using various APIs (ChatGPT, Claude, Gemini), improving workflow efficiency by 40% and enhancing reliability, leading to faster detection of vulnerabilities and stronger model security via collaborative efforts
• Executed advanced fine-tuning strategies on the LLM, achieving a 15% performance boost, which minimized model vulnerabilities
Machine Learning Intern – Maximus Infoware Pvt. Ltd., India May 2021 - Jul 2021
• Achieved a 74% fraud detection accuracy by developing predictive decision tree models using TensorFlow
• Accelerated training efficiency by 25% by processing 150,000+ transactions using an ETL pipeline with MSSQL and Pandas
• Drafted data insights through 10+ Power BI dashboards, facilitating informed decision-making by stakeholders
• Deployed ReactJS dashboards on AWS EC2 to classify 5 major fraud patterns, adhering to web security best practices

PROJECTS
Forecasting – COVID-19 Cases Forecasting Open-Source (~10 hours)
• Integrated time-series ARIMA forecasting model in a DAG pipeline, achieving 95% accuracy in real-time predictions, enabling faster insights and reducing manual reporting efforts by three hours weekly
• Created interactive visualizations with SAS Visual Analytics, delivering actionable insights through user-friendly dashboards, improving stakeholder response strategies by 40%
MLOps – Attendance Capturing System Published Web-App (~80 hours)
• Orchestrated a real-time MLOps architecture using MTCNN and OpenCV, achieving 93% accuracy in automating facial-recognition attendance capture via Streamlit dashboards and reducing manual intervention by 50%
• Shaped an end-to-end data processing pipeline using Apache Airflow for automating data ingestion, preprocessing, and storage, with DVC for version tracking and AWS S3 for dataset storage
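An Airflow pipeline like the one above can be sketched as the following DAG definition; this is a hedged configuration-style skeleton under the Airflow 2.4+ API, and the DAG id, task names, and stub bodies are hypothetical, not taken from the project.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    # Placeholder: pull raw face images into a staging area (e.g. AWS S3).
    pass


def preprocess():
    # Placeholder: detect/crop faces (MTCNN + OpenCV) and normalize them.
    pass


def store():
    # Placeholder: persist processed data; a DVC commit could version it here.
    pass


with DAG(
    dag_id="attendance_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_ingest = PythonOperator(task_id="ingest", python_callable=ingest)
    t_preprocess = PythonOperator(task_id="preprocess", python_callable=preprocess)
    t_store = PythonOperator(task_id="store", python_callable=store)

    # Linear dependency chain: ingestion -> preprocessing -> storage.
    t_ingest >> t_preprocess >> t_store
```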