Joel Jojo
Tucson, Arizona +1-520-***-**** ************@*****.***
Personal Summary
Data Analyst with strong proficiency in SQL, cloud data warehouses, and dashboard development. Proven track record in automating ETL pipelines to enhance reporting speed and data accessibility by over 25%. Experienced in collaborating with cross-functional teams to translate data into actionable insights. Eager to leverage advanced analytics expertise to drive impactful decision-making at Monzo.
Education
The University of Arizona Aug 2023 - May 2025
MS, Data Science
• GPA: 3.75/4.0
The University of Arizona Aug 2023 - May 2025
Graduate Certificate, Natural Language Processing
• GPA: 4.0/4.0
St. Xavier's College Jun 2018 - Jun 2021
Bachelor of Science, Information Technology
Experience
The University of Arizona Jan 2025 - Present
Lead Researcher, Vision, Systems and Intelligence Lab
• Engineered a multi-modal deep learning pipeline integrating Vision Transformers (ViT) and BERT, enabling joint disease classification and automated report generation from chest X-rays.
• Mitigated class imbalance and improved rare disease detection using distribution-balanced focal loss and stratified sampling, raising F1 scores to as high as 0.78 for key pathologies.
• Drove iterative model enhancements, benchmarking U-Net, Swin UNETR, BioMedCLIP, Clinical-T5, and Flamingo-CXR, achieving BLEU-4 up to 0.35 and ROUGE-L up to 0.42 for report generation.
• Developed robust preprocessing and augmentation pipelines for both imaging and clinical text, optimizing feature extraction and model generalization.
• Delivered a scalable, clinically validated AI framework with automated evaluation tools, reproducible scripts, and state-of-the-art multi-label classification and report synthesis performance.
• Improved report generation quality with retrieval-augmented generation (RAG), retrieving context from 50K+ reports and achieving high retrieval precision on rare pathologies.
The University of Arizona Sep 2024 - Jan 2025
Data Analyst, Biosphere 2
• Built real-time dashboards using Python and Plotly, enhancing self-serve analytics and increasing research team efficiency by 25% through automated reporting and interactive visualizations.
• Engineered ETL pipelines on AWS to ensure data integrity and high availability for ecosystem datasets exceeding 2 million records, supporting ad-hoc analysis and optimizing data accessibility.
• Collaborated cross-functionally with research and IT teams to implement containerized microservices, enabling scalable modular analytics workflows and rapid feature deployment.
LTIMindtree Jun 2021 - Jul 2023
Data Analyst, Paramount Global project
• Maintained and optimized data warehouses (Snowflake, PostgreSQL) and automated ETL pipelines using Python and SQL, supporting real-time analytics for 10M+ event records while reducing data integration latency by 30%.
• Developed interactive self-serve dashboards and backend data APIs for business reporting, facilitating clear communication of actionable insights to non-technical stakeholders.
• Applied cloud-native solutions on Azure and AWS and contributed to infrastructure-as-code practices, enhancing deployment consistency and supporting scalable analytics platforms.
Competitions and Research
Authorship Verification Challenge: Identifying Writers Through Stylometry Mar 2024 - May 2024
Kaggle
• Developed NLP-based models for writer attribution
Movie/TV Show Review Classification and Sentiment Analysis Challenge Oct 2024 - Dec 2024
Kaggle
• Implemented deep learning sentiment classification models
Sentiment Classification Challenge: Predicting Emotions with NLP Oct 2024 - Dec 2024
Kaggle
• Achieved high accuracy in emotion prediction using NLP techniques
Core Skills
• Data Science & Analytics: Predictive modeling, Statistical analysis, Machine learning, NLP, Healthcare analytics, EHR/claims data, Regression, Hypothesis testing, Data mining, Data wrangling
• Programming & Tools: Python, R, SQL, SAS, Scala, Pandas, NumPy, SciPy, Power BI, Tableau, Google Data Studio, Excel, Matplotlib, Seaborn, Plotly
• Cloud Infrastructure & System Design: AWS, Azure, GCP, Docker, Kubernetes, Cloud-native architecture, Distributed systems, Microservices, RESTful APIs
• Full Stack Development: Exposure to React, Node.js, Next.js, and API development; integrating backend analytics with frontend dashboards
• Data Engineering & Management: ETL, Data integration, Data transformation, Data warehousing, Data governance, Data security, Data privacy (HIPAA, PHI, PII), Data modeling
• Business Intelligence & Communication: Dashboarding, Data visualization, Data storytelling, Technical documentation, Stakeholder communication, Cross-functional collaboration, Business-first mindset