Maryam Akbarpour
*********.******@*****.*** 650-***-**** linkedin.com/in/maryam-akbarpour
Machine Learning Engineer - Software Engineer (ML Focused)
Results-driven engineer with experience designing, training, and deploying ML systems in production. Skilled in implementing ML models for risk analytics, NLP, and intelligent decision systems, building scalable data pipelines, and fine-tuning transformer models. Passionate about leveraging ML and AI to solve problems in real-world applications.
Technical Skills
Programming: Python, SQL, C++, Bash
AI & ML: Deep Learning, Neural Networks, NLP, LLMs, Regression, Classification, Clustering, Ensemble Methods
Frameworks: PyTorch, TensorFlow, Scikit-Learn, Hugging Face Transformers, PySpark
Data & Tools: Pandas, NumPy, Matplotlib, Plotly, FastAPI, Docker, Git, Google Cloud Platform (GCP)
Work Experience
Machine Learning Engineer at Riskify Inc. (AI risk analytics startup), New York Sep 2023 - Dec 2024
• Designed and implemented end-to-end data pipelines to aggregate and normalize geospatial risk datasets from multiple public sources, creating address-level features used in underwriting analysis.
• Built and optimized ML-based risk scoring models (CatBoost, XGBoost), improving regional hazard score accuracy by 15% and enabling transparent, auditable underwriting decisions.
• Fine-tuned transformer models using LoRA to generate structured, explainable risk narratives, reducing manual review time for underwriters.
• Collaborated with product and data teams to integrate ML outputs into prototype tools for a reinsurance marketplace, supporting transparent risk comparison and portfolio-level coverage analysis.
• Built a FastAPI service to deliver JSON risk reports with configurable parameters and error handling.
• Developed ETL pipelines using Python and SQL, with batch processing and robust exception handling.
Data Scientist at Transparency & Smart City Group, Tehran Mar 2018 - Jan 2019
• Reduced operational costs by 30% and improved fraud detection by processing and normalizing 25K+ financial contracts from two incompatible municipal systems using Python, statistical analysis, and data visualization.
• Enabled contractor performance tracking by designing a scoring algorithm using regression modeling and visualizing insights from 100K+ monthly citizen reports.
• Supported project management activities by defining scope, goals, and deliverables; ensured accountability for a 5-person technical team.
• Conducted 30+ stakeholder meetings, performed feasibility assessments for technical alignment, and authored reports translating analytical results for non-technical decision makers.
Projects
Parallel Computing Application (CUDA)
• Implemented GPU-based algorithms for vector addition and matrix multiplication on the UNITY cluster, achieving up to 116 times faster performance than traditional CPU methods for large matrices. Oversaw matrix operations and ensured consistent output validation across CPU and GPU versions.
Predicting Venue Popularity with Machine Learning (Python)
• Developed regression, k-nearest neighbors, and ensemble models to predict venue popularity based on demographic, event, and location features; applied data analysis techniques to optimize decision-making outcomes.
Stack Overflow Data Processing & Classification (PySpark)
• Engineered Spark ML pipelines to parse, clean, and transform 10 GB of XML data; trained Word2Vec embeddings to boost content categorization and summarization accuracy.
• Developed NLP classifiers in Python to predict Stack Overflow user tags and retention, achieving 87% precision and generating actionable insights to enhance user engagement strategies.
Sentiment Analysis of Yelp Reviews (Python)
• Utilized Natural Language Processing to extract sentiment from Yelp reviews, predicting ratings and identifying polarizing words to provide actionable insights.
Distributed Systems Simulation (C++)
• Created a MapReduce model that mimics a multithreaded application, speeding up data processing through parallelization and dynamic data partitioning. Achieved fault tolerance for up to 2 worker node failures out of 10, validated against Spark.
Optimized CNNs and Pruning on CIFAR-10 (PyTorch)
• Developed and optimized Convolutional Neural Networks (CNNs) on CIFAR-10, including pruning algorithms on ResNet18 that reduce training time while maintaining 91.16% accuracy, and fine-tuned custom architectures with transfer learning to achieve 91% precision and 84% recall.
• Developed and compared one-shot and iterative magnitude-based CNN pruning algorithms, improving training efficiency while reaching up to 91.16% accuracy on ResNet18 with the CIFAR-10 dataset.
Education
The Data Incubator Jan 2024 - Apr 2024
Certified Data Scientist - Fellowship Program
University of Massachusetts Amherst Sep 2021 - May 2023
M.Sc. in Computer Science - Concentration in Data Science
Stanford Continuing Education Program Jan 2020 - Mar 2020
Certified in Programming and Data Analysis
University of Tehran Sep 2010 - May 2015
B.Sc. in Engineering Science - Computational Algorithms and Optimization