Mulugeta Abebe
New York, NY
Phone: +1-678-***-****
Email: *********@*****.***
GitHub: github.com/Semework
LinkedIn: https://www.linkedin.com/in/mulugeta-semework-abebe/
Recent article: LLM workflows made easy: a practical guide
CERTIFIED DATA SCIENTIST / SENIOR ML SOFTWARE ENGINEER
SUMMARY
PhD neuroscientist with 12+ years leading interdisciplinary research and data science initiatives. Skilled at building end-to-end ML pipelines (feature engineering, model training, deployment, monitoring) in Python and SQL. Experienced in LLMs, GNNs, deep learning, and vector search, with real-world applications in biomedical imaging, financial sentiment analysis, and neurophysiology. An effective communicator who bridges data engineering, research, and clinical audiences. Demonstrated impact through peer-reviewed publications, grants, and cross-functional collaboration.
PROFESSIONAL EXPERIENCE
Associate Research Scientist · Columbia University, Department of Neuroscience · New York, NY 01/2011 – Present
• Architected end-to-end ML pipelines for processing high-dimensional neural and imaging datasets, enabling predictive models of cortical neuron and memory responses.
• Led cross-functional teams to develop and validate a customizable multimodal imaging compound (patented), improving anatomical landmark visualization across X-ray, CT, and MRI modalities.
• Spearheaded design and fabrication of a primate MRI RF coil that improved signal-to-noise ratio by ~25% and reduced scan time and cost by ~25%; the coil was adopted center-wide.
• Applied advanced signal processing and ML methods to correlate spatial memory tasks with neural recordings and MRI data.
Senior Software Engineer · DNAnexus · Mountain View, CA 07/2022 – 08/2023
• Developed cloud-native ML pipelines for biomedical data—automating ingestion, preprocessing, and visualization of imaging-derived features.
• Deployed containerized AI inference services (TensorFlow) for genomic and imaging data.
• Assisted engineers on reproducible research, data ETL, and CI/CD best practices for high-volume datasets.
• Collaborated with cross-functional teams to integrate imaging biomarkers into predictive treatment outcome models.
Senior Data Scientist / Consultant, AI & ML · Dots & LabCorp · New York, NY 09/2018 – 07/2019
• Developed NLP and vision-based deep learning models to extract clinical attributes from unstructured imaging reports, enhancing downstream analytics for patient stratification.
• Applied TensorFlow to develop AI solutions for customer segmentation, prediction, and solving computational bottlenecks in biomedical data.
Postdoctoral Research Scientist · Columbia University, Department of Neuroscience · New York, NY 09/2014 – 09/2016
• Applied advanced signal processing and ML methods to correlate spatial memory tasks with neural recordings and MRI data.
• Designed animal training protocols and imaging experiments to map functional circuitry involved in environmental memory.
• Applied deep learning and statistical modeling to cortical datasets; published 8+ first-author papers.
Graduate Student (PhD) · SUNY Downstate Medical Center · Brooklyn, NY
• Published and presented novel methods for integrating electrical microstimulation data with anatomical imaging to inform neuroprosthetic development.
• Designed and implemented a MATLAB GUI to import DICOM and other imaging datasets, perform automated thresholding, edge detection, and generate interactive 3D reconstructions for advanced visualization and analysis.
EDUCATION
• PhD in Neuroscience, SUNY Downstate Medical Center · Brooklyn, NY · 2011
• MS in Neurobiology, Georgia State University · Atlanta, GA · 2000
• Certified Data Scientist, The Data Incubator · 2021
SKILLS & TECHNOLOGIES
Machine Learning & AI
• Deep Learning (CNNs, RNNs), Graph Neural Networks (PyTorch Geometric)
• Natural Language Processing & Generative AI (Hugging Face Transformers)
• Feature Engineering, Hyperparameter Tuning, Model Deployment at Scale
Data Engineering & Infrastructure
• Python (scikit-learn, pandas, NumPy), R, Spark, SQL/NoSQL
• ETL Pipelines, Docker, Kubernetes, Apache Airflow
• Cloud Platforms: AWS (S3, EC2, SageMaker), Snowflake
Biomedical Imaging & Analysis
• MRI/CT/X-ray processing, RF coil design, imaging contrast agents
• Neural data acquisition and experiment protocol design
• Statistical Analysis in Python/R; Data Visualization with matplotlib, Plotly, Tableau
Natural Language & Financial Data
• Sentiment analysis of financial news using LLMs (GPT, Alpaca, BERT)
• Retrieval-augmented generation (RAG), embeddings, vector similarity search
COLLABORATIVE LEADERSHIP
• Agile methodologies, cross-functional team leadership
• Mentorship & coaching of data science practitioners
• Technical presentations to diverse stakeholders
SELECTED PROJECTS & PUBLICATIONS
• GNN regression for molecular lipophilicity (GitHub)
• Financial news sentiment analysis using Alpaca, BERT, GPT (GitHub)
• Francis, J. T. et al. (2022). Similarities Between Somatosensory Cortical Responses… Frontiers in Neuroscience
• Taghizadeh, B. et al. (2020). Reward uncertainty affects information transmission… Communications Biology
PATENT
• Invented a unique imaging compound for enhanced contrast in X-ray, CT, and MRI (PCT Application No. PCT/US2015/034360).