Data Scientist Software Engineer

Location:
New York City, NY
Salary:
150000
Posted:
June 11, 2025

Resume:

Mulugeta Abebe

New York, NY

Phone: +1-678-***-****

Email: *********@*****.***

GitHub: github.com/Semework

LinkedIn: https://www.linkedin.com/in/mulugeta-semework-abebe/

Recent article: LLM workflows made easy: a practical guide

CERTIFIED DATA SCIENTIST / SENIOR ML SOFTWARE ENGINEER

SUMMARY

PhD neuroscientist with 12+ years leading interdisciplinary research and data science initiatives. Skilled at building end-to-end ML pipelines (feature engineering, model training, deployment, monitoring) in Python and SQL. Experienced in LLMs, GNNs, deep learning, and vector search, with real-world applications in biomedical imaging, financial sentiment analysis, and neurophysiology. An effective communicator who bridges data engineering, research, and clinical audiences. Demonstrated impact through peer-reviewed publications, grants, and cross-functional collaboration.

PROFESSIONAL EXPERIENCE

Associate Research Scientist · Columbia University, Department of Neuroscience · New York, NY 01/2011 – Present

• Architected end-to-end ML pipelines for processing high-dimensional neural and imaging datasets, enabling predictive models of cortical neuron and memory responses.

• Led cross-functional teams to develop and validate a customizable multimodal imaging compound (patented), improving anatomical landmark visualization across X-ray, CT, and MRI modalities.

• Spearheaded design and fabrication of a primate MRI RF coil that improved signal-to-noise ratio by ~25% and reduced scan time and cost by ~25%; the coil was adopted center-wide.

• Applied advanced signal processing and ML methods to correlate spatial memory tasks with neural recordings and MRI data.

Senior Software Engineer · DNAnexus · Mountain View, CA 07/2022 – 08/2023

• Developed cloud-native ML pipelines for biomedical data, automating ingestion, preprocessing, and visualization of imaging-derived features.

• Deployed containerized AI inference services (TensorFlow) for genomic and imaging data.

• Assisted engineers with reproducible research, data ETL, and CI/CD best practices for high-volume datasets.

• Collaborated with cross-functional teams to integrate imaging biomarkers into predictive treatment outcome models.

Senior Data Scientist / Consultant, AI & ML · Dots & LabCorp · New York, NY 09/2018 – 07/2019

• Developed NLP and vision-based deep learning models to extract clinical attributes from unstructured imaging reports, enhancing downstream analytics for patient stratification.

• Applied TensorFlow to develop AI solutions for customer segmentation and prediction, and to resolve computational bottlenecks in biomedical data.

Postdoctoral Research Scientist · Columbia University, Department of Neuroscience · New York, NY 09/2014 – 09/2016

• Applied advanced signal processing and ML methods to correlate spatial memory tasks with neural recordings and MRI data.

• Designed animal training protocols and imaging experiments to map functional circuitry involved in environmental memory.

• Applied deep learning and statistical modeling to cortical datasets; published 8+ first-author papers.

Graduate Student (PhD) · SUNY Downstate Medical Center · Brooklyn, NY

• Published and presented novel methods for integrating electrical microstimulation data with anatomical imaging to inform neuroprosthetic development.

• Designed and implemented a MATLAB GUI to import DICOM and other imaging datasets, perform automated thresholding and edge detection, and generate interactive 3D reconstructions for advanced visualization and analysis.

EDUCATION

• PhD in Neuroscience, SUNY Downstate Medical Center · Brooklyn, NY · 2011

• MS in Neurobiology, Georgia State University · Atlanta, GA · 2000

• Certified Data Scientist, The Data Incubator · 2021

SKILLS & TECHNOLOGIES

Machine Learning & AI

• Deep Learning (CNNs, RNNs), Graph Neural Networks (PyTorch Geometric)

• Natural Language Processing & Generative AI (Hugging Face Transformers)

• Feature Engineering, Hyperparameter Tuning, Model Deployment at Scale

Data Engineering & Infrastructure

• Python (scikit-learn, pandas, NumPy), R, Spark, SQL/NoSQL

• ETL Pipelines, Docker, Kubernetes, Apache Airflow

• Cloud Platforms: AWS (S3, EC2, SageMaker), Snowflake

Biomedical Imaging & Analysis

• MRI/CT/X-ray processing, RF coil design, imaging contrast agents

• Neural data acquisition and experiment protocol design

• Statistical Analysis in Python/R; Data Visualization with matplotlib, Plotly, Tableau

Natural Language & Financial Data

• Sentiment analysis of financial news using LLMs (GPT, Alpaca, BERT)

• Retrieval-augmented generation (RAG), embeddings, vector similarity search

COLLABORATIVE LEADERSHIP

• Agile methodologies, cross-functional team leadership

• Mentorship & coaching of data science practitioners

• Technical presentations to diverse stakeholders

SELECTED PROJECTS & PUBLICATIONS

• GNN regression for molecular lipophilicity (GitHub)

• Financial news sentiment analysis using Alpaca, BERT, GPT (GitHub)

• Francis, J. T. et al. (2022). Similarities Between Somatosensory Cortical Responses… Frontiers in Neuroscience

• Taghizadeh, B. et al. (2020). Reward uncertainty affects information transmission… Communications Biology

PATENT

• Invented an imaging compound for enhanced contrast in X-ray, CT, and MRI (PCT Application No. PCT/US2015/034360).


