Bioinformatics Engineer - Data-Driven Drug Discovery and Risk Analysis

Location: Boston, MA
Posted: April 13, 2026

Resume:

Manju Selvakumaran

Boston, MA, USA | ************.*@************.*** | 617-***-**** | www.linkedin.com/in/binf-manju-selvakumaran | https://github.com/Manju-Selvakumaran | U.S. Work Authorized (F-1 OPT)

Education

Northeastern University Jan. 2024 - Apr. 2026 (Exp.)

MS in Bioinformatics GPA: 3.52/4.0

• Coursework: Intro to Computational Methods in Bioinformatics, Omics in Bioinformatics, Ethics in Biological Research, Algorithms, Biotech Enterprise, Bioinformatics Programming, Statistics for Bioinformatics, Genomics in Bioinformatics, AI & ML in Drug Discovery.

PSG College of Technology May 2022

Bachelor of Engineering in Electrical and Electronics Engineering GPA: 3.40/4.0

Professional Experience

Bioinformatics Engineer, Varosync Jan. 2026 - Present

• Architected a 4-stage pharmacovigilance pipeline resolving FAERS free-text drug names to ChEMBL IDs via text normalization, API lookup, RxNorm crosswalk, and fuzzy matching, achieving an 82.2% resolution rate across 29,642 unique drug entries.

• Deployed a production bioinformatics pipeline on an HPC cluster using SLURM, processing large-scale adverse event datasets from the full historical FAERS archive (84 quarters).

• Scaled data ingestion by migrating from live API calls to a local ChEMBL SQLite dump, eliminating rate-limiting bottlenecks and enabling offline batch processing of millions of drug records.

TA for Intro to Computational Methods in Bioinformatics, Northeastern University Sep. 2024 - Apr. 2026

• Mentored 30+ graduate students through NGS workflows, GATK variant calling, and RNA-seq pipelines via workshops, code reviews, and office-hour debugging sessions, driving measurable improvements in assignment completion rates and reinforcing reproducible research practices across cohorts.

Information Technology Risk Analyst, Ernst & Young - Bengaluru, IN Jul. 2022 - Aug. 2023

• Conducted quantitative risk assessments across 90+ IT controls using data-driven analysis, earning an EY Spot Award for analytical rigor and client impact.

Research Intern, Rhogenites Biotech, National Centre for Biological Sciences – Bengaluru, IN Mar. 2022 - Jun. 2022

• Optimized bacterial culture protocols and validated strain characteristics through molecular assays (qPCR, DNA extraction, gel electrophoresis), supporting therapeutic microbiome research pipeline development.

• Quantified gene expression and protein concentrations across biological replicates, maintaining GLP-compliant electronic records documenting experimental protocols and QC metrics.

Projects

Protein Function Classifier: Machine Learning Solution for Enzyme Classification (GitHub) Dec. 2025 - Jan. 2026

• Built an end-to-end ML pipeline to predict enzyme functional class (EC 1-7) from protein amino acid sequences, achieving 60% accuracy across 7 classes using XGBoost (roughly 4x the accuracy of uniform random guessing).

• Engineered 437 bioinformatics features including amino acid composition, dipeptide frequencies, physicochemical properties, and secondary structure propensities from 5,200+ protein sequences sourced via UniProt REST API.

• Evaluated and compared 4 ML models (Logistic Regression, Random Forest, Gradient Boosting, XGBoost), analyzing precision-recall trade-offs and identifying overfitting using train-test performance gaps.

• Deployed an interactive web application using Streamlit, enabling real-time protein classification with confidence scores and demonstrating full-stack ML development from data acquisition to production deployment.

Cancer Evolution Analysis: Phylogenetic Study of Breast Cancer Subclones (GitHub) Nov. 2025 - Dec. 2025

• Built and containerized an automated single-cell RNA-seq analysis pipeline using Docker, processing 4,992 cells and 33,538 genes and reducing manual analysis time through a Scanpy-based workflow for quality control, normalization, and clustering.

• Discovered 7 distinct cancer subclones through high-resolution clustering and characterized their evolutionary relationships, identifying sister clones differing by only 99 genes and highly divergent pairs with 3,983 differentially expressed genes.

• Quantified evolutionary rates across all subclone pairs, establishing a strong statistical correlation (r=0.931, p<0.001) between phylogenetic distance and transcriptional divergence and validating the computational approach to cancer evolution reconstruction.

• Identified a clinically relevant aggressive phenotype (SC6) through integrated pathway analysis showing elevated EMT (0.255), metastasis (0.474), and ECM remodeling (1.196) scores, and an immunogenic subclone (SC3) with unique immune signatures.

TruBridge Healthcare Data Analytics Externship Aug. 2025 - Oct. 2025

• Analyzed NHANES health survey data (N=1,293) examining BMI-diabetes relationships using Python with non-parametric statistical methods (Kruskal-Wallis, Spearman correlations) and comprehensive visualization.

• Conducted multivariate analysis of insulin-dependent patients (n=327) investigating relationships between BMI, medication patterns, and disease characteristics through hypothesis testing and effect size calculations.

• Produced a comprehensive research report with publication-quality visualizations demonstrating 85% distribution overlap between diabetes groups, challenging conventional assumptions about BMI-diabetes progression.

Technical Skills and Certifications

• Bioinformatics: scRNA-seq, NGS analysis, variant calling, epigenomics (ATAC-seq, ChIP-seq), multi-omics integration, cancer genomics, differential expression, pathway analysis, phylogenetics.

• Programming: Python, R, Bash/Linux, SQL, Nextflow, scikit-learn, XGBoost, Streamlit.

• Tools: BWA, STAR, HISAT2, GATK, DeepVariant, DESeq2, edgeR, Seurat, Scanpy, SAMtools, FastQC, MultiQC, Docker, Git/GitHub, AWS, HPC/SLURM.

• Statistics: Hypothesis testing, ANOVA, regression, non-parametric tests, UMAP, PCA, multiple testing corrections, Monte Carlo simulations.

• Lab: qPCR, DNA/RNA extraction, gel electrophoresis, bacterial culture, spectrophotometry.

• Certifications: Science & Business of Biotechnology (MITx), AlphaFold practical guide (EMBL-EBI), Healthcare Data Analytics (TruBridge).


