
Data Science Machine Learning

Location:
Castro Valley, CA
Salary:
100000
Posted:
October 15, 2025


Resume:

APARNA ANANDKUMAR

510-***-**** ****************@********.*** www.linkedin.com/in/aparnaanandkumar

EDUCATION & COURSEWORK

University of California, Berkeley August 2021 - May 2025

● Double Major: B.A. Molecular & Cell Biology with an emphasis in Immunology and Molecular Medicine / B.A. Data Science with an emphasis in Computational Methods in Molecular and Genomic Biology

● Biology Coursework: Molecular Immunology, Molecular Medicine Lab, Computational Molecular & Cell Biology, Functional Neuroanatomy, Genetics Genomics & Cell Biology, General Biology, Biotechnology Field and Industry, Biochemistry & Molecular Biology, General Chemistry, Organic Chemistry, Biology Lab, Chemistry & OChem Lab

● Data Science Coursework: Probability & Statistics, Data Structures, Structure & Interpretation of Computer Programs, Principles & Techniques of Data Science, Calculus II, Linear Algebra, Machine Learning & Data Analytics, Artificial Intelligence, Efficient Algorithms and Intractable Problems

TECHNICAL SKILLS

Bioinformatics Tools: CD-HIT, MMseqs2, Bowtie2, Biopython, DNA Chisel, ESM, AlphaFold, PyMOL, RFdiffusion, AWS (EC2), GCP

Programming & Data Science: Python, Java, TypeScript, SQL, pandas, NumPy, scikit-learn, PyTorch, TensorFlow, FastAPI, RESTful API, React, Docker, Jupyter Notebook, Git/GitHub, VSCode, RStudio, Cursor

Data Analysis & Visualization: Statistical analysis, clustering, alignment, principal component analysis (PCA), k-means clustering, data cleaning, heatmaps, histograms, scatter plots

Wet Lab Skills: Gel electrophoresis, PCR, restriction digest, transformation, transfection, cloning, primer design, ligation, miniprep, liquid inoculation

PROJECTS

Machine Learning Analysis of Gene Expression Data - Python & scikit-learn

● Designed and implemented predictive models (Ridge, Logistic Regression) to quantify and classify gene expression patterns

● Used PCA to condense thousands of features into principal components, revealing biological subpopulations

● Applied K-Means clustering to group cells with similar expression profiles

Neural Network Architectures for Classification and Sequence Modeling - Python & PyTorch

● Built and trained ML models (Perceptron, Regression, CNN, RNN, Attention) from scratch in PyTorch, reaching 97%+ test accuracy on MNIST and 81%+ on multilingual language ID datasets

● Implemented training loops, gradient-based optimization, and efficient batching to handle large datasets

● Applied convolutional layers for image classification and recurrent networks for sequence tasks like variable-length classification

● Developed attention mechanisms and a mini character-level GPT, gaining experience with transformer-based architectures

NGordnet (Google Ngram/WordNet) - Java

● Engineered data structures to organize large datasets of word popularity, enabling fast retrieval with optimized time and memory efficiency

EXPERIENCE

Aikium Inc., Berkeley, CA May 2025 - Current

Bioinformatics Engineer – Computation Team

● Built a full-stack NGS analysis platform with a Next.js frontend and FastAPI backend wrapping my custom Python package (ai-ngs-analysis) to simulate and visualize protein selection experiments

● Managed NGS pipelines processing 200–400M reads per run across 5 analysis modes, achieving on-time results on AWS EC2/GCP

● Develop and apply custom error-correction scripts that resolve premature stop codons and adjust mutated sequences to match reference genomes, generating multiple output files tailored for downstream analyses

● Develop high-quality data visualizations (heatmaps, histograms, statistical plots) and perform statistical analyses using Python and bioinformatics tools to generate interpretable metrics for R&D decision-making and cross-functional use

● Generate, process, and analyze sequencing outputs, collaborating with scientists and engineers to translate results into actionable insights that guide AI-driven design of strong binders for antibody and protein engineering, and support biosynthetic, protein design, and AI/ML teams

● Contribute to refining computational methods to scale and improve sequencing data throughput while ensuring compliance with best practices in data handling and analysis

Guiton Lab, Hayward, CA October 2019 – June 2021

Microbiology Lab Member

● Helped analyze gene expression in Toxoplasma gondii and the localization of ROP28

● Presented insights and critiques on research papers to grow the team's understanding of Toxoplasma and encourage deeper thinking

● Mentored new members during their first month on lab protocols and methods


