Sona Krishnan
Plainsboro, New Jersey **********@*****.*** +1-609-***-**** linkedin github
Education
University of Illinois at Urbana-Champaign
BS in Statistics and Computer Science GPA: 3.53
Aug 2021 – May 2025
Cornell University
Master of Engineering in Computer Science (Part-Time) Aug 2025 – May 2027
New York City, New York
• Relevant Coursework: Data Structures, Numerical Methods, Intro to Algs & Models of Comp, Intro to Computer Systems, Database Systems, Compilers, Data Mining, Artificial Intelligence, Statistical Modeling I & II.
Work Experience
InvisiblCloud Remote
Data Engineer – GenAI & Platform
Jan 2024 – Present
• Built and managed data pipelines using Apache Airflow, orchestrating event-driven and scheduled workflows for multi-omics data processing.
• Designed Airflow DAGs for ETL pipelines, integrating PythonOperator, KubernetesPodOperator, and S3KeySensor to automate ML model training.
• Orchestrated AI/ML workloads on multi-cloud Kubernetes clusters, using Terraform for infrastructure provisioning.
• Implemented the Horizontal and Vertical Pod Autoscalers (HPA and VPA) to dynamically adjust AI model execution resources based on workload demand.
• Configured Ingress controllers for load balancing AI inference requests and ensured secure communication between microservices.
• Developed observability pipelines using Prometheus and Grafana to monitor DAG execution times and pod metrics.
Northrop Grumman Palmdale, California
Data Science Intern
Jun 2023 – Aug 2023
• Designed a material selection recommendation system, increasing engineers’ decision-making speed by 15%.
• Built machine learning models with Keras and Scikit-Learn, achieving 90% accuracy through cross-validation techniques.
• Analyzed key performance indicators through EDA and implemented targeted optimizations reducing error rates by 25%.
Verde Finance Remote
Software Developer Intern
May 2022 – May 2023
• Built ETL pipelines with AWS Glue, reducing data processing time and ensuring seamless data integration.
• Developed financial scoring algorithms, achieving 85% accuracy compared to competitor benchmarks.
• Created serverless back-end systems using AWS Lambda and DynamoDB, improving real-time data processing efficiency.
Projects
Personalized Workout Scheduler SQL, Node.js, React.js GitHub
• Built a full-stack fitness application hosted on Google Cloud Platform with RESTful APIs to enable real-time CRUD operations for personalized workout plans, reducing user task completion time.
• Implemented efficient query optimization and state management for integration between front-end and back-end.
Custom Dynamic Memory Allocator C GitHub
• Developed a custom memory management system in C, focusing on performance and efficient memory usage.
Cardiovascular Risk Prediction Model Python, Scikit-Learn, Pandas GitHub
• Designed and implemented supervised learning models, including SVM, Logistic Regression, and Random Forest, achieving 85.8% accuracy in predicting cardiovascular disease risk.
• Addressed class imbalance using SMOTE and optimized model performance with GridSearchCV.
• Evaluated models with metrics like ROC-AUC, F1-score, and confusion matrices.
Skills
Languages: Python, C, C++, Java, R, SQL, NoSQL, HTML, CSS, JavaScript, C#
Tools & Libraries: AWS (Bedrock, Lambda, Glue, DynamoDB), RAG, Azure, React.js, GraphQL, Flask, Apache Airflow, Spark, Angular.js, Scikit-Learn, PostgreSQL, MongoDB, Node.js, Neo4j, Elasticsearch, LLM Guardrails, Terraform
Certifications: AWS Certified Cloud Practitioner, IBM Professional Machine Learning Certificate