Jaswanth Kolisetty
551-***-**** ******************@*****.*** LinkedIn GitHub
Professional Summary
Software & ML Engineer with 2+ years of experience delivering scalable, cloud-native platforms for machine learning, time series analytics, and LLM-powered applications. Proven track record in building ML pipelines, microservices, and backend APIs using Python, FastAPI, Spark, and Docker. Adept at deploying production-grade systems on AWS and GCP. Collaborative, product-minded, and ready to own the full stack from data ingestion to model delivery and analytics UI.
Education
Stevens Institute of Technology August 2023 – December 2024
Master of Science in Computer Science (GPA: 3.9 / 4.00) Hoboken, USA
• Relevant Coursework: Cloud Computing, Database Management Systems, Agile Methods, Knowledge Discovery and Data Mining
SRM University July 2019 – June 2023
Bachelor of Technology in Computer Science (GPA: 3.8 / 4.00) Guntur, India
• Relevant Coursework: Data Science, Web Mining, Database Management Systems, Information Retrieval, Java Programming
Technical Skills
Programming & Databases: Python, Java, R, SQL, Spark SQL, PostgreSQL, NoSQL, MongoDB
Data Engineering: PySpark, Kafka, Spark, Hadoop, Airflow, ETL Pipelines, Data Modeling, Snowflake, Parquet, Avro
Cloud & DevOps: AWS (EC2, MSK, Lambda, ECS, S3, RDS, SNS, SQS), GCP, Docker, Kubernetes, Terraform, CI/CD (GitHub Actions), Git
Machine Learning: Scikit-learn, TensorFlow, PyTorch, BiomedCLIP, MLflow, SageMaker, Vertex AI, Model Monitoring
NLP & GenAI: LLM Fine-tuning, LangChain, HuggingFace, RAG, spaCy, OCR-to-NLP, Prompt Engineering, Sentence Transformers
Backend/API: FastAPI, Flask, RESTful APIs, GraphQL, WebSockets, Microservices
Analytics & Insights: Statistical Modeling, A/B Testing, Power BI, Data Visualization, Business Intelligence Tools
Experience
Trigyan February 2024 – Present
AI/ML & Data Engineer New Jersey, USA
• Designed 5+ production-grade LLM + RAG systems using LangChain, FAISS, and GraphDB; deployed using FastAPI and GraphQL.
• Engineered MLOps pipelines with MLflow to automate deployment on AWS ECS, reducing model downtime by 40%.
• Built a lung cancer segmentation and classification model using PyTorch, achieving 90% accuracy across CT/MRI data.
• Integrated PostgreSQL and S3 using Airflow-based ETL orchestration, enabling near real-time reporting pipelines.
• Architected data models ensuring schema consistency and resilience for analytics across multiple departments.
SRM University January 2023 – June 2023
Research Analyst Guntur, India
• Designed a security model for Hadoop using Hyperledger Fabric, enabling immutable audit trails and enhancing HDFS integrity.
• Engineered a distributed Spark-based implementation of the Apriori algorithm on Hadoop to analyze seasonal retail trends.
• Processed 20M+ transaction logs using PySpark and created interactive Power BI dashboards to present the resulting insights.
• Authored two IEEE papers on data mining and Hadoop architectures, showcasing expertise in large-scale data analytics.
360 Research Foundation January 2022 – December 2022
Data Engineer Intern Guntur, India
• Designed AWS serverless pipelines using Lambda and Step Functions to process large-scale clinical data sets.
• Developed FastAPI-based ETL services with RBAC, retry logic, and S3-based file management for healthcare analytics.
• Achieved 60% efficiency gain in data ingestion through automated cleansing and transformation pipelines.
Projects
RAG Analytics Assistant FastAPI, LangChain, HuggingFace, Transformers, Google Compute Engine
• Created a GenAI-powered document assistant integrating LLaMA3-8B via RAG for context-aware Q&A, deployed using Ollama on GCP Compute Engine for hardware-accelerated inference.
• Enabled fallback logic and semantic search with sentence-transformers, improving response accuracy by 30%.
Big Data & ETL Pipeline Spark, Hadoop, PySpark, PostgreSQL, S3
• Built scalable data pipelines for 100M+ records using PySpark and deployed on AWS.
• Automated data ingestion and transformation, cutting latency for BI teams by 50%.
Document OCR & Key-Value Extraction doctr, PyTorch, FastAPI, Tesseract, spaCy
• Built an OCR pipeline using doctr backed by PyTorch to extract structured fields from scanned clinical forms and IDs.
• Integrated BiomedParse LLM as a fallback for low-confidence OCR output, enabling accurate recovery of biomedical entities.
Certifications
• AWS Certified Solutions Architect – Associate — Mar 2025 to Mar 2028
• LangChain: Chat with Your Data — DeepLearning.AI — Feb 2025
• Multimodal LLaMA 3.2 — DeepLearning.AI — Jan 2025