Anders Kiss
EMAIL: ************@*****.***
MOBILE: +1-470-***-****
https://www.github.com/anderche
https://www.linkedin.com/in/anders-kiss-92740a25/
https://public.tableau.com/profile/anders#!
PROFILE
A Data Scientist skilled in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), machine learning, deep learning neural networks, data visualization, and building data pipelines that solve business challenges. I hold a Master of Science in Artificial Intelligence, a Master of Science in Biomedical Technology Development & Management, and the AWS (Amazon Web Services) Certified Machine Learning Engineer – Associate certification. My experience includes Python, the Generative-AI frameworks LangChain and LlamaIndex, Computer Vision (CV), Natural Language Processing (NLP), time-series forecasting, fullstack app development, and model development from data cleaning/preprocessing to feature engineering and hyperparameter tuning.
SKILLS
AWS, Azure, Databricks, Django, Docker, FastAPI, Flask, Git, GitHub, HTML5, JavaScript, JSON, Keras, MLflow, Neo4j, NLP, NumPy, Pandas, Power BI, PySpark, Python, R, Ruby on Rails, SageMaker, Scikit-Learn, Snowflake, Spark, SQL, Streamlit, Tableau, TensorFlow
EMPLOYMENT
Data Scientist 06/2024 to 11/2024
Baxter – remote
Built and deployed a Retrieval-Augmented Generation (RAG) LLM chatbot to help service technicians quickly identify error codes when fixing medical devices in the field. The app used FastAPI to handle API requests/responses and Streamlit for the frontend UI. LangChain handled chunking/embedding of service-manual files (.pdf) and context retrieval from a Redis vectorstore database, the Cohere reranker selected the most relevant context, and Azure OpenAI served as the LLM. The application was built in Python, containerized with Docker, and deployed to Azure App Service.
Built and optimized Long Short-Term Memory (LSTM) models to predict revenue using Databricks
Data Scientist 01/2024 to 06/2024
Booz Allen Hamilton – remote
Built a RAG-based LLM application with the LangChain framework to summarize vectorized text datasets (e.g. National Institutes of Health grants and publications); technologies included Python, LangChain, AWS Bedrock (Llama-2, Claude, ChatGPT foundation models), embeddings (AWS Titan/HuggingFace) for text vectorization, Amazon RDS with PGVector, LangChain vectorstore, and Gradio/Streamlit frontends
Used a pretrained LLM (HuggingFace) and NLP (e.g. sentiment analysis) to label the severity level of vaccine adverse events for pharmacovigilance, with results visualized in Power BI
Presented results, data visualizations, and their impact on business use cases to management
Wrote government proposals regarding AI Data-Readiness frameworks
Data Scientist 10/2022 to 01/2024
Gemological Institute of America – remote
Built machine learning (ML) and neural network models for valuing diamonds; steps included hyperparameter tuning, architecture comparisons, and evaluation metrics (accuracy/precision for classifiers, RMSE and R-Squared for regressors)
Designed deep learning neural networks for Computer Vision object detection of diamond clarity (CNNs, ResNet, DenseNet) and for time-series forecasting of instrumentation failure (Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Facebook Prophet models)
Built numerical models with ML algorithms (Random Forest, XGBoost, Support Vector Machine)
Built Power BI / Tableau dashboards to track performance across the organization's regions (Month-over-Month charts, matrix tables, filters, date slicers, etc.)
Extracted data from Snowflake and IBM Cloudant APIs, then preprocessed it and used boto3, S3, and SageMaker for model development
Performed exploratory data analysis to identify outliers and NaN values, compute statistical measures, and produce data visualizations (aggregations, barplots, heatmaps, Pearson correlations, etc.)
Presented management with findings (evaluation metrics, features, algorithms, architectures, graphs)
Consultant, EU Medical Device Regulation 11/2020 to 09/2021
Varian – remote
Tracked international medical device registrations (oncological, embolization nanoparticles) using Python to identify shared requirements and expedite country-specific requirements (regarding Supply Chain, Engineering, Safety/Complaint data, and Marketing documentation)
Licensed the nanoparticles in the European Union (CE-Mark registration) in collaboration with the Engineering, Operations, Marketing, Product Management, and Sales departments
Consultant, EU Medical Device Regulation 09/2019 to 03/2020
Medtronic – remote
Submitted European Medical Device Regulation (MDR) and technical file (CE-Mark) dossiers for nerve-monitoring spinal products
Provided regulatory support in New Product Development meetings pertaining to IEC 60601-1 Basic Safety and Essential Performance, IEC 60601-1-2 EMC, IEC 62366 Usability, IEC 62304 Software deliverables
Regulatory Affairs Specialist 02/2017 to 09/2018
Johnson & Johnson – Irvine, California, U.S.A
Submitted 30x registrations for Europe, China, Japan, Russia, and Brazil for entire product lines
Drafted European Technical Files for Sterilizers and Disinfection capital equipment
Worked with Engineering, Marketing, Quality, and Operations to finalize registration deliverables
Senior Regulatory Affairs Specialist 03/2014 to 10/2015
PENTAX Medical – Montvale, New Jersey, U.S.A.
Wrote Clinical Evaluation Reports for endoscopes, colonoscopes, video processors and ENT devices
Licensed 50+ international product registrations
Performed gap analysis of European CE-Mark compliance (e.g. Restriction of Hazardous Substances)
Contributed to successful FDA Inspections & TÜV audits
EDUCATION
Master of Science, Artificial Intelligence 09/2021 to 09/2022
Dublin Business School - Dublin, Ireland
Coursework
Machine Learning & Pattern Recognition, Deep Learning, Reinforcement Learning, Recommender Systems, Natural Language Processing (NLP), Programming for Data Analysis, Graph Theory, Cognitive & Ethical Dimensions of AI
Dissertation
A comparison study of LSTM and GRU algorithms and a feature analysis of multivariate macroeconomic features pertaining to their capacity to improve accuracy of Bitcoin price predictions. https://anders-kiss.gitbook.io/modeling-digital-assets-with-deep-learning/
Projects
A data acquisition / preprocessing pipeline script that obtains stock market data, creates technical indicators, and persists the data to a database (Microsoft SQL Server). https://github.com/Anderche/data-acquisition-and-preprocessing-pipeline
A content-based recommender system (RS) script. This basic system recommends an Airbnb listing to a user based on the description fields of their historical stays. The code implementation compares the recommendation accuracy of the BERT and TF-IDF algorithms. https://github.com/Anderche/content-based-recommendation-system
An NLP topic modeling application that ingests data from NewsAPI through its Python API to identify the major themes in that day's news headlines. https://github.com/Anderche/Natural-Language-Processing_Topic-Modeling
A Python script that targets communities in the MovieLens database (containing ~38,000 movies) via graph theory (machine learning) algorithms: Label Propagation (community detection) and K-Nearest Neighbors (similarity). https://github.com/Anderche/graph-theory-profitable-movie-ideas
Data Analytics Bootcamp 08/2020 to 10/2020
WeCloudData Online - online - https://www.wedlouddata.com
Power BI, Tableau, SQL, Python (financial analysis, web scraping), Marketing Analytics, Digital Analytics, Supply Chain Analytics, Healthcare Analytics, Business Analytics
Fullstack Web Development Program 01/2019 to 03/2019
Le Wagon - Milan, Italy - https://www.lewagon.org
Ruby on Rails, React.js, HTML5, CSS3, JavaScript, Flexbox, CSS Grid, jQuery, Authentication & Authorization mechanisms, Geocoding (MapBox), Search (Algolia), UX/UI design
Master of Science, Biomedical Technology Development & Mgmt. 08/2009 to 05/2011
Georgetown University - Washington, D.C., U.S.A.
Bioinformatics, Biomaterials, Project Management, Immunology, Pharmacology, Molecular Biotechnology, Preclinical Product Evaluation, Design & Conduct of Clinical Trials, Product Leadership, Legal & IP Issues, Commercial Development, Leadership & Innovation in a Tech Environment
Bachelor of Science, Biological Sciences 08/2005 to 05/2009
Virginia Polytechnic Institute & State University - Blacksburg, Virginia, U.S.A.
PROJECTS
ZAETAE, Inc.
Startup Chile accelerator (Santiago, Chile) 10/2015 to 01/2017
I created an online marketplace to help patients find healthcare providers offering surgical robotics and AI-based diagnostic software; awarded a $30,000 grant from Startup Chile (www.startupchile.org)