Satya Mamidi
*****.********@*****.*** LinkedIn Medium GitHub Portfolio
Data Scientist with 5 years of experience in healthcare, banking, insurance, and technology. Proven ability to develop and implement data science and NLP techniques. Experience with data visualization, model governance, and security protocols. Strong communication skills, including explaining technical concepts to non-technical audiences. Able to analyze complex datasets, build predictive models, and communicate insights to stakeholders. Worked on a production-ready machine learning project using MLOps practices.
Skills
Programming Languages: Python, Java, R, SQL.
Tools: GitHub, Postman, MongoDB, DynamoDB, Docker, Kubernetes, AWS, GCP, Streamlit, FastAPI, LangChain, Spark, Hadoop, PostgreSQL, MySQL, Airflow, Flask, Chainlit, JIRA, Databricks.
Frameworks/Libraries: TensorFlow, Keras, PyTorch, Hugging Face, NLTK, spaCy, Scikit-Learn, OpenCV, Matplotlib, NumPy, Dask.
Visualization: Tableau, Power BI, Excel Charts, Jupyter Notebook.
Machine Learning: Regression, Classification, Clustering, Logistic Regression, Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, Decision Trees, Random Forest, K-Nearest Neighbors (KNN), SVM, AdaBoost, XGBoost.
Concepts: Attention Mechanism, ETL, Deep Learning, Machine Learning, Pattern Recognition, Large Language Modeling, Prompting, Big Data Analysis, Software Development, Artificial Intelligence, Design Patterns, Code Reviews, System Design, Vector Databases.
Research Skills: Hypothesis Design, Data Creation, Literature Reviews, Experimental Design, Statistical Evaluation, Technical Writing, A/B Testing, SharePoint.
Cloud: AWS, Azure, AWS Redshift, Snowflake, SageMaker, EC2, Lambda.
Soft Skills: Communication, Collaboration, Critical Thinking, Active Learning, Time Management, Creativity, Leadership, Innovation.
Experience
NumPy Ninja – ML Engineer Sep 2022 – Present
Dietitian Chatbot
I used generative AI models from OpenAI and Hugging Face, together with the LangChain framework, to build an engaging chatbot. A retrieval-augmented generation (RAG) approach improved the chatbot's ability to deliver accurate and relevant responses, which enhanced user interaction and access to dietary advice.
Maternal Health
I developed a data pipeline to analyze maternal ultrasounds. The analysis revealed a 30% increase in preeclampsia and helped reduce risk by 15% for high-risk patients. I segmented the data to identify key risk factors, built interactive Power BI dashboards, and communicated significant findings on hospitalization and fetal outcomes to stakeholders.
NumPy Ninja – Data Scientist (Location: Remote)
Prediction of Sepsis
I performed root cause analysis on 2 million records in Apache Hadoop in collaboration with data engineers. I used PySpark for data collection and cleaning, Matplotlib and Seaborn for visualization, and advanced SQL for data manipulation. I implemented an XGBoost model that reached an AUC-ROC of 0.81 and deployed the solution on Heroku using Streamlit.
Predicting Cognitive Impairment in Adult Patients with Diabetes
I worked on an end-to-end MRI data project with 1,008 columns. Using KNN imputation and a decision tree, I achieved 77% accuracy. I then performed feature extraction with a random forest, created age bins to address a lack of correlation, and achieved over 90% accuracy with linear regression.
Apple – Data Scientist (Location: Austin, TX) April 2022 – Aug 2022
Optimized NLP and text analytics pipelines and achieved 99.9% model accuracy in production while employing MLOps practices and reporting performance metrics using Tableau.
Yes Bank – Analyst Nov 2013 – March 2016
I improved risk assessment reliability by 20% by establishing robust data collection and maintenance processes. I also analyzed trends and presented key findings to stakeholders.
Projects
Interview Questions Creator
A FastAPI application that lets users upload PDF files, uses RAG with an LLM to generate questions and answers from them, and saves the results to a CSV file.
Financial Chatbot
A chat interface for a trading bot that responds to user queries by combining data ingestion with a generation model.
Resume ATS
A Streamlit app that lets users upload resumes, analyzes them against job descriptions with generative AI, and returns feedback on suitability and missing keywords along with a tailored resume.
Certifications
- IBM Data Science Professional Certificate (2021)
- LangChain for LLM Application Development
- Introduction to Machine Learning (2021)
- Hands-on Data Engineering
- Generative AI: Working with Large Language Models
Education
Master’s degree in Business Administration (Statistics, Operations Research, Marketing, I.T.), Osmania University, Hyderabad, 2011.
Bachelor of Technology in Electrical and Electronics Engineering, JNTU, Hyderabad, 2008.
Publications
- Data Exploration with LLMs
- Standardization and Normalization
- Parametric and Nonparametric Regularization in ML
- Product Analysis with A/B Testing and Causal Inference: A Practical Approach