SAINATH DASA
*******.******@*****.*** 203-***-**** LinkedIn GitHub Portfolio
SUMMARY
4+ years of experience as a Data Analyst and AI Engineer, delivering data-driven solutions across retail, healthcare, and energy sectors. Specialized in statistical modeling, ML/Gen AI development, and intelligent automation using Python, SQL, and cloud platforms like AWS and GCP. Proven track record in optimizing data workflows, building LLM-powered applications, and designing interactive dashboards that turn complex data into strategic insights.
PROFESSIONAL EXPERIENCE
Data Engineer, Capgemini Hyd, India. Apr ‘21 – Jul ‘23
Designed and deployed scalable data pipelines using AWS (S3, Lambda, Glue, EC2) and Google BigQuery to streamline data delivery for analytics and ML workflows.
Built and optimized ML models (classification, regression, clustering) using Scikit-learn and TensorFlow, achieving up to 90% accuracy.
Led data preprocessing and feature engineering for ADAS projects involving image, video, and sensor data.
Developed end-to-end ML pipelines including model training, tuning, evaluation, and deployment via Flask APIs and AWS.
Managed centralized data lakes and feature stores to support consistent, scalable access to ML-ready datasets.
Collaborated with data scientists to deliver production-ready datasets using Jupyter, Docker, and Git.
Created interactive dashboards using Power BI, Tableau, and Python libraries for EDA and stakeholder insights.
Contributed to CI/CD automation and followed Agile workflows for seamless data pipeline deployments.
Data Analyst, Amazon Hyd, India. Jul ‘19 – Apr ‘21
Led NLP-driven projects to analyze and structure large-scale text data, improving internal ML models and document understanding systems.
Developed interactive dashboards (Power BI) and automated Excel reports to track data quality, model accuracy, and business KPIs.
Performed EDA using Python (Pandas, NumPy) and SQL; collaborated on training data preparation, labeling standards, and feature validation.
Contributed to Flask API deployment and SageMaker-based NLP prototypes, optimizing transformer models through data coverage analysis and performance evaluation.
PROJECTS
Agentic Stock Trading Assistant Python, LangChain, Streamlit, Excel
Developed an end-to-end AI agent that reads live stock holdings from Excel, analyzes portfolio performance (P&L, ROI, distribution), and delivers daily insights via Streamlit dashboards, including visualizations and rule-based commentary.
Integrated LangChain tools for top/worst performer detection, scheduled daily automation, and a natural-language query tab that allows users to ask investment questions like “Which stock should I sell?”
Generative AI Image Analysis Dashboard Python, Hugging Face, OpenAI, Flask, Power BI
Created a full-stack solution for visualizing AI-generated images using Stable Diffusion and DALL·E, with a backend API built in Flask and real-time inference tracking deployed on AWS EC2.
Designed Power BI dashboards to compare model outputs, visualize prompt evolution, and log inference metrics (model, time, quality) using structured metadata.
Personal Portfolio Website HTML, CSS, JavaScript, SCSS, GitHub
Designed and deployed a responsive portfolio site to showcase technical projects, experience, and GitHub repos using HTML/CSS/JS and automated version control with GitHub.
Features detailed project pages, live demos, and mobile-first design, built & maintained through Visual Studio with auto-deploy Git scripts.
Amazon Alexa Sentiment Analysis Python, NLTK, Scikit-learn
Built a sentiment classification model using Naive Bayes and TF-IDF to analyze Alexa product reviews, with detailed preprocessing, EDA, and visualization.
Achieved strong model performance (precision/recall) and planned real-time deployment using Flask or Streamlit.
TECHNICAL SKILLS
Programming Languages & Tools: Python, SQL, R, Java, C/C++, JavaScript, HTML/CSS, Bash, Git, Docker
Data Science & Machine Learning: Supervised/Unsupervised Learning, Deep Learning, Feature Engineering, Hyperparameter Tuning, Model Evaluation, Time Series (ARIMA), PCA, Predictive Modeling, NLP (NER, Text Classification, Sentiment Analysis), LLM Fine-Tuning, Prompt Engineering, Embeddings, Generative AI (DALL·E, Stable Diffusion), Transformers, RAG
Frameworks & Libraries: Scikit-learn, TensorFlow, Keras, PyTorch, XGBoost, LightGBM, Transformers, Hugging Face, OpenAI API, Pandas, NumPy, Matplotlib, Seaborn, Flask, React
Cloud & DevOps: AWS (S3, EC2, Lambda, SageMaker, Glue), Azure, GCP, Apache Airflow, Jenkins, GitHub, CI/CD
Databases & Visualization Tools: PostgreSQL, MySQL, MongoDB, Power BI, Tableau, Excel, Jupyter Notebook
Other Skills: Data Cleaning/Wrangling, Annotation (Text, Image, Video), OCR, Metadata Tagging, Agile, Workflow, REST API Development
EDUCATION
Wilmington University, DE — M.S. Information Systems & Technologies 3.8/4.0 June ‘25
Relevant Coursework: Networking & Data Communications, Design, Data Modeling, Information Systems Management, IT Project Management
Hindi Mahavidyalaya College, IN — M.Sc. Applied Statistics 8.0/10 May ‘21
Relevant Coursework: Probability Theory, Statistical Inference, Regression Analysis, Time Series Analysis, Multivariate Analysis, Statistical Quality Control
Bankatlal Badruka College, IN — B.Sc. Mathematics, Statistics, and Computer Science 8.6/10 June ‘19
Relevant Coursework: Data Structures and Algorithms, Operating Systems, DBMS, OOP (Java), Software Engineering, Statistical Computing, Linear Algebra
CERTIFICATIONS
Python with Machine Learning Certification — Completed a 6-month professional training program, Feb ‘25
Python for Data Science Bootcamp — Udemy, Feb ‘24
Basic SQL Certification — Coursera, Nov ‘23
Blue Prism Foundation Training — Coursera, Dec ‘22
Workshop on Statistical Analysis Using R — Organized by Hindi Mahavidyalaya College, Feb ‘21