Vishnu Reddy Bijjam
AI Engineer & Data Scientist
*******************@*****.*** | +1-607-***-**** | USA | LinkedIn | GitHub

SUMMARY
AI Engineer & Data Scientist with 4 years of experience building and deploying machine learning models and scalable data pipelines in healthcare and finance. Expertise in deep learning, NLP, and real-time analytics using AWS SageMaker, Azure Machine Learning, and PySpark, with strong cloud-native engineering skills on AWS and Azure. Delivered predictive analytics, risk detection, and automated monitoring solutions, integrating outputs into interactive dashboards (Power BI, QuickSight, Tableau) for data-driven decision-making. Certified Databricks Data Engineer Associate, AWS Certified Data Engineer Associate, and AWS Certified Machine Learning – Specialty.

PROFESSIONAL EXPERIENCE
Evidation Health, USA — AI Engineer Dec 2024 – Current
Project: Patient Behavior Analytics & Real-Time Monitoring
• Engineered and deployed end-to-end ML pipelines in AWS SageMaker for wearable, EHR, and survey data to predict chronic condition progression and adherence scoring.
• Designed LSTM and Transformer-based deep learning models for time-series health data, increasing early risk detection accuracy by 32%.
• Built NLP pipelines using BioBERT and spaCy to classify patient notes, extract medical concepts, and detect behavioral shifts in unstructured text.
• Developed real-time ingestion of 400K+ IoT events/day via AWS Kinesis and Lambda into S3, transforming data with PySpark in AWS Glue to produce ML-ready datasets.
• Structured feature engineering workflows with partitioned Delta tables, enabling scalable joins, aggregations, and incremental updates for model training.
• Implemented automated retraining, validation, and deployment via SageMaker Pipelines, including A/B testing and model drift monitoring.
• Delivered QuickSight dashboards accessed by 150+ clinicians, reducing time-to-insight from 2 days to under 30 minutes.
• Maintained 100% HIPAA compliance with auditable ML workflows, integrating explainability tools to support regulatory reviews.
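The time-series modeling work above starts with a preprocessing step worth sketching: slicing raw daily readings into fixed-length sequences, the input shape LSTM and Transformer models consume. This is a minimal illustrative sketch; the function name, window size, and synthetic heart-rate data are assumptions, not taken from the role itself.

```python
import numpy as np

def make_sequences(readings: np.ndarray, window: int = 7):
    """Slice a 1-D series of daily readings into overlapping
    (window, 1) sequences plus next-day targets -- the standard
    supervised framing for LSTM/Transformer time-series models."""
    X, y = [], []
    for i in range(len(readings) - window):
        X.append(readings[i : i + window])
        y.append(readings[i + window])
    # Trailing axis of size 1 = one feature channel per timestep
    return np.asarray(X)[..., None], np.asarray(y)

# Example: 30 days of synthetic resting heart-rate averages
series = 60 + 5 * np.sin(np.linspace(0, 6, 30))
X, y = make_sequences(series, window=7)
print(X.shape, y.shape)  # (23, 7, 1) (23,)
```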
American Express, India — Data Scientist/Data Engineer Aug 2020 – Jul 2023
Project: Financial Risk Analytics & Transaction Surveillance
• Designed and delivered ETL pipelines in Azure Data Factory and IBM DataStage to handle multi-terabyte daily financial transactions from global payment systems, ensuring regulatory compliance and 99.9% data accuracy.
• Built and deployed fraud detection and AML risk scoring models using PySpark MLlib and scikit-learn, boosting anomaly detection rates by 28% while reducing false positives.
• Leveraged Azure Machine Learning for model training, hyperparameter tuning, and deployment across batch and real-time scoring workflows.
• Created feature engineering pipelines in dbt and PySpark to produce model-ready datasets, cutting preprocessing time by 35%.
• Tuned Snowflake performance with clustering, caching, and partitioning strategies, reducing query runtimes by 40% and lowering compute costs.
• Unified 7+ disparate data sources (including SWIFT messages, payment gateways, and trade systems) into a centralized Delta Lake, enabling consistent analytics and model training.
• Designed and launched Power BI dashboards that visualize real-time risk metrics and compliance KPIs, speeding up fraud investigations by 25%.
• Automated CI/CD workflows in Azure DevOps for ADF, dbt models, and ML deployments, ensuring consistent releases across dev, test, and prod environments.
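The anomaly-detection work above can be illustrated with a small unsupervised scorer; the model choice (IsolationForest), synthetic features, and thresholds here are illustrative assumptions, not details from the role.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic transaction features: [amount_usd, hour_of_day]
normal = rng.normal(loc=[50, 13], scale=[20, 4], size=(500, 2))
fraud = rng.normal(loc=[900, 3], scale=[50, 1], size=(5, 2))
X = np.vstack([normal, fraud])

# Fit on known-good traffic; score everything, including injected frauds
model = IsolationForest(contamination=0.01, random_state=0).fit(normal)
scores = model.decision_function(X)  # lower = more anomalous
flags = model.predict(X)             # -1 marks anomalies

print(int((flags[-5:] == -1).sum()), "of 5 injected frauds flagged")
```

Training on known-good transactions and scoring new traffic keeps the false-positive rate tied to the `contamination` parameter rather than to the (unknown) fraud rate.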
PROJECTS
Customer Retention Prediction using Databricks Pipelines & MLflow
• Developed a data processing pipeline in Azure Databricks using PySpark to ingest, clean, and aggregate simulated transactional and customer interaction data from Azure Data Lake, creating ML-ready datasets.
• Trained and tracked churn prediction models (Logistic Regression, XGBoost) with MLflow for experiment logging, hyperparameter tuning, and reproducible workflows.
• Used the MLflow Model Registry for version control and deployed a batch inference process in Databricks Jobs to generate churn scores, validating performance with accuracy and ROC-AUC metrics.

Fake News Detection Using Machine Learning & Deep Learning
• Applied NLP techniques (TF-IDF, word embeddings, transformers) for text preprocessing, feature extraction, and preparation of clean training datasets from multiple news sources.
• Trained and compared Logistic Regression, Random Forest, SVM, LSTM, and BERT models; optimized inference latency via parallel matrix computations, reducing prediction time from 120 ms to 35 ms per input.
• Evaluated models using precision, recall, F1-score, and ROC-AUC, selecting the most balanced, high-performing model for deployment.
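At its core, the churn-prediction project reduces to fitting a classifier and validating it with ROC-AUC. A minimal sketch on synthetic data, assuming illustrative feature names and a logistic ground truth (none of which come from the project itself):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic customer features: e.g. [standardized tenure, standardized spend]
X = rng.normal(size=(1000, 2))
# Assumed churn mechanism: probability falls with tenure, rises with spend
p = 1 / (1 + np.exp(2 * X[:, 0] - X[:, 1]))
y = (rng.random(1000) < p).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"ROC-AUC: {auc:.3f}")
```

In the project this same fit/score step would be wrapped in an MLflow run so parameters, metrics, and the model artifact are logged for reproducibility.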
SKILLS
Programming & Frameworks: Python, SQL, PySpark, Pandas, NumPy, TensorFlow, Transformers, Scikit-learn, Matplotlib
Machine Learning & NLP: Regression, Classification, Clustering, Anomaly Detection, Time Series Forecasting, Deep Learning (LSTM, GRU, Transformer-based models, BERT, BioBERT, RoBERTa), NLP, Model Optimization & Compression, Hyperparameter Tuning
Data Engineering & ETL: Apache Airflow, dbt, Azure Data Factory, AWS Glue, Kafka, Databricks, Delta Lake, Hadoop ecosystem
Data Analytics & Modeling: Data cleaning, validation, KPI tracking, statistical analysis, A/B testing, time series forecasting, predictive modeling, exploratory data analysis
Data Visualization & BI Tools: Power BI, Tableau, Excel, Looker Studio
Cloud Platforms & Databases: Azure Synapse, Azure Blob Storage, AWS S3, Redshift, Snowflake, MySQL, PostgreSQL, MongoDB
DevOps & CI/CD: GitHub Actions, Azure DevOps, Jenkins, Docker, Kubernetes
Collaboration & Monitoring: JIRA, Confluence, Jupyter Notebook, Anaconda

EDUCATION
Master of Science in Data Science, Pace University, USA Sep 2023 – May 2025
Bachelor of Technology in Civil Engineering, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, India Jul 2017 – Jul 2021

CERTIFICATIONS
AWS Certified Machine Learning – Specialty
AWS Certified Data Engineer Associate
Databricks Certified Data Engineer Associate
DP-700: Microsoft Fabric Data Engineer