Vishnu Reddy Bijjam
AI Engineer & Data Scientist
*******************@*****.*** | +1-607-***-**** | USA | LinkedIn | GitHub

SUMMARY
AI Engineer & Data Scientist with 4 years of experience building and deploying machine learning models and scalable data pipelines in healthcare and finance. Expertise in deep learning, NLP, and real-time analytics using AWS SageMaker, Azure Machine Learning, and PySpark, with strong cloud-native engineering skills on AWS and Azure. Delivered predictive analytics, risk detection, and automated monitoring solutions, integrating outputs into interactive dashboards (Power BI, QuickSight, Tableau) for data-driven decision-making. Certified Databricks Data Engineer Associate, AWS Certified Data Engineer Associate, and AWS Certified Machine Learning – Specialty.

PROFESSIONAL EXPERIENCE
Evidation Health, USA — AI Engineer Dec 2024 – Current
Project: Patient Behavior Analytics & Real-Time Monitoring
• Engineered and deployed end-to-end ML pipelines in AWS SageMaker for wearable, EHR, and survey data to predict chronic condition progression and adherence scoring.
• Designed LSTM and Transformer-based deep learning models for time-series health data, increasing early risk detection accuracy by 32%.
• Built NLP pipelines using BioBERT and spaCy to classify patient notes, extract medical concepts, and detect behavioral shifts in unstructured text.
• Developed real-time ingestion of 400K+ IoT events/day via AWS Kinesis and Lambda into S3, transforming data with PySpark in AWS Glue to produce ML-ready datasets.
• Structured feature engineering workflows with partitioned Delta tables, enabling scalable joins, aggregations, and incremental updates for model training.
• Implemented automated retraining, validation, and deployment via SageMaker Pipelines, including A/B testing and model drift monitoring.
• Delivered QuickSight dashboards accessed by 150+ clinicians, reducing time-to-insight from 2 days to under 30 minutes.
• Maintained 100% HIPAA compliance with auditable ML workflows, integrating explainability tools to support regulatory reviews.
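The time-series modeling work above starts with a preprocessing step worth sketching: slicing raw daily readings into fixed-length sequences, the input shape LSTM and Transformer models consume. This is a minimal illustrative sketch; the function name, window size, and synthetic heart-rate data are assumptions, not taken from the role itself.

```python
import numpy as np

def make_sequences(readings: np.ndarray, window: int = 7):
    """Slice a 1-D series of daily readings into overlapping
    (window, 1) sequences plus next-day targets -- the standard
    supervised framing for LSTM/Transformer time-series models."""
    X, y = [], []
    for i in range(len(readings) - window):
        X.append(readings[i : i + window])
        y.append(readings[i + window])
    # Trailing axis of size 1 = one feature channel per timestep
    return np.asarray(X)[..., None], np.asarray(y)

# Example: 30 days of synthetic resting heart-rate averages
series = 60 + 5 * np.sin(np.linspace(0, 6, 30))
X, y = make_sequences(series, window=7)
print(X.shape, y.shape)  # (23, 7, 1) (23,)
```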
American Express, India — Data Scientist/Data Engineer Aug 2020 – Jul 2023
Project: Financial Risk Analytics & Transaction Surveillance
• Designed and delivered ETL pipelines in Azure Data Factory and IBM DataStage to handle multi-terabyte daily financial transactions from global payment systems, ensuring regulatory compliance and 99.9% data accuracy.
• Built and deployed fraud detection and AML risk scoring models using PySpark MLlib and scikit-learn, boosting anomaly detection rates by 28% while reducing false positives.
• Leveraged Azure Machine Learning for model training, hyperparameter tuning, and deployment across batch and real-time scoring workflows.
• Created feature engineering pipelines in dbt and PySpark to produce model-ready datasets, cutting preprocessing time by 35%.
• Tuned Snowflake performance with clustering, caching, and partitioning strategies, reducing query runtimes by 40% and lowering compute costs.
• Unified 7+ disparate data sources (including SWIFT messages, payment gateways, and trade systems) into a centralized Delta Lake, enabling consistent analytics and model training.
• Designed and launched Power BI dashboards that visualize real-time risk metrics and compliance KPIs, speeding up fraud investigations by 25%.
• Automated CI/CD workflows in Azure DevOps for ADF, dbt models, and ML deployments, ensuring consistent releases across dev, test, and prod environments.
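The anomaly-detection work above can be illustrated with a small unsupervised scorer; the model choice (IsolationForest), synthetic features, and thresholds here are illustrative assumptions, not details from the role.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic transaction features: [amount_usd, hour_of_day]
normal = rng.normal(loc=[50, 13], scale=[20, 4], size=(500, 2))
fraud = rng.normal(loc=[900, 3], scale=[50, 1], size=(5, 2))
X = np.vstack([normal, fraud])

# Fit on known-good traffic; score everything, including injected frauds
model = IsolationForest(contamination=0.01, random_state=0).fit(normal)
scores = model.decision_function(X)  # lower = more anomalous
flags = model.predict(X)             # -1 marks anomalies

print(int((flags[-5:] == -1).sum()), "of 5 injected frauds flagged")
```

Training on known-good transactions and scoring new traffic keeps the false-positive rate tied to the `contamination` parameter rather than to the (unknown) fraud rate.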
PROJECTS
Customer Retention Prediction using Databricks Pipelines & MLflow
• Developed a data processing pipeline in Azure Databricks using PySpark to ingest, clean, and aggregate simulated transactional and customer interaction data from Azure Data Lake, creating ML-ready datasets.
• Trained and tracked churn prediction models (Logistic Regression, XGBoost) with MLflow for experiment logging, hyperparameter tuning, and reproducible workflows.
• Used the MLflow Model Registry for version control and deployed a batch inference process in Databricks Jobs to generate churn scores, validating performance with accuracy and ROC-AUC metrics.

Fake News Detection Using Machine Learning & Deep Learning
• Applied NLP techniques (TF-IDF, word embeddings, transformers) for text preprocessing, feature extraction, and preparation of clean training datasets from multiple news sources.
• Trained and compared Logistic Regression, Random Forest, SVM, LSTM, and BERT models; optimized inference latency via parallel matrix computations, reducing prediction time from 120 ms to 35 ms per input.
• Evaluated models using precision, recall, F1-score, and ROC-AUC, selecting the most balanced, high-performing model for deployment.
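At its core, the churn-prediction project reduces to fitting a classifier and validating it with ROC-AUC. A minimal sketch on synthetic data, assuming illustrative feature names and a logistic ground truth (none of which come from the project itself):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic customer features: e.g. [standardized tenure, standardized spend]
X = rng.normal(size=(1000, 2))
# Assumed churn mechanism: probability falls with tenure, rises with spend
p = 1 / (1 + np.exp(2 * X[:, 0] - X[:, 1]))
y = (rng.random(1000) < p).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"ROC-AUC: {auc:.3f}")
```

In the project this same fit/score step would be wrapped in an MLflow run so parameters, metrics, and the model artifact are logged for reproducibility.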
SKILLS
Programming & Frameworks: Python, SQL, PySpark, Pandas, NumPy, TensorFlow, Transformers, Scikit-learn, Matplotlib
Machine Learning & NLP: Regression, Classification, Clustering, Anomaly Detection, Time Series Forecasting, Deep Learning (LSTM, GRU, Transformer-based models, BERT, BioBERT, RoBERTa), NLP, Model Optimization & Compression, Hyperparameter Tuning
Data Engineering & ETL: Apache Airflow, dbt, Azure Data Factory, AWS Glue, Kafka, Databricks, Delta Lake, Hadoop ecosystem
Data Analytics & Modeling: Data cleaning, validation, KPI tracking, statistical analysis, A/B testing, time series forecasting, predictive modeling, exploratory data analysis
Data Visualization & BI Tools: Power BI, Tableau, Excel, Looker Studio
Cloud Platforms & Databases: Azure Synapse, Azure Blob Storage, AWS S3, Redshift, Snowflake, MySQL, PostgreSQL, MongoDB
DevOps & CI/CD: GitHub Actions, Azure DevOps, Jenkins, Docker, Kubernetes
Collaboration & Monitoring: JIRA, Confluence, Jupyter Notebook, Anaconda

EDUCATION
Master of Science in Data Science, Pace University, USA Sep 2023 – May 2025
Bachelor of Technology in Civil Engineering, Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, India Jul 2017 – Jul 2021

CERTIFICATIONS
AWS Certified Machine Learning – Specialty
AWS Certified Data Engineer Associate
Databricks Certified Data Engineer Associate
DP-700: Microsoft Fabric Data Engineer