Machine Learning Data Science

Location:
Washington, DC
Posted:
February 25, 2025

Resume:

Ujjawal Dwivedi

Washington DC, ***** 202-***-**** Email LinkedIn GitHub

Education

The George Washington University, Washington, DC

Master of Science, Data Science December 2025

Relevant Coursework: Natural Language Processing, Time Series Analysis and Modeling, Deep Learning, Machine Learning

University of Petroleum and Energy Studies, Dehradun, India

Bachelor of Technology, Electronics IOT June 2020

Publications: Image Classifier with Convolutional Neural Network

Relevant Coursework: Statistics, Linear Algebra, Cloud Computing, Data Structures and Algorithms with Python

Leadership

UPES Football Captain

UPES IEEE Student Chapter Vice President

UPES Rubik’s Cube Club President/Founder

Technical Skills & Certifications

Programming: Python, SQL, R, Java, Kotlin

Frameworks & Packages: scikit-learn, PyTorch, Keras, TensorFlow, XGBoost, NLTK, spaCy, BeautifulSoup, Selenium, Apache Hadoop, Apache Spark, pandas, NumPy, Seaborn, ggplot2, tidyverse, dplyr

Software & Tools: Jupyter Notebook, VS Code, PyCharm, RStudio, MySQL, MongoDB, Neo4j, Google BigQuery, Snowflake, Git/GitHub, Jira, Kubernetes, MLflow, AWS EC2, AWS SageMaker, AWS Lambda, AWS S3, AWS RDS, TensorFlow on GCP.

Certifications: Google GROW Data Analytics, Data Science Professional by IBM, Machine Learning by DeepLearning.AI, Applied Data Science by University of Michigan, Linear Algebra and Calculus by Imperial College London.

Work Experience

George Washington University Washington, DC

Graduate Research Assistant Aug 2024 – Jan 2025

Deployed LSTM and Random Forest models for sentiment classification, achieving 90% accuracy, and utilized NLP pipelines (tokenisation, lemmatisation, NER) for text preprocessing to ensure high-quality model inputs.

Utilized LLMs (e.g., BERT, RoBERTa) for advanced text representation and feature extraction and fine-tuned transformer-based models (e.g., BERT, DistilBERT) for sentiment classification, achieving state-of-the-art performance on benchmark datasets. GitHub

Infosys Limited Gurugram, India

Machine Learning Engineer Apr 2022 – Jan 2024

Engineered automated ML pipelines (XGBoost, TensorFlow) for anomaly detection in terabyte-scale test data, deploying via AWS Lambda with 98% precision; reduced manual analysis by 40% and accelerated defect resolution by 3x through real-time alerts.

Architected CI/CD pipelines (GitHub Actions, Docker, Kubernetes) for automated ML deployment, integrating with QA systems to reduce deployment cycles by 25% and ensure zero downtime during updates.

Preprocessed terabyte-scale datasets (scikit-learn, Apache Spark) with Airflow-orchestrated workflows, ensuring SOC2 compliance via AWS KMS encryption and role-based access; enabled 98% data integrity for downstream model training.

Colt Technologies Gurugram, India

Data Analyst (ML & Predictive Analytics) Oct 2020 – Nov 2021

Engineered scalable ETL pipelines (Apache Spark, HDFS, SQL) to automate ingestion/preprocessing of 10M+ records, slashing latency by 35%; operationalized insights via real-time SQL dashboards driving 30% churn reduction in 6 months through targeted retention campaigns.

Developed and deployed a logistic regression model (scikit-learn) with feature engineering and 5-fold cross-validation, achieving 95% accuracy in predicting churn; insights drove personalized retention campaigns, boosting customer retention by 26% and saving $1.2M annually in recovered revenue.
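For readers unfamiliar with the 5-fold cross-validation mentioned above, the splitting step can be sketched in plain Python. This is an illustrative sketch only, not the code used at Colt: the helper name `k_fold_indices` and the sample count of 12 are assumptions, and no shuffling or stratification is applied.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs so each sample is validated exactly once.

    Hypothetical helper for illustration; folds are contiguous and unshuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(12, k=5))
for train_idx, val_idx in folds:
    # A model would be fitted on train_idx and scored on val_idx here.
    print(len(train_idx), val_idx)
```

In practice the per-fold scores are averaged to estimate how the model performs on unseen data.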

Bharti Airtel Gurugram, India

Data Analytics Intern May 2019 – Jul 2019

Designed telecom network dashboards visualizing CDR, latency, and packet loss metrics, optimizing capacity and reducing service downtime by 20%.

Leveraged Wireshark to analyze telecom traffic, identifying SIP/DDoS vulnerabilities and mitigating risks with security teams, boosting compliance and reliability.

Technical and Research Project Experience

Brain Tumor Segmentation (Graduate Research Project) GitHub

Developed and optimized deep learning models (3D U-Net and Residual U-Net) for segmenting multi-compartment gliomas from post-treatment MRI scans, achieving high Dice scores in critical regions like surrounding FLAIR hyperintensities (0.797) and resection cavities (0.61).

Implemented advanced preprocessing and augmentation techniques, including Z-score normalisation, multi-modal MRI stacking, and transformations like zooming and rotation, enhancing model generalisation and segmentation accuracy in complex medical imaging tasks.
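The Dice score and Z-score normalisation named above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the helper names `dice_score` and `zscore_normalise`, the flat-list mask representation, and the toy masks are all assumptions.

```python
from statistics import mean, pstdev


def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def zscore_normalise(voxels):
    """Rescale intensities to zero mean and unit (population) standard deviation."""
    mu = mean(voxels)
    sigma = pstdev(voxels)
    if sigma == 0:
        # A constant volume carries no contrast; map it to zeros.
        return [0.0 for _ in voxels]
    return [(v - mu) / sigma for v in voxels]


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, target))
print(zscore_normalise([1.0, 2.0, 3.0]))
```

A Dice score of 1.0 means the predicted and reference masks overlap completely, and 0.0 means they do not overlap at all.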

Recommendation System Using Spotify Data with Machine Learning and Neo4j (Graduate Research Project) GitHub

Used Neo4j to create a graph database and mapped the relationships between users, songs, genres, and other musical attributes.

Applied KNN to both content-based and collaborative filtering. Implemented evaluation metrics to assess the recommendation system's performance and fine-tuned the KNN models for accuracy and efficiency.

Unbiased Classification of Spatial Strategies in Barnes Maze (Graduate Research Project)

Developed and implemented machine learning models, including KNN, SVM, Decision Trees, CNNs, and reinforcement learning, to classify spatial strategies in the Barnes maze and uncover complex behavioural patterns.

Urban Mobility and Traffic Flow Forecasting Using Advanced Time Series Models (Graduate Research Project)

Designed ARIMA, SARIMA, and Holt-Winters models and engineered PCA/SVD pipelines to forecast traffic flow and pollution trends from geospatial datasets (1M+ GPS points), achieving 92% accuracy in short-term predictions and a 30% gain in model efficiency; the techniques are scalable to battery sensor data for anomaly detection.


