Machine Learning Data Scientist

Location:
Fullerton, CA
Posted:
June 06, 2025

Resume:

Monisha Monali G R

Fullerton, CA 619-***-**** ************@*****.*** linkedin.com/in/monaligr/

PROFILE

• Data Scientist with 2 years of hands-on experience in building and deploying scalable machine learning models, big data pipelines, and cloud-native AI solutions.

• Skilled in Python, PySpark, SQL, and deep learning frameworks like TensorFlow and PyTorch. Proven track record of improving predictive accuracy, automating workflows, and delivering real-time analytics using AWS, Apache Spark, and Hadoop. Strong background in GenAI, LLMs, NLP, and time-series forecasting.

• Adept at collaborating with cross-functional teams to drive data-driven decision-making and streamline end-to-end MLOps pipelines in enterprise environments.

SKILLS

Languages: C, C++, Python, Java, R, Fortran, Perl

Frameworks/Libraries: PySpark, Apache Spark, Apache Hadoop, Databricks, PyTorch, LangChain, RAG

Systems/Tools: Linux, OpenMP, Pthreads, MPI, CUDA, NVIDIA profiling tools, SQL, MySQL, NoSQL, MATLAB

AI/ML: Gen AI, LLMs, NLP (BERT, GPT, LLaMA)

Programming & Scripting: Python (Pandas, NumPy, Scikit-learn, TensorFlow, Keras, PyTorch), SQL

Machine Learning & AI: LSTM, Reinforcement Learning (DQN), SVM, Random Forest, XGBoost, Gradient Boosting, A/B Testing, NLP

Big Data & ETL: PySpark, Hadoop, Google BigQuery, Apache Beam

Cloud & Deployment: AWS (Lambda, SageMaker, API Gateway, Cloud9, CodePipeline, CodeBuild), GCP

Data Visualization & BI: Tableau, Matplotlib, Seaborn

DevOps & MLOps: CI/CD Pipelines, Model Optimization, Hyperparameter Tuning

Tools & APIs: Yahoo Finance API, LangChain, AstraDB, Gemini API, Flask

CERTIFICATIONS

• Generative AI Engineer Certification

Built and deployed AI-driven NLP applications (e.g., AI agents, chatbots) using tokenization, embedding models, and transformers; Capstone: Question-Answering Bot with Gradio, RAG, and Watsonx embedding.

• National Student Data Corps (NSDC) Certificate

Completed “Explorer Transportation Data Science Project” with skills in Data Cleaning, Python, Time Series Analysis, Geospatial Analysis, and Data Visualization.

PROFESSIONAL EXPERIENCE

Data Scientist Feb 2025 – Present

Comcast USA

• Led the development of machine learning algorithms to enhance customer retention and optimize product recommendations, contributing to significant business growth and customer engagement.

• Designed and deployed a deep learning-based LSTM model in TensorFlow/Keras for time-series forecasting, leveraging historical stock data to improve prediction accuracy by 12%.

• Applied quantitative research and analytical thinking to validate data models, ensuring high performance in data analytics pipelines.

• Designed scalable data architecture and ETL pipelines for large datasets in AWS Redshift and Spark, improving query speed by 25%.

• Utilized PySpark and Apache Spark to build data pipelines for efficient data processing, enabling real-time analytics across multiple business functions.

• Delivered insights using Power BI, enabling better status reporting and data-driven decision-making for the executive team.

• Collaborated with engineers and stakeholders to define the scope and vision of data projects, ensuring alignment with business goals.

Data Scientist Sep 2023 – Aug 2024

Virtusa Consulting USA

• Analyzed large datasets from MS SQL Server and MongoDB using SQL and Python. Provided insights into financial performance, supporting budgeting and forecasting activities with detailed analytical reports.

• Applied machine learning algorithms (ARIMA, K-means Clustering) to forecast sales and optimize inventory management. Enhanced inventory accuracy and reduced holding costs by 15%.

• Utilized Apache Spark/PySpark and Hadoop for processing and analyzing large-scale financial data. Implemented data pipelines using Apache Airflow and NiFi for efficient data ingestion and processing.

• Developed and maintained interactive dashboards and reports using Tableau and Qlik Sense. Presented key financial metrics and trends to stakeholders, facilitating data-driven decision-making.

• Deployed data analytics solutions on AWS and Azure, leveraging Docker and Kubernetes for containerization and orchestration. Implemented CI/CD pipelines with Jenkins for continuous integration and deployment.

• Collaborated with finance and IT teams to ensure data accuracy and integration. Utilized tools like JIRA and GitHub for project management and version control.

Data Scientist Jan 2022 – Aug 2022

Cognizant India

• Collaborated with cross-functional teams in Agile sprints, conducting requirement gathering sessions to align the customer churn prediction model with business needs and stakeholder expectations for impactful insights.

• Developed and fine-tuned a customer churn prediction model using Random Forest, XGBoost, and Logistic Regression, achieving 92% accuracy through advanced hyperparameter tuning and feature engineering techniques.

• Conducted extensive data preprocessing on 50,000+ customer records, handling missing values, outlier detection, and encoding categorical variables using Pandas, NumPy, and Scikit-learn for optimal model training.

• Evaluated models using metrics such as ROC-AUC and Precision-Recall with 10-fold cross-validation, improving recall by 15% through iterative testing and hyperparameter optimization using GridSearchCV.

• Deployed the model using Amazon SageMaker and AWS Lambda, integrating real-time API endpoints via API Gateway and ensuring scalable production-ready solutions with secure deployment pipelines.

• Automated CI/CD pipelines with AWS CodePipeline and CodeBuild, incorporating real-time model performance monitoring and automated updates for efficient production management.

PROJECTS

• Dt-core System Enhancement: Optimized CUDA-based dt-core system for Duckiebot, improving AI inference and image processing.

• Matrix Multiplication Optimization: Analyzed and optimized techniques using MPI and OpenMP.

• 1D-Wave Equation Simulation: Modeled string dynamics with finite difference methods.

• Shooting Methods Implementation: Applied numerical methods for physics-based problems.

• Linear Least Squares: Developed a Fortran program for sparse linear least squares solutions.

EDUCATION

M.S. Computational Science (Data Science) Dec 2024

San Diego State University San Diego, CA

B.S. Computer Science 2022

GITAM Deemed University India


