Data Scientist

Location: Vadodara, Gujarat, India

Posted: May 29, 2026

Resume:

Prakash Mangireddygari Data Scientist / ML Engineer

USA +1-551-***-**** Email: *******@***********.*** LinkedIn GitHub Portfolio

Summary

Data Scientist / Machine Learning Engineer with 4+ years of experience supporting end-to-end data science initiatives across enterprise IT, healthcare, and digital transformation environments. Strong foundation in Python, SQL, machine learning, deep learning, and statistical analysis, with hands-on experience assisting in predictive modeling, NLP workflows, and scalable analytics solutions. Skilled in exploratory data analysis, model evaluation, and translating business requirements into data-driven insights under the guidance of senior teams. Experienced working in Agile environments, collaborating with cross-functional stakeholders, and contributing to deployment-ready solutions that improve operational efficiency and decision-making accuracy.

Technical Skills

Programming & Scripting: Python (Pandas, NumPy, SciPy), SQL (Advanced Queries, CTEs, Window Functions), R, Bash

AI / Machine Learning: Machine Learning, Predictive Analytics, Statistical Modeling, Regression, Classification, Clustering, Random Forest, XGBoost, Gradient Boosting, SVM, KNN, Natural Language Processing (NLP), Text Analytics, Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Prompt Engineering, Model Evaluation

AI/LLM: RAG, LLMs, LangChain, OpenAI API, Vector Databases (FAISS/Pinecone/Chroma), Prompt Engineering, NLP, Transformers, Hugging Face, AI Agents

Deep Learning & NLP: TensorFlow, Keras, PyTorch, CNN, RNN, LSTM, Transformer-based Models, BERT, spaCy, NLTK, TF-IDF, Word Embeddings

Data Engineering & Pipelines: ETL Development, Automation, Workflow Orchestration, Apache Spark, PySpark, Data Cleaning, Feature Engineering, Data Validation, Data Quality Checks, CI/CD Pipelines

Cloud & Big Data Platforms: AWS (S3, EC2, Lambda, SageMaker), Azure (Data Factory, ML Studio), Snowflake, Amazon Redshift, Hadoop (HDFS), Distributed Data Processing

Databases & Warehousing: PostgreSQL, MySQL, MongoDB (NoSQL)

Statistical Analysis & Experimentation: Hypothesis Testing, A/B Testing, Probability Models, Time Series Forecasting

Data Visualization: Tableau, Power BI, Matplotlib, Seaborn, Executive Dashboard Development

MLOps & Deployment: Docker, Kubernetes, MLflow, Model Monitoring, Cloud Deployment

Project & Delivery Methodologies: Agile / Scrum, Jira, SDLC, Git, Documentation, Stakeholder Communication

Professional Experience

Data Scientist / ML Engineer, DXC Technology, USA, Sep 2025 - Current

Designed and developed end-to-end machine learning pipelines using Python, SQL, and PySpark to process large-scale enterprise data, improving model performance and scalability.

Built and optimized machine learning, RAG, and LLM-powered solutions using XGBoost, LangChain, OpenAI APIs, and vector databases (FAISS/Pinecone) for predictive analytics, intelligent search, and enterprise knowledge retrieval.

Built and optimized supervised and unsupervised models (XGBoost, Random Forest, clustering) for customer analytics and forecasting, driving measurable improvements in business KPIs.

Architected and deployed NLP solutions using spaCy, transformer-based models, and custom pipelines to extract insights from unstructured data, significantly reducing manual processing effort.

Designed scalable data engineering workflows (ETL) leveraging PySpark, AWS S3, and Snowflake, ensuring high data quality, reliability, and accessibility for analytics and ML use cases.

Conducted advanced statistical analysis and experimentation (A/B testing) to validate model impact and guide data-driven product and business decisions.

Partnered with cross-functional stakeholders (product, engineering, business teams) to define requirements, translate them into ML solutions, and deliver production-ready systems in Agile environments.

Led model deployment, monitoring, and lifecycle management using Docker and cloud platforms, implementing best practices for versioning, drift detection, and performance optimization.

Developed and presented executive-level dashboards and insights using Tableau and Power BI, enabling stakeholders to track model performance and drive strategic decisions.

Data Scientist, Accenture, India, July 2020 - June 2024

Developed predictive analytics models using Python, R, and SQL to support client-facing solutions across finance and enterprise IT domains, improving forecast accuracy by 28% and reducing business risk through data-driven insights.

Developed RAG-based and LLM-powered NLP solutions using LangChain, transformer models, and vector databases to enable semantic search, intelligent document retrieval, and automated question-answering across enterprise datasets.

Deployed scalable ML and AI workflows using Docker, AWS, and CI/CD pipelines, implementing model monitoring, version control, and performance optimization to ensure reliable production operations.

Applied machine learning algorithms including logistic regression, decision trees, and gradient boosting to large transactional datasets, identifying hidden patterns and improving customer segmentation strategies that increased targeted campaign effectiveness by 30%.

Designed and executed feature engineering strategies on high-volume datasets, improving model performance metrics such as precision, recall, and F1-score by up to 25% across multiple projects.

Built automated data pipelines using Python and SQL to perform data extraction, cleansing, and transformation, reducing manual data preparation effort by 50% and ensuring consistent data quality across analytics workflows.

Supported time series forecasting initiatives for demand and capacity planning, applying ARIMA and regression-based models that improved planning accuracy and reduced forecast variance by 20%.

Documented data science workflows, model assumptions, and analytical findings, ensuring knowledge transfer, audit readiness, and long-term maintainability of analytics solutions.

Education

Master's in Computer Science, California State University San Bernardino, CA, USA, Aug 2024 - Dec 2025

Bachelor of Engineering, Saveetha Institute of Medical and Technical Sciences, Tamil Nadu, India, Aug 2018 - July 2022
