Data Scientist

Location:

Erandwane, Maharashtra, India

Posted:

August 14, 2025

Resume:

Pandurang Jadhav – Data Scientist

+91-845******* **********@*****.*** Website LinkedIn GitHub Medium

Professional Overview:

I am a Data Scientist with 2.5+ years of experience in building AI-driven solutions, developing predictive models, and optimizing data workflows. Skilled in data analysis, deep learning, and automation, I specialize in designing and deploying end-to-end machine learning models using frameworks like TensorFlow, PyTorch, and Scikit-Learn. My expertise includes NLP, Computer Vision, and predictive analytics, along with big data processing using Spark and SQL-based databases. I have hands-on experience in automating data pipelines, web scraping, and deploying AI models using FastAPI, Docker, and cloud platforms like AWS and Azure. With a strong analytical mindset and problem-solving abilities, I am passionate about leveraging AI and ML to drive innovation, enhance decision-making, and improve business processes. I also excel at communicating technical insights to non-technical stakeholders and at leading and collaborating with teams across multiple domains, ensuring seamless integration between data teams, engineers, and business leaders.

Key Skills:

• Programming & ML Libraries: Python, R, Scikit-Learn, PyTorch, TensorFlow, OpenCV, NumPy, Pandas, Dask, Matplotlib, Seaborn, Plotly, Keras, ONNX, XGBoost, LightGBM, Gradient Boosting, AutoML, MLflow, TensorFlow Serving, Hugging Face Transformers, GPT, BERT, GAN, RAN, ANN, CNN, RNN, LSTM, GRU.

• Machine Learning & AI Algorithms: Regression (Linear, Logistic), Classification, Clustering, Anomaly Detection, Time Series Analysis, Recommender Systems, Decision Trees, Random Forest, Gradient Boosting, Bayesian Models, Autoencoders, Reinforcement Learning, Active Learning, Transfer Learning, Graphical Models, Gaussian Processes, Neural Models, Attribute Engineering.

• Database & Data Processing: MySQL, PostgreSQL, MSSQL, MongoDB, Graph DBs.

• Development & Automation: FastAPI, Flask, Django, REST APIs, Apache Airflow, Luigi, Talend, Apache NiFi, Informatica, Jenkins, Git, GitLab CI, CI/CD, Kubernetes, Docker, DevOps.

• Statistical & Analytical Methods: Probability & Statistics, Linear Algebra, Hypothesis Testing, Statistical Models, Forecasting, t-SNE, KNN, PCA, SVM Kernel, Word2Vec, Bagging, Statistical Analysis, Data Munging, Data Visualization (Tableau, Power BI, Matplotlib, Seaborn, Plotly).

• Tools & Platforms: Jupyter, Google Analytics, SAS, Redshift, Snowflake, Apache Spark, Synapse, Oracle, Prophet, ML Inferencing, AI/ML Conferences.

Professional Experience:

Tech Mahindra BPS
ML DevOps Manager, April 2025 to Present
Client: ZS Associates (for McKesson)

• Oversee 20+ ML pipelines across inventory and pricing systems, ensuring high availability, reliability, and performance.

• Manage deployment, monitoring, and optimization of ML workflows on Databricks and Linux-based servers.

• Lead transition of ML models and pipelines from development teams to production, ensuring seamless stakeholder delivery and communication.

• Act as the primary liaison for stakeholders, delivering performance insights and updates on ML pipeline outcomes for business decision-making.

• Design and manage ML data pipelines, integrating with Snowflake, Azure Data Factory, and Data Lake for consistent, scalable data flow.

• Automate repetitive ML data engineering tasks and scheduling scripts, improving operational efficiency and reducing manual effort by 40%.

• Maintain and enhance data scraping systems critical for upstream ML data ingestion and model accuracy.

• Monitor and troubleshoot production ML workflows, offering real-time support to ML developers and data scientists.

• Lead a team of ML engineers, ensuring adherence to best practices in MLOps, documentation, and continuous delivery.

Britsure Insurtech Private Limited
Business Analyst, September 2022 to January 2025 (2 years 5 months)

• Worked on Prober, a web-based investigation system for insurance claims powered by automation. Integrated Python-based automation and web scraping tools (BeautifulSoup, Selenium), and a scalable CRM (Angular, ASP.NET), reducing manual tasks.

• Contributed to Intask, an unsupervised algorithm for optimizing field officer allocation. Used predictive modeling to improve officer selection efficiency by 30%, reducing manual allocation by 40%, and significantly improving investigation turnaround time.

• Designed a lightweight ML model that reduced the time required to generate manual investigation triggers by 80%, making the risk assessment process more data-driven.

• Led the automation of data entry workflows using web scraping techniques, reducing human effort by a projected 75% in a team of 12. Built custom ETL pipelines for data extraction, transformation, and storage in MySQL, streamlining processing.

• Analyzed and improved existing business workflows, making operations more efficient and user-friendly.

• Oversaw server architecture, security protocols, and project management activities.

• Developed strong skills in documentation, presentations, client communication, and problem-solving.

• Focused on operational excellence by continuously analyzing in-house, field, and back-office team workflows along with management processes.

Experimental Work (Side Projects):

Case Studies & Articles:

• From Code to Intelligence – A Business Analyst’s Journey into AI (10-part series)

• Inside eCommerce: From Click to Delivery (11-part series)

Job Description Generation using T5 (LLM Fine-Tuning + Data Engineering)

• Tech Used: Python, Hugging Face Transformers, T5, Pandas, FastAPI, JSONL

• Process: Collected and cleaned 15 lakh+ (1.5 million+) job records; engineered input features (title, skills, experience, location, role); fine-tuned a T5 model for structured JD generation; built an evaluation framework to compare output against human-written JDs; exposed the generation pipeline via FastAPI for integration and testing.

Job Portal Ranking Analysis (Web Scraping + Llama 3.1)

• Tech Used: Python, Selenium, BeautifulSoup, Llama 3.1, Matplotlib, Pandas

• Process: Scraped 200+ job listings across Naukri, LinkedIn, Hirist, and Foundit; extracted ranking factors (skills, experience, location); analyzed platform-specific biases using Llama 3.1; visualized insights through data analytics.

AI-Powered Job Search Automation (Selenium + DeepSeek AI)

• Tech Used: Python, Selenium, DeepSeek AI, BeautifulSoup, Naukri API

• Process: Scraped job listings; extracted key skills from JDs; matched them with the resume using DeepSeek AI; assigned relevance scores; automated job applications via official company websites.

Bird Species Classification (CNN + Transfer Learning)

• Tech Used: Python, TensorFlow, EfficientNetB3, OpenCV, Streamlit, FastAPI

• Process: Collected bird images; preprocessed and augmented the data; fine-tuned EfficientNetB3; achieved 98.5% accuracy; deployed as a Streamlit web app via FastAPI.

AI Chatbot for Cotton Crop Data (NLP + Distillation)

• Tech Used: Python, Hugging Face Transformers, FastAPI, SQLite

• Process: Collected and distilled cotton farming data; fine-tuned a Transformer model; built a chatbot API; optimized responses using domain-specific fine-tuning.

House Price Prediction (Logistic Regression)

• Tech Used: Python, Scikit-Learn, Pandas, Flask, AWS

• Process: Collected real estate data; engineered features (location, market trends); trained a logistic regression model; deployed a REST API for real-time price predictions.

Cat vs. Dog Classification (CNN + Transfer Learning)

• Tech Used: Python, TensorFlow/Keras, OpenCV, FastAPI, Streamlit

• Process: Collected labeled cat/dog images; preprocessed and augmented the data; trained a CNN using transfer learning (ResNet, EfficientNet); achieved high accuracy; deployed as a web app using FastAPI + Streamlit.

Chip Manufacturers vs. AI Companies Stock Change & Market Cap Analysis

• Tech Used: Python, Pandas, NumPy, Matplotlib, yFinance, Plotly

• Process: Scraped stock data of chipmakers and AI firms; analyzed market cap trends and price fluctuations; compared performance using statistical metrics; visualized insights with interactive charts.

IMDB Movie Recommendation System (Collaborative Filtering + NLP)

• Tech Used: Python, Scikit-Learn, TF-IDF, NLTK, Flask

• Process: Collected IMDB movie data; extracted features (genre, ratings, reviews); built a collaborative filtering + NLP-based model; generated personalized movie recommendations.

Animal Classification using YOLO (Object Detection + Deep Learning)

• Tech Used: Python, YOLOv8, OpenCV, PyTorch

• Process: Collected multi-species animal images; labeled and annotated the dataset; trained YOLOv8 for real-time detection; deployed the model for live object recognition.

Academic Background:

• PGDM in Artificial Intelligence, Applied Roots & University of Hyderabad, Hyderabad, June 2023

• BE in Mechanical Engineering, MGM's Jawaharlal Nehru Engineering College, Aurangabad, 62%, December 2020

• 12th (Computer Science), Nutan Mahavidyalaya, Selu, 72%, 2014

• 10th, Nutan Vidyalaya, Selu, 84%, 2012

Certifications & Leadership:

• CutShort [Advanced Data Structures, Advanced Algorithms, Advanced Data Analytics, Python]

• Led and collaborated with three teams (Web Dev: Angular & ASP.NET, Flutter, Python)

• Experience with Geospatial Data (Intask & Infield projects)
