Data Scientist

Location:

Eden Prairie, MN, 55344

Posted:

May 26, 2026

Resume:

UMAMAHESWARARAO MAMILLAPALLI

AIML Data Scientist

Email: ***************.***@*******.***

Phone: 763-***-****

PROFESSIONAL SUMMARY

Results-driven Data Scientist with 12+ years of experience in data products, including 8+ years as a Data Scientist working with machine learning algorithms, predictive analytics, statistical modeling, natural language processing (NLP), and AI algorithms, and 4+ years as a Data Analyst. Adept at extracting actionable insights from complex datasets and building data-driven solutions that drive strategic decision-making. Passionate about leveraging advanced analytical techniques and cutting-edge tools to solve real-world problems and optimize business outcomes across various domains.

●Involved in the entire data science project life cycle, from data extraction to machine learning model evaluation and storytelling.

●Wrangled unstructured/structured data to help show senior management that more optimal and faster decisions can be made with the right data

●Strong experience in data science, machine learning and artificial intelligence using different methodologies like Regression, Bayesian, Decision Trees, Random Forests, SVM, Kernel SVM, Naïve Bayes, K-means Clustering, Natural Language processing (NLP) among others

●Scientific thinker with the ability to invent and a track record of thought leadership and contributions

●Strong knowledge in applying big data/advanced analytics to identify and exploit data with positive business impact

●Proven track record of diving into the data to discover hidden patterns using predictive analysis

●Self-starter and able to work interactively and independently with stakeholders. Expertise in managing multiple projects and teams with an excellent track record.

●Experience with SQL queries to perform data analysis, data mapping and data validation of the transformed data output

●Leveraged packages such as scikit-learn, Keras 2.0, TensorFlow, NLTK, SciPy, and Deeplearning4j to develop and evaluate machine learning, deep learning, and NLP models for problem solving.

●Hands-on experience with data manipulation packages such as NumPy, Pandas, and SQLAlchemy, and data visualization packages such as Matplotlib, Seaborn, and Bokeh.

●Experienced in SQL programming and the creation of relational and non-relational databases

●Worked on statistical models to create new theories and products. Used Spotfire and Tableau to create dashboards and visualizations.

●Trained and deployed cloud machine learning models on Google Cloud ML Engine using TensorFlow

●Designed and implemented supervised algorithms such as logistic regression, decision trees, XGBoost, SVMs, and polynomial regression, and unsupervised machine learning algorithms such as K-means clustering, mixture models, hierarchical clustering, and anomaly detection.

●Worked with clients to identify analytical needs and documented them for further use. Identified business problems and provided solutions using data processing, data visualization, and graphical data analysis.

●Mentored junior data scientists and ML engineers on best practices in AI solution design, responsible AI use, and reproducibility, fostering a collaborative and innovative team environment.

●Solid knowledge of mathematics and experience in applying it to technical and research fields, including identifying areas where optimization can improve efficiency.

●Applied statistical methodologies such as A/B testing, hypothesis testing, statistical inference, and parameter estimation to historical data to support decisions on business problems.

●Strong knowledge of big data technologies such as Spark SQL, PySpark, Hive, Sqoop, Flume, and the Ambari console.

●Developed predictive models using Python to predict customer churn and classify customers.

●Leveraged Image Processing techniques for object recognition using Deep Learning techniques.

●Experienced in SQL query optimization, execution plan analysis, and performance tuning.

●Worked on an R Shiny application showcasing machine learning to improve business forecasting.

●Hands-on experience with version control systems such as Git and GitHub

TECHNICAL EXPERTISE

Programming skills

Python (SciPy, NumPy, Pandas, Scikit-learn, TensorFlow, Flask, Matplotlib, Bokeh, Jupyter Notebook), C, R, Shell Scripting, SQL

Machine Learning

Supervised learning, Unsupervised learning, Reinforcement learning, Neural networks and Deep Learning, Markov Models

Math and Statistics

Inferential Statistics, Differential Equations, Intermediate Probability, Stochastic Calculus

Tools and Technologies

MATLAB, Minitab, Microsoft Office Suite, Eclipse, RStudio, Tableau, Spotfire, Hadoop, HDFS, Apache Spark and Hive, Sqoop, ETL tools, QlikView

Databases

SQL Server, Oracle, MySQL, MS Access, NoSQL, HDFS, VLDB

Visualization

Tableau, Matplotlib, Seaborn, Power BI

Cloud computing

AWS (EC2, S3, DynamoDB, Redshift, Aurora, VPC, SQS), Microsoft Azure

ACADEMIC QUALIFICATION:

●St Cloud State University - Master of Engineering Management

PROFESSIONAL EXPERIENCE

Client: Medtronic – Mounds View, MN

Role: AI/ML Data Scientist June 2023 – Present

Key responsibilities:

●Participated in status meetings with clients, conducting reviews of deliverables and defect tracking.

●Designed, developed, and fine-tuned machine learning and deep learning models—including transformer-based LLMs like GPT and BERT—for predictive analytics, automation, and AI-powered decision-making across enterprise-scale applications.

●Engineered robust pipelines for multi-modal data processing, integrating structured SQL datasets, unstructured text, and time series inputs to boost model performance and enable dynamic business decision support.

●Applied advanced NLP techniques including embedding-based retrieval, prompt engineering, and fine-tuning of LLMs to power solutions for document classification, customer feedback analysis, multi-turn reasoning, and sentiment detection.

●Collaborated cross-functionally to convert business needs into AI use cases, delivering interpretable models and automated insights that directly impacted revenue growth, retention strategies, and operational efficiencies.

●Deployed scalable AI/ML solutions using CI/CD pipelines built with GitHub Actions, Jenkins, and Bitbucket, and containerized services using Docker and Kubernetes for production reliability and fault tolerance.

●Leveraged AWS (SageMaker, ECS, S3), Azure, and Snowflake for model training, data storage, and MLOps orchestration, enabling distributed and accelerated model training with PyTorch DDP and TensorFlow.

●Utilized MLflow for model tracking and API deployment; implemented Kubernetes autoscaling and service mesh frameworks to ensure high-availability model serving and inference optimization.

●Performed extensive feature engineering and exploratory data analysis (EDA) using Python (Pandas, NumPy, Scikit-learn) and SQL to uncover trends and build predictive models for customer segmentation, churn, and recommendation engines.

●Built real-time monitoring and alerting solutions using Prometheus, CloudWatch, ELK Stack, and Grafana to maintain SLA adherence and ensure ongoing model accuracy and system uptime.

●Created interactive dashboards and model insight visualizations using Tableau, Streamlit, and Dash to communicate findings clearly to stakeholders, driving informed business strategies.

●Delivered retrieval-augmented generation (RAG) systems and agent-based AI solutions using LangChain and custom prompt chaining to enhance LLM context awareness and personalization.

●Participated in enterprise-wide architecture and data governance review boards to ensure AI system compliance, scalability, and integration with existing data warehouse ecosystems.

Client: Emplify Health – La Crosse, WI

Role: AI/ML Data Scientist July 2021 to May 2023

Responsibilities:

•Worked closely with the marketing team to deliver actionable insights from large volumes of data generated by marketing campaigns and customer interaction metrics such as web portal usage, email campaign responses, public site interaction, and other customer-specific parameters.

•Fine-tuned Large Language Models (LLMs) and developed recommendation engines using embedding-based retrieval and personalization strategies to enhance upsell and cross-sell performance based on historical renewal data.

•Leveraged prompt engineering and transfer learning to adapt pre-trained multi-modal models to customer data, achieving improved performance across customer insight tasks including Net Promoter Score (NPS) analysis and behavior prediction.

•Applied advanced ML techniques such as clustering, linear regression, and decision trees to extract actionable insights from complex business data, enabling revenue optimization for vendors and service improvements.

•Developed multi-GPU training pipelines using PyTorch and TensorFlow, scaling deep learning experiments for high-dimensional and sparse customer datasets to enhance training efficiency and model generalization.

•Performed EDA in Jupyter Notebooks and built interactive Tableau dashboards to translate model insights into executive decision-making tools, directly influencing key product and service metrics.

•Applied real-time outlier detection on streaming customer data using historical patterns, reducing noise and improving ML system reliability in production environments.

•Architected and deployed end-to-end ML workflows on Azure Machine Learning and Microsoft Azure Kubernetes Service (AKS), integrating Docker, MLflow, and model registries to support the MLOps lifecycle from experimentation to inference.

•Developed robust data ingestion and transformation pipelines using Python and SQL, including advanced joins, subqueries, and data mapping documentation to streamline ETL across disparate sources.

•Built and deployed multi-step predictive pipelines for contact scoring and customer classification, leveraging ensemble modeling approaches for robustness and accuracy.

•Created and maintained recommendation systems that identify patterns across products and user segments, incorporating both behavioral and historical transactional data for dynamic upsell suggestions.

Client: AT&T, Dallas, TX.

Role: AI/ML Data Scientist Oct 2019 to June 2021

Responsibilities:

•Promoted to team lead as one of the youngest employees in the company to hold the role; led teams to gather business requirements, build BI applications, and deliver business solutions based on the resulting insights.

•Designed and deployed deep learning models using PyTorch and TensorFlow, contributing to multi-modal systems that process structured and unstructured data, including tabular, video, and text inputs.

•Fine-tuned Large Language Models (LLMs) and Multimodal LLMs (MLLMs) leveraging transfer learning, embedding-based retrieval, and prompt engineering techniques to adapt models to specialized business domains.

•Scaled training and inference pipelines across multi-GPU clusters and distributed compute environments, optimizing resource utilization and reducing model convergence time.

•Integrated MLOps best practices by developing containerized model training and deployment workflows using Docker, Kubernetes, MLflow, and TensorFlow Serving, ensuring reproducibility and CI/CD for ML applications.

•Operationalized Python-based data pipelines to ingest and process large-scale datasets from PostgreSQL, Oracle, and external APIs, automating outputs to AWS S3 for downstream inference and storage.

•Built real-time data extraction and transformation applications, enabling ingestion of high-volume streaming data from platforms such as LYNX, IHS, and SFDC to support video and telemetry-based models.

•Developed high-performance views and SQL scripts for preprocessing and transforming multimodal inputs, ensuring alignment with model expectations for training and inference.

•Led the design and troubleshooting of data pipelines supporting high-throughput video analytics and real-time inference, aligning closely with AI-driven product roadmaps.

•Collaborated with AI researchers and engineers to build and deploy retrieval-augmented generation (RAG) pipelines using vector embeddings and attention mechanisms, boosting LLM response relevance.

•Presented AI-driven solutions to business stakeholders, translating research outputs into product features, and contributing to the organization’s AI/ML IP strategy.

Client: Nielsen, Columbia, MD.

Role: Data Analyst June 2016 to Sep 2019

Responsibilities:

•Gained exposure to various phases of the Software Development Life Cycle using the Agile Scrum development methodology.

•Used Python libraries and MySQL queries and subqueries to create several datasets that produced statistics and tables.

•Used Python to write data into JSON files for testing Django websites. Created scripts for data modeling and data import and export.

•Used the CI/CD tool Jenkins to automate the build process from version control into testing and production environments; managed build results in Jenkins and deployed using workflows.

•Deployed the project through Jenkins using the Git version control system.

•Implemented business logic using Python 2.7. Worked with the HR team to reduce the attrition rate.

•Created data models based on requirements using Python and basic Excel.

•Created a Git repository and added the project to GitHub.

•Performed ETL (Extract, Transform, and Load) of data using Informatica integrated with SQL and Oracle Database.

•Collected, transformed, moved, stored, aggregated, and optimized data using Informatica ETL.

•Optimized Informatica mappings by filtering data at the source, reducing run time by 30%.

•Created cron jobs to trigger Informatica jobs that trigger shell scripts on old servers.

•Worked on Python OpenStack APIs and used NumPy for numerical analysis.

•Deployed AWS Lambda functions that use DynamoDB and KMS keys for encryption/decryption.

•Installed and configured monitoring scripts for AWS EC2 instances to reduce the cost of maintaining infrastructure.

•Managed large datasets using Pandas data frames and MySQL.

•Generated Tableau dashboards from complex SQL queries against multiple databases across datastores.

•Handled the client-side validation using JavaScript.

Client: Calix, California.

Role: SQL Developer / ETL Developer. Jan 2014 to May 2016

Responsibilities:

•Implemented complex business logic with user-defined functions and indexed views; created user-defined data types and clustered and non-clustered indexes

•Installed and configured servers and clients using SQL Server 2008; upgraded SQL databases

•Created complex stored procedures, triggers, functions (UDFs), indexes, tables, views, and other T-SQL code and SQL joins for applications following SQL code standards

•Managed indexes and statistics and tuned queries using execution plans to optimize database performance

•Optimized T-SQL queries using Query Analyzer and the Index Tuning Wizard

•Knowledge of Database Performance monitoring tools

•Created packages in SSIS with error handling and worked with different methods of logging in SSIS

•Designed and coded ETL SSIS packages to process data into target databases

•Performed database transfers, query tuning, integrity verification, data cleansing, analysis, and interpretation. Developed, monitored, and deployed SSIS packages

•Maintained the physical database by monitoring and optimizing performance, data integrity, and SQL queries for maximum efficiency using SQL Profiler

•Monitored SQL error logs, scheduled tasks, database activity, user counts, connections, and locks; eliminated blocking and deadlocks.

•Implemented an automated index maintenance strategy run as a daily SQL job.

•Resolved deadlock issues with the databases and servers in real time.

•Responsible for the design of the SSIS package for updating the data mart from OLTP database and validation of the aggregate data from the ODS and the OLTP Database
