
Machine Learning Engineer

Location:
Dallas, TX, 75201
Posted:
May 28, 2025


OBJECTIVE

I am a highly skilled Machine Learning Engineer with 6+ years of experience designing, developing, and deploying machine learning models. My objective is to apply my expertise in data science, statistical analysis, and AI technologies to deliver innovative solutions.

PROFILE SUMMARY

• 6+ years of experience designing, developing, and deploying machine learning models, artificial intelligence solutions, and advanced analytics for real-world applications.

• Expertise in Python, R, Java, and Scala for developing scalable machine learning pipelines and data workflows for high-performance applications.

• Hands-on experience with frameworks and libraries such as TensorFlow, PyTorch, Scikit-learn, Keras, OpenCV, and XGBoost for model building and optimization.

• Strong capabilities in data cleaning, preprocessing, feature engineering, and dimensionality reduction using Pandas, NumPy, and Spark to enhance data quality for analysis.

• Well-versed in Hadoop, Apache Spark, and Hive for distributed data processing and large-scale model training in big data environments.

• Extensive experience with cloud platforms including AWS (SageMaker, Lambda), Google Cloud (Vertex AI, BigQuery), and Azure (Machine Learning Studio) for cloud-based machine learning deployments.

• Skilled in deploying machine learning models into production environments using Docker, Kubernetes, and REST APIs for seamless integration.

• Expertise in MLOps practices and tools such as Git, Jenkins, and CI/CD pipelines to automate workflows and ensure reproducibility of machine learning models.

• Strong experience in data visualization with tools such as Tableau, Power BI, and Matplotlib/Seaborn to present actionable insights to stakeholders.

• Proficient in building real-time streaming applications using Kafka, Flink, and Spark Streaming to process high-throughput data.

• Advanced knowledge of graph-based machine learning and network analysis using NetworkX and PyTorch Geometric for complex data structures.

• Experienced in building and fine-tuning language models such as BERT, GPT, and XLNet for a wide range of NLP applications.

• Expertise in edge AI deployment for resource-constrained environments, using tools like TensorFlow Lite and ONNX Runtime for efficient model deployment on mobile and IoT devices.

• Proficient in integrating machine learning models into production systems using frameworks like Flask and FastAPI to develop scalable APIs for end users.

• Proficient in time-series forecasting using ARIMA, SARIMA, and deep learning models such as LSTMs and GRUs to predict future trends in financial, healthcare, and sales data.

• Skilled in using natural language generation (NLG) models such as OpenAI GPT-3, T5, and BART for automating content generation and summarization tasks.

• Adept at deploying models in production environments with Kubeflow, MLflow, and TensorFlow Serving for seamless model serving and management.

• Experience with knowledge graph construction and ontology modeling using Neo4j, GraphDB, and SPARQL to extract actionable insights from relational and non-relational data sources.

• Expertise in building and deploying chatbots and virtual assistants using Dialogflow, Rasa, and Botpress for enhanced customer interaction and support.

• Skilled in working with big data tools like Apache Hadoop, Apache Flume, and Apache Kafka to stream, process, and analyze data in real time.

• Familiar with AutoML platforms like Google AutoML, H2O.ai, and TPOT to automate model selection, hyperparameter tuning, and feature engineering.

• Experienced in using Snowflake, Redshift, and Azure Synapse Analytics for cloud-based data warehousing and ML-driven analytics pipelines.

EDUCATION

East Texas A&M University, Master's in Business Analytics, USA, 2023 – 2025

Jayasree Vuppalapati

Machine Learning Engineer

469-***-**** *************@*****.***

TECHNICAL SKILLS

• Programming Languages: Python, R, Java, C++, Scala, SQL

• Machine Learning Frameworks: TensorFlow, Keras, PyTorch, Scikit-learn, XGBoost, LightGBM

• Deep Learning: CNN, RNN, GANs, BERT, GPT-3

• Natural Language Processing (NLP): BERT, GPT, XLNet, spaCy, NLTK, Gensim, TextBlob, Word2Vec, fastText

• Data Preprocessing: Pandas, NumPy, Matplotlib, Seaborn, SciPy, OpenCV, Alteryx

• Big Data & Distributed Computing: Apache Hadoop, Apache Spark, Apache Kafka, Google Dataflow, Databricks

• Model Deployment & Integration: Docker, Kubernetes, Flask, FastAPI, TensorFlow Serving, ONNX Runtime

• Cloud Platforms: AWS (SageMaker, Lambda, EC2, S3), Google Cloud (Vertex AI, BigQuery), Azure (Machine Learning, Databricks, OAS)

• Version Control & CI/CD: Git, GitHub, GitLab, Jenkins, Docker Compose, Kubernetes (CI/CD pipelines)

• ML Operations (MLOps): Kubeflow, MLflow, TensorFlow Extended (TFX), DataRobot

• Data Pipelines & ETL: Apache Airflow, Apache Nifi, AWS Glue, Talend, Informatica, Google Cloud Dataflow

• Computer Vision: OpenCV, YOLO, ResNet, Mask R-CNN, Object Detection, Semantic Segmentation

• Data Storage and Management: MySQL, PostgreSQL, MongoDB, Cassandra, HBase, NoSQL

• Parallel & Distributed Computing: Apache Spark, Dask, Horovod, TensorFlow Distributed

• Business Intelligence & Analytics: Tableau, Power BI, KPI Development, LOD Expressions, DAX, Advanced Visual Analytics (Filters, Drill-downs, Storytelling Dashboards), OFSAA/OAS (Oracle Financial Services Analytical Applications / Oracle Analytics Server)

• Collaboration & Reporting: Stakeholder Communication, Cross-Functional Collaboration, Executive Dashboards, Strategic Insights

WORK EXPERIENCE

Phillips 66, Houston, Texas, USA
Machine Learning Engineer, Dec 2024 – Present

OBJECTIVE: Phillips 66 is a leading American multinational energy company that operates as an integrated downstream energy provider, focusing on refining, midstream, chemicals, and marketing & specialties segments. Built robust and scalable data pipelines to ingest, transform, and deliver data to machine learning models, and identified, extracted, and transformed features from raw data to improve model performance and accuracy.

Key Responsibilities and Achievements:

• Designed and implemented machine learning models to forecast energy demand, optimize refinery operations, and improve supply chain efficiency using TensorFlow, PyTorch, and Scikit-learn.

• Developed NLP-based solutions to automate the processing of maintenance logs and technical reports, utilizing BERT, GPT, and custom text classification models to enhance operational insights.

• Processed large-scale sensor and operational data from refinery units using Hadoop, Apache Spark, and HBase, enabling predictive analytics for equipment failure and energy optimization.

• Built robust ML pipelines for real-time data ingestion, preprocessing, and model serving using Apache Airflow, Kubernetes, and MLflow, ensuring efficient lifecycle management of predictive models.

• Created real-time alert systems for anomaly detection in pipeline operations using Apache Kafka, Apache Flink, and Spark Streaming, improving response times and operational safety.

• Deployed scalable ML models on AWS using services such as SageMaker, Lambda, and S3, ensuring high availability and secure handling of industrial data.

• Utilized Pandas, NumPy, SQL, MongoDB, and Cassandra for advanced analytics, anomaly detection, and storing large-scale time-series data from sensors and energy monitoring systems.

• Ensured compliance with energy-sector regulations by incorporating data governance policies and privacy standards such as GDPR and CCPA into machine learning workflows.

• Developed insightful dashboards and visual analytics for plant performance, energy consumption, and emissions using Tableau, Power BI, and Plotly to support data-driven decision-making.

• Used Docker and Kubernetes for containerizing and orchestrating models across distributed environments, enabling scalable deployment across various production sites.

• Applied reinforcement learning to optimize refinery operations such as fuel blending and energy consumption in dynamic environments.

• Used GraphQL and REST APIs to integrate predictive services into existing control systems.

• Built predictive maintenance models using XGBoost and LightGBM, reducing unplanned downtime and improving equipment reliability across Phillips 66's energy infrastructure.

• Integrated edge AI capabilities using TensorFlow Lite and ONNX Runtime to enable real-time inference on IoT devices monitoring refinery equipment and pipeline performance.

• Implemented MLOps practices with Git, Jenkins, Kubeflow, and MLflow to ensure CI/CD for model development, versioning, and deployment in production energy systems.

Environment: TensorFlow, PyTorch, Scikit-learn, NLP, BERT, GPT, Hadoop, Apache Spark, HBase, Pandas, NumPy, SQL, GDPR, CCPA, Tableau, Power BI, Plotly, Docker, Kubernetes, AI, GraphQL, REST APIs, XGBoost, LightGBM, MongoDB, Cassandra, Git, Jenkins, Kubeflow, MLflow, CI/CD.
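The streaming anomaly-detection work above can be illustrated with a minimal sketch. This rolling z-score detector is a simplified stand-in for the Kafka/Flink/Spark Streaming pipelines actually used; the class name, parameters, and sample readings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev

class ZScoreDetector:
    """Flag a sensor reading as anomalous when it lies more than
    `threshold` standard deviations from the rolling mean of the
    last `window` readings. A toy sketch, not a production detector."""

    def __init__(self, window=30, threshold=3.0):
        self.buf = deque(maxlen=window)   # rolling window of recent readings
        self.threshold = threshold

    def update(self, reading):
        """Ingest one reading; return True if it looks anomalous."""
        is_anomaly = False
        if len(self.buf) >= 2:            # need at least two points for stdev
            mu, sigma = mean(self.buf), stdev(self.buf)
            if sigma > 0 and abs(reading - mu) > self.threshold * sigma:
                is_anomaly = True
        self.buf.append(reading)
        return is_anomaly

# Hypothetical pipeline-pressure readings: steady values, then a spike.
detector = ZScoreDetector(window=5, threshold=3.0)
flags = [detector.update(r) for r in [10.0, 11.0, 10.0, 11.0, 100.0]]
# Only the final spike is flagged.
```

In a real deployment the `update` call would sit inside the stream-processing operator, with the flag emitted to an alerting topic.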

McKesson Corporation, Irving, Texas, USA
Machine Learning Engineer, Dec 2023 – Nov 2024

OBJECTIVE: McKesson Corporation is a leading American healthcare company specializing in pharmaceutical distribution, medical supplies, and healthcare technology solutions. Analyzed and preprocessed large, often complex datasets to prepare them for model training, ensuring data quality and addressing potential biases, and deployed trained models into production environments, making them available for real-time use.

Key Responsibilities and Achievements:

• Developed machine learning models using Python, TensorFlow, and Scikit-learn to predict medication demand, optimize inventory levels, and reduce drug wastage across pharmacy networks.

• Processed large-scale healthcare datasets using Apache Spark, Hadoop, and Hive to enhance operational efficiency in pharmaceutical distribution.

• Built end-to-end data pipelines with Apache Airflow for automating ingestion, transformation, and model training tasks, improving accuracy in demand forecasting for medical supplies.

• Leveraged AWS (SageMaker, EC2, S3) and Azure Machine Learning Studio for scalable training and deployment of predictive analytics models used in clinical supply chain decision-making.

• Employed Tableau, Power BI, and Plotly to create dashboards for visualizing patient data trends, delivery metrics, and inventory flow, facilitating data-driven decisions for stakeholders.

• Used Kafka and Apache Flink for real-time monitoring of pharmaceutical transactions, enhancing fraud detection and ensuring regulatory compliance in drug distribution.

• Utilized Docker and Kubernetes to containerize and deploy AI models in production, ensuring consistency and scalability across various healthcare applications.

• Integrated CI/CD pipelines with Jenkins and GitHub Actions to automate model validation, testing, and deployment into production for real-time prescription verification systems.

• Designed and optimized healthcare data storage systems using PostgreSQL, MongoDB, and Cassandra to enable fast querying of patient and drug interaction data.

• Followed infrastructure-as-code practices using Terraform for managing cloud environments, and implemented Ansible for configuration consistency across dev, test, and production clusters.

• Conducted Natural Language Processing (NLP) using BERT and GPT to extract insights from patient feedback, incident reports, and medical documentation, aiding in service improvement.

• Applied graph-based analytics using Neo4j and NetworkX to map drug interactions, patient referrals, and supplier networks for better operational visibility.

• Ensured data privacy and model compliance with HIPAA, aligning machine learning solutions with healthcare regulatory standards and patient data protection laws.

• Integrated FastAPI and Flask to deploy lightweight RESTful APIs for real-time access to drug availability, prescription validation, and inventory tracking across healthcare systems.

• Developed time-series forecasting models using Prophet and LSTM networks to predict pharmaceutical supply demands and reduce delivery bottlenecks across hospital networks.

Environment: Hadoop, Spark, Git, GitHub, Bitbucket, Tableau, Power BI, Apache Kafka, TensorFlow, PyTorch, Scikit-learn, Keras, AWS SageMaker, Google AI Platform, Jenkins, CI/CD, Terraform, Ansible, Docker, Kubernetes, MySQL, MongoDB, PostgreSQL.
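The demand-forecasting work above can be sketched with a minimal baseline. The actual models used Prophet and LSTM networks; this stdlib-only simple exponential smoothing example only illustrates the one-step-ahead forecasting idea, and the sample demand numbers are made up.

```python
def exp_smooth_forecast(series, alpha=0.3):
    """One-step-ahead forecast via simple exponential smoothing: an
    exponentially weighted level, where `alpha` controls how quickly
    older observations are discounted."""
    level = series[0]
    for observation in series[1:]:
        level = alpha * observation + (1 - alpha) * level
    return level

# Hypothetical weekly units shipped for one medication.
weekly_demand = [120.0, 132.0, 128.0, 141.0, 137.0]
next_week_estimate = exp_smooth_forecast(weekly_demand, alpha=0.5)
```

A higher `alpha` tracks recent demand swings more aggressively; a lower `alpha` smooths them out, which is the same bias/variance trade-off the production models tune with more parameters.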

Citibank (Mu Sigma Business Solutions Pvt Ltd.), Bangalore, India
Machine Learning Engineer, Jan 2021 – Aug 2023

OBJECTIVE: Citibank, the primary banking subsidiary of Citigroup Inc., is a leading global financial institution whose comprehensive range of financial products and services includes investment banking, securities brokerage, transaction services, and wealth management. Researched and implemented appropriate ML algorithms and tools for specific banking applications, and designed the architecture of ML systems, including data pipelines, model training processes, and deployment strategies.

Key Responsibilities and Achievements:

• Developed machine learning models to enhance credit scoring, fraud detection, and risk management, leading to improved lending accuracy and reduced financial losses.

• Implemented real-time transaction monitoring systems using Apache Kafka, Apache Flink, and Spark Streaming, enabling prompt detection of anomalies and fraudulent activities.

• Leveraged Hadoop, Apache Spark, and Hive to process and analyze large-scale customer and transaction data, optimizing financial decision-making and product personalization.

• Built advanced NLP models using BERT and GPT for automating financial document analysis, client sentiment classification, and chatbot improvements across banking services.

• Designed and deployed interactive dashboards using Power BI, Tableau, and Plotly to track KPIs related to loan performance, customer churn, and investment portfolio health.

• Utilized Google BigQuery, Snowflake, and Amazon Redshift for building scalable data warehouses and analytical platforms to support data-driven strategies in investment banking and wealth management.

• Constructed automated pipelines using Apache Airflow and deployed models using MLflow and Kubeflow, streamlining MLOps practices for continuous integration and model lifecycle management.

• Developed and deployed time-series forecasting solutions using ARIMA, LSTM, and Prophet models to predict financial trends and support portfolio optimization.

• Containerized ML models with Docker and managed deployments using Kubernetes across cloud environments including AWS, GCP, and Azure, ensuring scalability and reliability.

• Integrated APIs using REST, GraphQL, and FastAPI for secure and seamless communication between internal banking applications and customer-facing platforms.

• Implemented data governance and security compliance frameworks in alignment with GDPR, CCPA, and banking regulations to ensure responsible handling of sensitive financial data.

• Utilized Terraform to provision cloud infrastructure and Ansible for configuration management, ensuring smooth deployment and scalability of machine learning models across global banking operations.

Environment: Apache Kafka, Apache Flink, Spark Streaming, Hadoop, Apache Spark, Hive, NLP models, BERT, GPT, Power BI, Tableau, Plotly, Google BigQuery, Snowflake, Amazon Redshift, Apache Airflow, MLflow, Kubeflow, ARIMA, LSTM, Prophet, Docker, Kubernetes, REST, GraphQL, FastAPI, Terraform, Ansible.

Niva Bupa Health Insurance Company, Bangalore, India
Data Scientist / Machine Learning Engineer, June 2018 – Dec 2020

OBJECTIVE: Niva Bupa Health Insurance Company Limited is a prominent Indian health insurance provider that offers a comprehensive range of health insurance products. Performed data collection and data modelling, building algorithms and statistical models that identify patterns, relationships, and insights within data and generalize to new incoming data.

Key Responsibilities and Achievements:

• Developed and optimized machine learning models to predict patient health risks and enhance underwriting processes, using TensorFlow, PyTorch, and Scikit-learn for accurate risk assessment and pricing.

• Implemented NLP techniques to analyze customer feedback, improving claims processing and enhancing customer service interactions with sentiment analysis and text classification.

• Utilized big data technologies like Hadoop and Apache Spark to process and analyze vast health data, improving fraud detection and enabling data-driven decision-making for policy management.

• Built and deployed real-time data analytics solutions using Apache Kafka and Apache Flink for continuous monitoring of policyholder health status, ensuring timely intervention and improving customer satisfaction.

• Designed and optimized data pipelines with Apache Airflow for seamless integration of health data, enabling efficient processing for predictive modelling and personalized healthcare recommendations.

• Leveraged cloud platforms like AWS and Google Cloud to scale data processing and model deployment, enhancing the scalability and efficiency of health insurance solutions.

• Integrated machine learning models into the insurance platform using Flask and FastAPI for seamless access to real-time predictions and automated claims processing.

• Employed containerization with Docker and Kubernetes for consistent and scalable deployment of predictive models, ensuring smooth operation across multiple environments in the insurance ecosystem.

• Applied ensemble learning techniques, including XGBoost and random forests, to predict patient health outcomes and tailor insurance plans, improving risk mitigation and personalized healthcare services.

Environment: TensorFlow, PyTorch, Scikit-learn, NLP, Hadoop, Apache Spark, Apache Kafka, Apache Flink, Apache Airflow, AWS, Google Cloud, Docker, Kubernetes, XGBoost, random forests, Flask, FastAPI.
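The ensembling idea behind the XGBoost and random-forest work above can be shown with a minimal sketch: several model outputs are combined, by majority vote for class labels and by averaging for risk scores. The function names and inputs below are hypothetical, stdlib-only illustrations, not the production pipeline.

```python
from collections import Counter

def majority_vote(labels):
    """Combine class predictions from several models by majority vote."""
    return Counter(labels).most_common(1)[0][0]

def average_score(scores):
    """Combine continuous risk scores from several models by averaging."""
    return sum(scores) / len(scores)

# Three hypothetical models classify one applicant as high (1) or low (0) risk,
# and also emit a continuous risk score in [0, 1].
combined_label = majority_vote([1, 0, 1])        # majority says high risk
combined_score = average_score([0.2, 0.4, 0.6])  # roughly 0.4
```

Gradient-boosted trees and random forests apply this same combine-many-weak-learners principle, with weighted additive stages or bootstrap-sampled trees instead of a flat vote.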


