JWALA BHARATH MARRIPUDI
Irving, TX, ****3, USA +1-940-***-**** ************@*****.*** LinkedIn
SUMMARY
3+ years of experience in AI/ML and Data Engineering in healthcare and financial domains, designing and deploying scalable data pipelines, machine learning models, and cloud-native solutions across AWS, Azure, and GCP. Proficient in Python, PySpark, SQL, and Big Data frameworks (Hadoop, Spark, Kafka, Hive) to process large-scale datasets, build real-time streaming solutions, and enable data-driven insights. Skilled in ETL, MLOps, CI/CD, and data governance, with hands-on experience in Snowflake, Redshift, Delta Lake, and analytics platforms (Tableau, Power BI, QuickSight). Certified in AWS Data Engineer Associate and Databricks Data Engineer, with a track record of delivering production-grade AI/ML solutions, automating workflows, and ensuring compliance with data security and regulatory standards.
TECHNICAL SKILLS
Programming: Python, SQL, PySpark, Scala, Shell Scripting
Big Data Ecosystem: Hadoop, Spark (Core, SQL, Streaming), Hive, Kafka, HDFS, Airflow
Cloud Platforms: AWS (S3, EMR, Glue, Redshift, Lambda, Athena, Kinesis, SNS, CloudWatch), Azure (ADF, ADLS, Databricks, Synapse Analytics)
Databases & Warehousing: MySQL, SQL Server, Snowflake, PostgreSQL, Oracle, Teradata
ETL / Data Integration: AWS Glue, Azure Data Factory, Apache Airflow, Informatica, Talend
AI / Machine Learning: Scikit-learn, TensorFlow, PyTorch, MLflow, XGBoost, Feature Engineering, Model Deployment, MLOps, Hyperparameter Tuning
Data Modeling & Processing: Data Lake, Data Warehouse, Data Pipeline Design, Data Governance, Data Quality, Delta Lake, Parquet, Avro
CI/CD & DevOps Tools: Jenkins, Git, GitHub Actions, Docker, Kubernetes, Terraform
Visualization & Analytics: Tableau, Power BI, QuickSight, Matplotlib, Seaborn
Other Tools & Concepts: REST APIs, JSON, Agile/Scrum, SDLC, Performance Optimization, Distributed Computing
EXPERIENCE
BCBS USA
AI/ML Engineer July 2024 – Present
• Design and deploy scalable machine learning models for healthcare analytics, risk assessment, and member segmentation using Python, Scikit-learn, TensorFlow, and PyTorch in cloud environments.
• Develop and maintain end-to-end ML workflows using MLflow, Apache Airflow, and Docker/Kubernetes, ensuring reproducible model training, versioning, and automated deployment.
• Build and optimize data pipelines with PySpark, SQL, and Kafka to process large-scale claims, clinical, and operational datasets, ensuring data quality, integrity, and high availability.
• Implement MLOps best practices including CI/CD, model monitoring, and drift detection using GitHub Actions, Jenkins, and CloudWatch, enabling continuous improvement of AI/ML models.
• Collaborate with data engineers, analysts, and business stakeholders to translate complex healthcare business requirements into production-ready AI/ML solutions that enhance operational efficiency and member outcomes.
• Optimize model performance and explainability using XGBoost, SHAP, and advanced feature engineering, ensuring compliance with healthcare regulations and data governance standards.
Capital One TX, USA
Data Engineer (Contract) Apr 2023 – May 2024
• Designed and implemented scalable ETL pipelines using PySpark, Airflow, and AWS Glue to process and integrate high-volume transactional data from diverse banking systems into Snowflake and Redshift, improving data accessibility for analytics teams.
• Built and optimized data lakes and data warehouses on AWS S3 and Azure ADLS, leveraging Spark SQL, Hive, and Delta Lake to ensure high performance, data governance, and cost efficiency across cloud environments.
• Developed real-time streaming solutions using Kafka and Kinesis to enable near real-time insights for fraud detection and customer behavior analytics, reducing data latency by 40%.
• Collaborated with cross-functional teams to implement CI/CD pipelines (Jenkins, GitHub Actions, Docker, Kubernetes) for automated testing, deployment, and monitoring of data workflows, ensuring reliability and scalability across production environments.
UnitedHealth Group Pune, INDIA
Data Analyst April 2021 – June 2022
• Analyzed and interpreted large-scale healthcare datasets using SQL, Python, and PySpark to deliver actionable insights on patient outcomes, cost optimization, and operational efficiency across multiple business units.
• Developed and maintained interactive dashboards and visual reports in Tableau, Power BI, and QuickSight, enabling leadership to track key performance metrics and improve data-driven decision-making.
• Built automated ETL workflows using AWS Glue, Airflow, and Azure Data Factory to extract, transform, and load data from diverse clinical and claims systems into centralized Snowflake and Redshift repositories.
• Collaborated with data engineers and data scientists to ensure data quality, governance, and model-ready feature sets, leveraging Delta Lake, Parquet, and data validation frameworks to enhance reporting accuracy and model performance.
PROJECTS
SQL Injection Detection and Prevention System
• Developed a web-based application using Python, Flask, and MySQL to simulate SQL injection attacks and demonstrate how malicious queries exploit database vulnerabilities.
• Implemented detection and prevention techniques such as parameterized queries, input validation, and stored procedures, ensuring secure database interactions and compliance with OWASP best practices.
• Integrated real-time monitoring and logging mechanisms to identify suspicious query patterns and strengthen data security, ethical hacking awareness, and secure coding standards.
Major Project: Hand Gesture Controlled Robot
• Designed and developed a gesture-controlled robot using Arduino, accelerometer sensors, and Python, enabling intuitive control of robotic movement through real-time hand gestures.
• Integrated Bluetooth communication and microcontroller programming to achieve wireless control for automation tasks and operations in hazardous or remote environments.
• Implemented data processing and calibration algorithms to enhance gesture recognition accuracy and ensure smooth, responsive robotic performance.
Mini Project: Voice-Controlled Robot
• Developed a voice-enabled robot using Raspberry Pi, Arduino, and Python, allowing the robot to interpret and respond to natural language commands in real time.
• Integrated speech recognition APIs and wireless communication modules to enable hands-free control for healthcare assistance and home automation.
• Implemented command processing and action execution algorithms to ensure accurate, reliable, and efficient robot responses, enhancing usability and operational efficiency.
CERTIFICATIONS
• AWS Data Engineer Associate
• Databricks Certified Data Engineer Associate
EDUCATION
Campbellsville University –
Master’s, Computer Science GPA: 4.0
SCSVMV University –
Bachelor’s, Electronics & Communication Engineering GPA: 9.3