Bhadrappa Molgi
+1-818-***-**** | ****************@*****.*** | linkedin.com/in/bharadwaj-m-4b0b54113 | San Diego, CA 92130
Summary
Principal Data/AI Platform Engineer with 14+ years of experience in data engineering, AI/ML, and large-scale cloud platforms. Proven expertise in designing and delivering petabyte-scale data ecosystems, modernizing legacy pipelines, and enabling AI-driven decision-making across healthcare, retail, manufacturing, and energy domains. Skilled in end-to-end ML lifecycle management, from data ingestion and feature engineering to model training, deployment, and monitoring, using Vertex AI, TensorFlow, and Scikit-learn. Recognized for building resilient, cost-optimized, and future-proof architectures on Google Cloud Platform, delivering solutions that cut runtime and cost while improving scalability and reliability. Adept at leading migrations from SAP, Oracle, and Hadoop to modern GCP stacks (BigQuery, Dataflow, Dataproc, DBT) and operationalizing GenAI, NLP, and predictive analytics pipelines. Known for driving cross-functional collaboration, mentoring teams, and implementing best practices in CI/CD, testing, and MLOps.
Skills
Cloud Platforms: Google Cloud Platform (BigQuery, Dataflow, Dataproc, Cloud Run, Pub/Sub, Kubernetes Engine, Dataplex, Cloud Composer, BigTable), AWS (S3, Redshift, SageMaker, Lambda, EMR), Azure (Synapse, Azure ML, Data Factory)
Data Engineering: DBT (BigQuery/Postgres), Apache Beam (Java/Python), Spark (batch & streaming), Hive, Pig, Delta Lake, Iceberg, Airflow/Cloud Composer, Airbyte, Snowflake, Databricks, Kafka Connect, NiFi
AI/ML: Vertex AI, Generative AI/ML on GCP, TensorFlow, PyTorch, Scikit-learn, XGBoost, LightGBM, CatBoost, MLflow, Kubeflow, BigQuery ML, H2O.ai, Automated Model Training/Tuning, MLOps (CI/CD, monitoring, retraining)
Data Science & Visualization: Looker, Looker Studio, Google Analytics, GA4 Integrations, DuckDB, Power BI, Tableau, Qlik, Superset, Matplotlib, Seaborn, Plotly, D3.js
Data Streaming & ETL: Kafka, Flink, Spark Structured Streaming, Reverse ETL (DBT, GraphQL), Kinesis, Pub/Sub, NiFi, Talend, Informatica, Stitch, Fivetran
Other Tools: Presto/Trino, Great Expectations, Forseti Security, Terraform, Ansible, Jenkins, GitHub Actions, Docker, Kubernetes, Helm, Datadog, Prometheus, Grafana, CI/CD pipelines
Experience
Atomic Cloud LLC Feb 2023 – Present
Principal Data & AI/ML Engineer Remote
– Architected and delivered cloud-native data platforms for clients migrating from on-prem systems to GCP.
– Built end-to-end ingestion frameworks supporting structured/unstructured data from APIs, files, and databases at scale.
– Operationalized GenAI/ML pipelines on Vertex AI, including anomaly detection, customer segmentation, and forecasting.
– Migrated legacy ETL pipelines into DBT and GraphQL reverse ETL, reducing latency for downstream analytics.
– Designed cost-optimized architectures that cut query runtime by 40% while maintaining scalability.
– Partnered with cross-functional teams to align data strategy with business goals across healthcare, retail, and finance domains.
– Mentored junior engineers on CI/CD, testing, and ML pipeline automation.
Environment: GCP (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Run, Composer, GKE), DBT, GraphQL, Vertex AI, Python, Java, Docker, Kubernetes.
Cloud Karma Dec 2021 – Feb 2023
Principal Data Engineer Remote
– Designed and deployed customer analytics and data science solutions for Wayfair (e-commerce) and Driven Brands (automotive services).
– Built predictive ML models for churn reduction, personalized recommendations, and sales forecasting using Spark MLlib & BigQuery ML.
– Migrated legacy pipelines into modern GCP-based architectures, improving pipeline efficiency and scalability.
– Delivered interactive Looker dashboards enabling real-time visibility into customer journeys and KPIs.
– Collaborated with data scientists to bring models into production using MLOps best practices.
– Led workshops to train business teams on interpreting AI-driven insights for decision-making.
– Reduced analytics query time by 55% while cutting infrastructure cost through optimization.
Environment: GCP (BigQuery, Dataflow, Dataproc), Spark, MLlib, BigQuery ML, Looker Studio, Airflow, Python, SQL.
Tech Bridge Partners Dec 2020 – Dec 2021
Senior AI/ML Engineer / Principal Data Engineer Remote
– Supported Kohl’s retail operations by engineering real-time data pipelines for mobile apps and in-store devices.
– Streamlined ingestion into MongoDB and Cloud SQL, enabling near real-time reporting for store operations.
– Integrated GA4 with Looker Studio for advanced marketing attribution, improving ROI tracking.
– Built pipelines to support demand forecasting models, optimizing inventory planning and reducing stockouts.
– Partnered with analytics teams to design data marts tailored to retail business needs.
– Enhanced monitoring and alerting frameworks, improving SLA adherence by 25%.
Environment: GCP (Dataflow, Pub/Sub, BigQuery), Looker Studio, GA4, MongoDB, Cloud SQL, Airflow, Python.
SunSoft Technologies Inc Jun 2020 – Dec 2020
Senior Staff Data Engineer Remote
– Partnered with Premier Inc. (healthcare) to ingest patient-level data for predictive healthcare analytics.
– Built pipelines using CDC (Change Data Capture) to process sensitive healthcare data in compliance with HIPAA standards.
– Enabled risk scoring models for patient outcomes by integrating clinical and operational data.
– Deployed Forseti Security frameworks for cloud data governance and compliance monitoring.
– Delivered data models supporting executive dashboards for clinical decision-making.
Environment: GCP (BigQuery, Dataflow), CDC, Forseti Security, Python, SQL.
SunSoft Technologies Inc Jan 2016 – Jun 2020
Staff Data Engineer Houston, TX
– Worked with Schlumberger (oilfield services) to design large-scale Hadoop and GCP-based data lakes.
– Migrated multi-terabyte workloads into Dataproc & BigQuery, enabling faster analytics for drilling operations.
– Designed time-series models for predictive equipment maintenance, reducing downtime by 18%.
– Developed ingestion frameworks handling billions of streaming records per day.
– Collaborated with geoscience teams to support advanced reservoir analytics.
Environment: Hadoop, GCP (Dataproc, BigQuery, Composer), Spark, Airflow, Python, Java, SQL.
Bell Info Solutions Jun 2015 – Jan 2016
Senior Data Engineer Minneapolis, MN
– Supported UnitedHealth Group (healthcare) by modernizing legacy analytics systems.
– Migrated SAS-based pipelines into Python/Scikit-learn workflows, reducing licensing costs and improving model flexibility.
– Partnered with actuaries and data scientists to enhance population health and claims analytics.
– Designed predictive pipelines for fraud detection and member risk scoring.
Environment: SAS, Python, Scikit-learn.
Virginia Polytechnic Institute and State University Dec 2014 – May 2015
Graduate Assistant Blacksburg, VA
– Conducted research in operations research, simulation, and ML for logistics and scheduling.
– Developed predictive models improving efficiency in supply chain resource allocation.
– Assisted in teaching graduate courses in optimization and data analytics.
Environment: R, Python, MATLAB, SQL.
Michelin Jun 2011 – Aug 2014
Senior Data Analyst Chennai, India
– Analyzed manufacturing machine data to develop predictive maintenance models, reducing downtime.
– Automated reporting pipelines to deliver insights on operational performance.
– Partnered with production engineers to optimize processes using data-driven decision-making.
Environment: SQL, Excel, Tableau, Python.
Michelin Jan 2012 – Aug 2012
Technical Analyst Greenville, SC
– Supported US operations by designing reports and KPIs for factory operations.
– Conducted data validation and provided actionable insights to plant managers.
Environment: SQL, Excel, VBA, Python.
Education
Virginia Tech 2014–2015
M.S., Industrial & Systems Engineering (Operations Research)
Vellore Institute of Technology 2007–2011
B.Tech, Electrical & Electronics Engineering