Venkata Subbareddy Lebaku
+1-773-***-**** ***********************@*****.*** linkedin.com/in/lvsubbareddy
Summary
Results-driven Data Engineer with 2+ years of experience designing, building, and optimizing large-scale data pipelines, cloud-based data platforms, and ETL workflows. Skilled in SQL, Python, Spark, Databricks, AWS, and Power BI, with expertise in data modeling, infrastructure automation, and real-time analytics. Adept at collaborating with cross-functional teams to deliver scalable, secure, and cost-optimized data solutions that drive business insights across energy trading, supply chain, and commercial analytics domains.
Education
Webster University, San Antonio, TX Aug 2023 – Aug 2025
Master of Arts in Information Technology and Management – Coursework: Data Visualization CGPA: 3.54/4
New Horizon College of Engineering, Bangalore, India Aug 2019 – May 2023
Bachelor of Science in Computer Science and Engineering – Coursework: Python, Computer Networks, DBMS CGPA: 3.52/4
Technical Skills
• Programming & Query Languages: Python, Java, Scala, SQL, T-SQL, DAX
• Data Engineering & ETL Tools: Databricks, Azure Data Factory, Spark, Delta Lake, Kafka, Snowflake, Synapse Analytics
• Databases & Warehousing: SQL Server, MySQL, PostgreSQL, Google BigQuery
• Analytics & BI: Power BI (DAX), Tableau, Scikit-learn (feature engineering, model building, evaluation)
• DevOps & Collaboration Platforms: Azure DevOps (CI/CD), GitHub, JIRA, Confluence, Microsoft Teams
• Core Competencies & Soft Skills: Data quality validation, pipeline debugging, anomaly detection, stakeholder communication, problem-solving, adaptability, agile delivery, teamwork, documentation
Experience
Data Engineer Aug 2023 – Present
Oracle – USA Remote
• Designed and automated ETL pipelines integrating ERP, CRM, and market data into a cloud-based data warehouse, improving data availability for trading and analytics teams.
• Built and optimized SQL queries, Spark jobs, and Databricks workflows to process structured and unstructured datasets, reducing pipeline latency by 30%.
• Developed Power BI dashboards and data models to support P&L monitoring, compliance metrics, and logistics KPIs.
• Implemented data governance and quality checks with automated alerts for missing, duplicate, or delayed data.
• Partnered with traders, risk managers, and data scientists to design scalable data products supporting predictive analytics and forecasting models.
• Enhanced pipeline monitoring and incident resolution by implementing observability solutions with logging and alerts.
Data Engineer Intern Jan 2023 – Jun 2023
Oracle – India Bangalore, India
• Automated procurement workflows with SQL/ETL scripts, cutting errors by 30% and reducing processing time by 20%.
• Built Tableau dashboards for vendor performance, inventory aging, and shipment monitoring, improving visibility across supply chain operations.
• Conducted root-cause analysis on delayed shipments, recommending layout changes that reduced dwell times.
• Supported Agile delivery by collaborating with product owners, developers, and engineers in sprint planning and backlog grooming.
Projects
Zillow House Value Prediction SQL, Python, Power BI
• Designed ETL pipelines to extract and preprocess housing datasets.
• Developed ML models in Python that achieved 90% accuracy in price prediction.
• Built Power BI dashboards for interactive exploration of geographic price trends.
Rice Leaf Disease Detection TensorFlow, OpenCV, Convolutional Neural Networks (CNN)
• Developed a CNN model to detect rice leaf diseases from images with over 92% accuracy.
• Used OpenCV for real-time image processing and segmentation.
• Applied the model in an agricultural context to improve early disease diagnosis.
AI-Powered Delivery Route Optimization Python, Pandas, GeoPandas, Matplotlib, Databricks
• Conducted logistics and delivery data analysis in Databricks to pinpoint regions with frequent delays.
• Developed and implemented Python optimization scripts to design more efficient delivery routes.
• Applied geospatial analysis through GeoPandas and generated visualizations using Matplotlib.
Certifications
• SQL – Coursera
• Microsoft Power BI Data Analyst Associate (PL-300)
Involvements and Achievements
• Led 10+ social service campaigns with the LEO Club, mobilizing 200+ volunteers to execute community initiatives.
• Recognized for leadership and team coordination across multiple cross-functional events.