Data Engineer Machine Learning

Location:
McKinney, TX
Posted:
October 28, 2025

Resume:

Vennela Rayi

+1-469-***-**** • *************@*****.*** • Richardson, Texas

A highly skilled Data Engineer and Analyst with practical experience in designing and optimizing ETL workflows, data models, and machine learning applications. Demonstrated expertise in utilizing Databricks, SQL, Python, Tableau, and Power BI to streamline data integration, improve decision-making, and enhance operational efficiency. Successfully implemented Star and Snowflake schema models, developed predictive machine learning solutions, and built dynamic dashboards for actionable business insights. With a consistent record of enhancing data processing efficiency and delivering strategic insights that drive product innovation and marketing success, I seek opportunities as a Data Engineer or Data Analyst to further contribute in a dynamic, growth-oriented environment.

SKILLS

Programming Languages: C, C++, Java, R, Python (NumPy, Pandas, Matplotlib), SQL, SAS

Databases: MySQL, MS SQL Server, MongoDB (NoSQL), Hadoop (Hive, HBase)

Reporting: MS Excel, Power BI, Tableau, Google Analytics

Project Management: Agile, Waterfall, Risk, Timeline and Budget assessment

Web Technologies: HTML5, CSS3, jQuery

Cloud & Tools: AWS, Azure, Databricks, Git, JIRA, Eclipse IDE

WORK EXPERIENCE

Data Engineer Aetna, Dallas June 2025 – Current

•Conducted analysis on 10M+ health insurance claims using Databricks, SQL, and Python, leveraging advanced data preprocessing and feature engineering to extract actionable insights for product and marketing strategies.

•Developed a machine learning model in Databricks using XGBoost and LSTM, achieving 80% recall in predicting behavioral health issues based on patient visit sequences and diagnoses.

•Optimized ETL workflows in Databricks, improving data processing efficiency and enabling faster decision-making for cost management and customer solutions.

•Provided strategic insights from predictive analytics, influencing product development, cost optimization, and targeted marketing initiatives.

Data Analyst Intern Aldi, Dallas Aug 2024 – Dec 2024

•Designed and implemented Star and Snowflake schema models, optimizing query performance by 30% and improving data retrieval efficiency.

•Developed ETL pipelines in SQL & Databricks, ensuring seamless data integration and transformation.

•Built Tableau dashboards to track supply chain KPIs, enhancing reporting speed by 50% and improving stakeholder decision-making.

•Collaborated with teams to translate business needs into insights using SQL, Python & Tableau, leading to a 20% improvement in process efficiency.

Business Analyst Reliance, India Sep 2022 – Aug 2023

•Automated inventory management processes using Python and SQL, optimizing stock replenishment and reducing manual effort, which improved operational efficiency by 20%.

•Built interactive dashboards using Power BI to visualize production metrics, sales performance, and revenue growth across multiple business units, enabling data-driven decision-making.

•Designed a dynamic rebate management tool to optimize customer incentives that enabled data-driven simulations, increasing profitability by 15%.

•Collaborated with cross-functional teams including IT, finance, and sales to enhance operational efficiency.

PROJECTS

Customer Segmentation and Predictive Analytics Databricks, Python, Pandas, Scikit-learn, Power BI, Excel

•Conducted a comprehensive customer segmentation analysis on a retail dataset with over 1 million transactions across multiple cities. Retention strategies informed by the analysis improved customer retention by 20%.

•Performed data cleaning and preprocessing, including handling missing values, scaling, and encoding, so that the full dataset was ready for ML modeling.

•Built and evaluated machine learning models like Random Forests for predicting high-spending customer behavior. Post-analysis, high-value transactions increased by 30%.

•Designed interactive Tableau dashboards to visualize customer segments, spending trends, and CLV distribution. Insights from dashboards led to a 15% improvement in marketing ROI.

Churn Detection of Food Ordering Platforms MySQL, Excel, Python, NLP, Keras, TensorFlow, Tableau

•Extracted user tweets about UberEATS, Grubhub, and DoorDash using the Tweepy library, applying text cleaning, stemming, lemmatization, and vectorization techniques like Word2Vec, Bag-of-Words, and TF-IDF.

•Developed an artificial neural network with rectifier and sigmoid activation functions, achieving 79% accuracy, later boosted to 85% using grid search and dropout for regularization.

•Identified key insights, revealing that UberEATS had the lowest churn rate while DoorDash had the highest.

Health Care Provider Tool Researcher SQL, Excel, R, Power BI, Python, CSS, HTML

•Developed a tool for recommending healthcare providers based on ratings, treatment, and communication skills, increasing patient satisfaction by 25% with Perfometrics.

•Segmented a dataset of 1,000,000 providers using customer segmentation and K-means clustering, improving recommendation accuracy by 40%.

•Integrated a machine learning model with a webpage using HTML, CSS, and Django, resulting in an 80% increase in website traffic and user engagement.

Sleep Efficiency Prediction SQL, Excel, R, Power BI, Python, CSS, HTML

•Developed a predictive model for sleep efficiency in R, achieving 85% accuracy, and analyzed lifestyle and physiological data to identify sleep disorder risks.

•Provided actionable insights into lifestyle impacts on sleep quality, improving health decision-making.

•Utilized SQL for data extraction and preprocessing, ensuring data integrity and model accuracy.

•Conducted statistical analysis to identify factors influencing sleep patterns, supporting workplace wellness initiatives.

•Designed an interactive Power BI dashboard to visualize sleep trends and correlations, aiding in data-driven wellness programs.

EDUCATION

The University of Texas at Dallas, Dallas, Texas Aug 2023 – May 2025 Master of Science, Business Analytics

Rajiv Gandhi University of Knowledge and Technologies, India Aug 2019 – May 2023 Bachelor of Technology in Civil Engineering

CERTIFICATIONS

•Salesforce Certified Administrator

•Snowflake Hands-on Essentials

•Tableau Essentials

ADDITIONAL INFORMATION

•Strong problem-solving and analytical skills with a passion for data-driven insights

•Excellent communication and teamwork abilities

•Passion for leveraging analytics to optimize business performance

•Experience in mentoring junior analysts and providing knowledge transfer sessions
