
Machine Learning Data Engineer

Location:
Houston, TX, 77002
Posted:
June 25, 2025


Resume:

Sheethal Bandari

Houston, Texas +1-713-***-**** *******.********@*****.*** https://www.linkedin.com/in/bandarisheethal

SUMMARY

Highly motivated and detail-oriented data enthusiast with a master’s degree in data science focused on data engineering and analytics, and hands-on experience as a Data Engineer Intern. Proficient in data extraction, cleaning, transformation, and analysis, with a strong foundation in SQL, Python, and data visualization. Eager to apply technical skills and contribute to data-driven decision-making in a challenging and innovative environment. Passionate about leveraging data to solve complex problems and drive business insights.

TECHNICAL SKILLS

• Programming Languages: Python (Pandas, NumPy, Scikit-learn), R, SQL, Java, C++, MATLAB

• Databases: MySQL, SQL Server, MongoDB

• Cloud Platforms: AWS, Azure, Google Cloud

• Big Data Technologies: Hadoop, Spark

• Machine Learning: TensorFlow, PyTorch, Scikit-learn, NLTK, spaCy, Transformers

• Data Visualization: Tableau, Power BI, Matplotlib

• Web Frameworks: Django, Flask

• Project Management & Collaboration: Jira, Confluence, Agile, SDLC

• Other: Git, OpenCV, SAS, Data Mining, Statistical Analysis, Data Manipulation, Data Cleaning and Preprocessing, Deployment, Testing and Debugging

PROFESSIONAL EXPERIENCE

• Hydoodle Technologies Pvt. Ltd., India – Data-Driven Decision Support System JUN 2023 - DEC 2023

• Engineered scalable ETL pipelines using Python (Pandas, NumPy) & Apache Airflow, reducing data ingestion time by 20%.

• Automated cloud infrastructure on AWS with Terraform, decreasing deployment time by 25%.

• Designed PostgreSQL data models optimized for analytical workloads, improving query performance by 15%.

• Developed interactive dashboards in Tableau & Power BI for data-driven decision-making.

• Implemented machine learning models (Scikit-learn, TensorFlow) in Python, improving prediction accuracy by 10%.

• Utilized Git for version control & Agile methodologies with Jira for project management.

• Applied data mining & statistical analysis with R & SAS to identify key business trends.

• Implemented data cleaning and preprocessing techniques using Python & SQL for data accuracy.

• Collaborated with cross-functional teams to deliver data solutions.

• Ensured data accuracy & consistency for reliable analysis and reporting.

CERTIFICATIONS

• Cybersecurity Internship Oct – Dec 2021

• AI - ML Virtual Internship Mar – May 2022

• Data Analytics Virtual Internship May – Jul 2023

• Forage, Commonwealth Bank – Introduction to Data Science Job Simulation May 2025

PROJECTS

COVID-19 Impact on Student Attendance and Campus Activities

Summary:

• Developed a dashboard to track COVID-19’s impact on campus operations.

• Pulled daily COVID statistics from external APIs and merged them with college attendance and event data to visualize shifts in campus engagement.

• Provided reports to help college departments plan safer on-campus activities.

• Skills/Technologies: Python, APIs, Streamlit, Plotly

Sentiment Classification using Bi-LSTM, Bi-GRU and Attention based Models

Summary:

• Implemented and compared deep neural network models (Bi-LSTM, Bi-GRU, Attention-Based RNNs) for binary sentiment classification on Yelp and Twitter datasets using PyTorch.

• Enabled model interpretability by adding soft attention mechanisms to highlight significant words.

• Preprocessed large volumes of text data, implemented tokenization using Hugging Face Longformer, and addressed class imbalance.

• Conducted model evaluation with Accuracy, Precision, Recall, and F1-Score, and visualized model performance using confusion matrices and attention heatmaps.

• Employed extensive hyperparameter tuning and sequence length analysis to improve model performance.

• Skills/Technologies: Python, NLP, Scikit-Learn, Hugging Face Transformers, Deep Learning, PyTorch

Network Intrusion Detection System Using Ensemble Methods and Deep Neural Networks

Summary:

• Developed a data-driven intrusion detection system by integrating ensemble models (Random Forests, Decision Trees, Extra Trees) with ANNs using Python.

• Implemented a stacking ensemble architecture to enhance classification accuracy and reduce false positives, achieving up to a 98% detection rate.

• Designed and trained a feedforward ANN using TensorFlow, optimizing performance with ReLU and sigmoid activations and the Adam optimizer.

• Published a research paper in ICEAT 2023, IEEE 5th batch, and received a Certificate of Presentation at the 4th International Conference on Engineering and Advancement in Technology.

• Skills/Technologies: Python, Scikit-Learn, TensorFlow, Keras, Pandas, PCAP (Packet Capture), ANN, Ensemble Models

Ride On (Taxi Management System)

Summary:

• Designed and managed the database architecture for a full-stack Taxi Management System, ensuring efficient ride booking, driver assignment, customer registration, and payment tracking.

• Developed and optimized complex SQL queries to support key business insights, including driver earnings reports, ride activity analytics, and identification of inactive drivers.

• Ensured data integrity, scalability, and performance through robust schema design and normalization practices.

• Integrated third-party services such as Twilio API for SMS notifications and OpenStreetMap for live location tracking to enhance the system's real-time capabilities.

• Collaborated with a cross-functional team of 10 members to align database functionality with frontend and backend operations.

• Skills/Technologies: SQL, Database Design, Database Management, Twilio API, OpenStreetMap API, Data Modeling, Data Normalization, Team Collaboration

EDUCATION

• University of Houston, Houston, USA MAY 2025 Master of Science in Data Science

• Sreenidhi Institute of Science & Technology, Hyderabad, INDIA JUN 2023 Bachelor of Technology in Electronics and Computer Engineering


