
Data Engineer Machine Learning

Location:
Los Angeles, CA
Posted:
April 21, 2025


Resume:

TEJAS SHETH

Los Angeles, CA +1-213-***-**** *****.*******@*****.*** LinkedIn GitHub

SUMMARY

Data Engineer with a strong background in data pipelines, cloud computing, and analytics. Proficient in Python, SQL, and AWS, with experience in building scalable ETL processes, real-time data processing, and database optimization. Passionate about leveraging data-driven solutions to optimize performance and drive insights.

TECHNICAL SKILLS

Languages & Libraries: Python, SQL, C++, Java, JavaScript

Data Engineering: Apache Spark, Hadoop, Airflow, Kafka, Pandas, NumPy

Databases: PostgreSQL, MySQL, MongoDB, SQLite, BigQuery

Cloud and DevOps: AWS (EC2, S3, Lambda, RDS), Docker, Kubernetes, Terraform

Tools & Technologies: Git, GitHub, Jira, Slack, Google Colab, TortoiseSVN, OpenCV

Visualization: Tableau, Power BI, Matplotlib, Seaborn

Machine Learning: Scikit-learn, TensorFlow, PyTorch

PROFESSIONAL EXPERIENCE

Data Engineer, Virtual Human Therapeutics Lab, USC Institute for Creative Technologies (Jan 2024 - Jan 2025)

Brave Mind: Refactored legacy software architecture, reducing system bugs by 40% and increasing performance efficiency by 25%

Developed AWS-based cloud storage and retrieval systems, optimizing data availability and reducing downtime

Battle Buddy: Engineered scalable iOS modules in Unity 3D with Python and C#, integrating long-press functionality via Unity's Input System and intuitive drag-and-drop interactions, improving UI responsiveness by 35%

Software Developer, Team Lead, Urban Future’s Data Core, USC Sol Price School of Public Policy (Aug 2023 - Dec 2023)

Led a 6-member team to construct large-scale web scraping pipelines in Python, increasing research data collection efficiency by 60%

Developed a high-performance backend infrastructure, ensuring reliable storage and retrieval (95% uptime) of large research datasets

Designed RESTful APIs to streamline data access, reducing data retrieval latency by 45%, and leveraged Python visualization libraries

Developed a custom HTML parsing system with BeautifulSoup4 and xml, boosting data extraction efficiency by 40%

Deployed AWS-based data pipelines, integrating EC2 and S3 for scalable research data processing, reducing computational costs by 30%

Data Analyst Intern, NSP Ites Pvt. Ltd (July 2021 - Jan 2022)

Designed Python-based web scraping tools, integrating AWS S3 to improve data collection efficiency.

Created interactive dashboards using Tableau to visualize key business insights.

Designed predictive models using machine learning algorithms, improving sales forecasting accuracy by 20%.

Engineered a scalable backend infrastructure using Python and Flask, processing 500,000+ data points daily with sub-200ms latency.

Built an automated Selenium-driven scraping solution, reducing manual data collection effort by 30%

FEATURED PROJECTS

Data Engineer, Real-time Motion Detection System (Aug 2022 - Dec 2022)

Created an advanced motion detection system utilizing Python, OpenCV, and AWS, enabling remote monitoring with 98% accuracy

Integrated AWS S3 for cloud-based storage, ensuring seamless accessibility, scalability and efficient data processing.

Machine Learning Research, Detection of SQL Injection using Reinforcement Learning (Aug 2022 - May 2023)

Led a 4-person team to design a Q-Learning-based Reinforcement Learning model, improving SQL injection detection by 30%

Constructed a feature extraction pipeline for SQL queries using Natural Language Processing (NLP) techniques

Achieved 95% accuracy in detecting and mitigating advanced SQL injection attacks

Curated a training dataset of 10,000+ SQL injection patterns for model training and evaluation

Published research presented at an IEEE conference

Software Developer, Reverse Image Querying (Jan 2022 - May 2022)

Devised a cosine similarity-based image retrieval model, raising image recognition accuracy to over 60%

Built an efficient query processing pipeline that accelerated image search, reducing retrieval time by 50%

Published research in the IJRASET journal (500+ views)

EDUCATION

Master of Science (M.Sc) in Computer Science, University of Southern California, Los Angeles, CA (GPA: 3.7)

Bachelor of Engineering (B.Engg) in Computer Engineering, University of Mumbai, Mumbai, IN (CGPA: 9.33/10.0)


