Data Engineer

Location:

San Diego, CA

Posted:

November 26, 2024

Resume:

VEDIKA SHRIKANT NALAWADE

San Diego, CA +1-619-***-**** ************@*****.*** linkedin.com/in/vedikanalawade/

EDUCATION

San Diego State University August 2022 – August 2024

Master of Science in Computational Science – Data Science San Diego, CA

Mumbai University August 2018 – June 2022

Bachelor of Engineering in Computer Engineering Mumbai, India

SKILLS

• Programming & Data Analysis: Python (Pandas, NumPy, TensorFlow), SQL, R

• Statistical Methods: Regression, clustering, anomaly detection, Machine Learning (supervised/unsupervised)

• Data Modeling & Transformation: Analytical data models, data integration, ETL pipeline development

• BI Tools: Grafana, Alteryx, Tableau, Power BI

• Big Data & ETL: Apache Spark, Hive, PySpark, scalable data frameworks

• Cloud & Data Services: AWS (S3, EC2, Glue, Redshift, Lambda), Microsoft Azure

• Data Pipelines & Automation: Apache Airflow

• Databases: SQL Server, MySQL, Oracle, PostgreSQL, MongoDB, NoSQL, REST APIs

• Environments: Linux, shell scripting, distributed systems

EXPERIENCE

San Diego State University

Volunteer Data Engineer & Analyst October 2024 – Present

• Developed & deployed data pipelines, improving data integrity for predictive modeling & enabling data-driven research insights.

• Collaborated with cross-functional teams to apply advanced statistical methods and ML models to enhance research outcomes.

• Implemented monitoring and alerting systems using Grafana to ensure seamless data pipeline operations and system reliability.

Graduate Teaching Assistant January 2024 – August 2024

• Led courses on Machine Learning, covering neural networks, supervised/unsupervised learning, and computer vision.

• Provided guidance on practical projects and applications, driving student success in data-driven Machine Learning models.

Research Assistant - Machine Learning Engineer January 2023 – March 2024

• Conducted monthly infrared spectrum analyses using IQmol and Q-Chem, achieving a 95% accuracy rate.

• Developed Python-based machine learning models, improving spectroscopic data interpretation by 20%.

• Processed 500GB of data annually with advanced ML and statistical techniques, maintaining an error rate under 5%.

Udyog Mart

Data Analyst August 2021 – June 2022

• Built Apache Spark ETL pipelines, reducing data processing times by 50% and enabling real-time analytics.

• Optimized ML models and sensor analytics, cutting downtime by 25% and improving predictive maintenance.

• Automated workflows with Alteryx, PostgreSQL, and AWS Redshift, boosting throughput by 35% and improving reliability.

Servify

Data Engineer Intern January 2021 – June 2021

• Built scalable Python and SQL ETL pipelines utilizing Hive, Kafka, PySpark, and AWS Redshift for large-scale data processing.

• Enhanced pipeline performance by 75% & data precision by 80% through integration with MongoDB, PostgreSQL, & AWS Redshift.

• Deployed Apache Airflow to automate workflows, boosting scalability and UAT efficiency.

RESEARCH & PUBLICATIONS

Machine Learning-Based Yield Prediction Python, TensorFlow, PyTorch, Power BI

• Developed ML models (GRU, Hybrid RNN-RF-XGBoost, LSTM), improving prediction accuracy by 25% over ARIMA/SARIMA.

• Processed and analyzed 1M+ data points, reducing forecast errors by 20-25% and enhancing agricultural planning efficiency.

Plastic Detection and Classification using Deep Learning CNN, Python, TensorFlow, Keras, OpenCV

• Engineered a CNN model with VGG-16, achieving 92% classification accuracy for plastic detection.

• Optimized GPU acceleration, reducing training time by 40% & increasing robustness by 15% through extensive testing.

PROJECTS

Interactive Sales Growth Dashboard Tableau, SQL

• Designed Tableau dashboards analyzing $733K in sales, $93K in profit, and 1.6K orders, driving 20% YOY growth.

• Conducted SQL-based customer behavior analysis, increasing retention by 15% and total orders by 28%.

• Created dynamic visualizations for profit/loss trends, enabling 25% faster decision-making.

Cloud-Driven Revenue Insights Platform PostgreSQL, Power BI, Python, ETL

• Developed an end-to-end Power BI dashboard integrating multi-source PostgreSQL data, increasing room sales by 15%.

• Streamlined Python ETL workflows, achieving a 40% boost in data retrieval speed and reducing errors by 25%.

• Applied predictive statistical models to optimize pricing strategies, resulting in 10% revenue growth per quarter.

Customer Segmentation and Predictive Analytics Engine PostgreSQL, R, Python, AWS SageMaker

• Conducted advanced segmentation analysis using SQL, R, and Python, improving customer retention by 20%.

• Optimized ML models (Random Forest, K-Means) on AWS SageMaker to enhance targeting and profiling accuracy.

• Designed scalable data pipelines and Power BI dashboards, raising customer satisfaction by 15% through actionable insights.
