EXPERIENCE
Data Analyst San Diego, CA
Enhanced operational efficiency by designing and implementing automated data extraction workflows using n8n, integrating both vision-based and HTML-based scraping agents, accumulating detailed information for more than 300,000 products.
Used ScrapingBee to generate high-quality webpage screenshots and integrated Gemini AI for precise, structured data extraction from visual content, significantly boosting extraction accuracy and reliability while reducing manual data collection efforts by 70%.
Developed validation workflows within n8n to enforce data integrity and consistency across all scraped datasets, supporting faster database growth through automated scraping processes.
Architected and maintained a scalable ETL pipeline to ingest, transform, and load scraped product data into a PostgreSQL data warehouse, applying data normalization, partitioning, and indexing strategies to improve query performance by 50%.
SDSU’s Digital Innovation Lab Sept 2024 – Dec 2024
Research Data Analyst San Diego, CA
Conducted data quality reviews and developed a pipeline using Python and Selenium to automate leadership engagement data extraction, collecting over 10,000 posts with 99% accuracy.
Ensured data integrity through analytical cross-referencing and deduplication, validating large-scale data for in-depth analysis.
Designed Tableau dashboards to monitor key performance indicators and trends, leveraging QA checks to enhance data accuracy.
A2Z POS Jan 2024 – July 2024
Data Analytics Intern San Diego, CA
Analyzed service costs using AWS Cost Explorer API and SQL, achieving a 15% reduction in expenses and improving budget forecasting.
Designed a live Power BI dashboard with real-time MySQL updates, enhancing stakeholder decision-making.
Built a PowerApps application with integrated Power Automate workflows, streamlining task tracking, project mapping, and resource management, reducing manual effort by 30% and optimizing workforce allocation.
Developed data pipelines in SharePoint for seamless synchronization across applications, collaborating with teams to refine workflows and dashboards, resulting in a 20% efficiency boost.
Wizeal July 2021 – July 2022
Data Analyst Mumbai, India
Developed and maintained ETL pipelines integrating data from security equipment logs, scheduling systems, and power equipment usage metrics using Python, SQL, and API integrations, ensuring timely and accurate datasets for analysis.
Analyzed equipment utilization and manpower efficiency, identifying cost-saving opportunities and operational bottlenecks, leading to a ~15% improvement in resource allocation; optimized MySQL and Oracle databases to support data-driven decisions.
Built Power BI KPI dashboards, enhancing visibility into service performance and client satisfaction rates.
EDUCATION
Master of Science in Information Systems: San Diego State University San Diego, CA
Bachelor of Engineering in Electronics & Telecommunication: University of Pune Pune, MH, India
PROJECTS
Online Retail Sales Clustering Model Sept 2024
Leveraged the UCI “Online Retail” dataset (500K+ records), performed data cleaning (addressing missing values, duplicates, and outliers), and engineered features (e.g., TotalPrice) to segment customers based on purchasing behaviors.
Performed exploratory data analysis with Python (pandas, NumPy, matplotlib, seaborn, plotly) to identify sales trends, customer distributions, and high-value segments, and applied K-Means to reveal peak sales hours and top customer cohorts. [GitHub]
Data Analytics Project (Tokyo Olympics) July 2024
Developed an end-to-end data pipeline leveraging Azure Data Factory, Databricks (Apache Spark), and Synapse Analytics to process comprehensive datasets covering 11,000 athletes, 47 disciplines, and 743 teams from the Tokyo Olympics.
Conducted SQL-based analytics to analyze medal counts and participation rates, creating interactive dashboards in Power BI that enhanced decision-making efficiency by 25%. Optimized data pipelines to reduce data latency by 30%. [GitHub]
Sales Insights Project April 2024
Built a dynamic sales dashboard in Power BI connected to MySQL, delivering real-time insights for Atliq Hardware and improving data accuracy by 20% through optimized ETL processes.
Deployed interactive dashboards to the cloud, giving stakeholders direct access to KPI insights used to refine sales strategies and improve collaboration. [GitHub]
SKILLS
Analysis & Visualization Tools: Power BI, Tableau, Excel, PowerApps, Power Automate, SharePoint, n8n
Database Management Systems: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, NoSQL, AWS, Microsoft Azure
Programming Technologies: Python (Pandas, NumPy, Matplotlib), R, SQL, APIs