Data Engineer Intern

Location:
Irvine, CA
Posted:
April 07, 2025

Resume:

Aneesh Pasnoor

Robotics Engineer

*************@*****.***

479-***-****

Innovative Data Engineer with 4 years of experience building scalable ETL and real-time data pipelines across cloud environments (AWS, Azure). Proficient in Spark (Scala & PySpark), SQL, and data modeling, with a strong foundation in data architecture and analytics. Proven success in developing big data platforms that process multi-terabyte workloads and power executive dashboards.

Work Experience

Robotics Engineer

Oakley

-

Built scalable web scraping pipelines using Selenium to extract data from vendor and industry websites. Processed, cleaned, and structured unstructured scraped data using Python (Pandas, NumPy) for real-time analytics, achieving a 98% accuracy rate.

Designed validation and quality control mechanisms to ensure consistent and reliable datasets for the manufacturing platform that reduced data inconsistencies by 35%. Worked with product design and engineering teams to define data requirements and ensure seamless delivery. Developed data dashboards using Tableau and Dash to visualize material trends and supply chain metrics. Integrated APIs and automation scripts to keep material databases updated and production ready. Resolved system bugs and documented data workflows to ensure maintainability and reproducibility. Reduced bug resolution time by 50% by documenting reproducible data flows. Used Git for version control and PowerShell for deployment automation, improving delivery speed by 20%.

Oct 2024 - Present

Data Engineer Intern

SUNY POLYTECHNIC INSTITUTE

-

Engineered robust ETL/ELT pipelines across AWS and GCP, integrating financial and operational data. Built real-time and batch processing pipelines using Databricks and DBT to support RCM and actuarial workflows. Provisioned

Aug 2021 - Apr 2023

Data Engineer

Fidelity Investments India

-

Designed Power BI dashboards with DAX for visualizing patient flow and operational metrics. Implemented CI/CD pipelines and provisioned secure resources using Infrastructure-as-Code tools. Led efforts to consolidate streaming and batch data into unified repositories for analysis. Enhanced the performance of long-running queries by indexing and optimizing complex joins. Migrated a 2TB data warehouse to AWS Redshift, cutting infrastructure cost

Feb 2020 - Mar 2021

Core Skills

Cloud Platforms: AWS (S3, Redshift, RDS), GCP (BigQuery, GCS, Dataflow, Dataproc, Cloud Composer), Snowflake, Apache Spark, PySpark, Apache Airflow, Azure Data Factory, Azure Monitor, AWS Glue, SQL (T-SQL, PL/SQL, MySQL), Python, Scala, PowerShell, C#, .NET, JavaScript, Azure Synapse, Azure Logic Apps, REST APIs, Snowflake Schema, Data Warehousing, Tableau, Dash, CloudWatch, Excel, Google Sheets, Splunk, GitHub Actions, Jenkins, Git, Docker, data processing systems, distributed computing, data transformation, Data Infrastructure, performance tuning, communication, Power BI

Education

SUNY POLYTECHNIC INSTITUTE

-

Master’s Degree, Computer and Information Science

Aug 2021 - Jul 2023

Certificates

AWS Certified Cloud Practitioner.

Google Data Engineer and Cloud Architect.

Big Data with Apache Spark and Python.


