
Big Data Processing

Location:
Austin, TX
Posted:
April 22, 2024


Bhargav Kirit Vadgama ad466v@r.postjobfree.com 201-***-**** www.linkedin.com/in/bhargav-vadgama

EDUCATION:

Stevens Institute of Technology – Hoboken, USA August 2019 – May 2021
Master of Science, Engineering Management

University of Mumbai – Mumbai, India August 2013 – May 2017
Bachelor of Engineering, Electronics & Telecommunication

TECHNICAL SKILLS:

Big Data: Apache Spark, Apache Airflow, PySpark, Spark Scala, Apache Hadoop, MapReduce, Pig, Hive, HDFS, HBase, Databricks.

Cloud services: ACI (Apple Cloud Infrastructure), Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), AWS (Amazon Web Services), EMR (Elastic MapReduce), S3 (Simple Storage Service), Lambda (serverless), ECS (Elastic Container Service), SNS (Simple Notification Service), SQS (Simple Queue Service), Amazon Redshift, AWS Glue.

Data Warehousing/BI Tools: Informatica, Snowflake, Tableau, Power BI

Operating Systems: macOS, Windows, Linux.

Programming languages: Python, Scala, PL/SQL

Databases: MySQL, PostgreSQL, AWS Redshift, Azure SQL DB, Snowflake, Oracle, MongoDB.

Monitoring: AWS CloudWatch, Splunk, Crontab, Apache Airflow, Kafka.

Data Analysis: Databricks, Jupyter Notebooks, Microsoft Excel.

Containers and CI/CD: Docker, Jenkins.

Project Management: Radar, JIRA, Confluence

Version Control: Git, GitHub, GitLab, Bitbucket.

PROFESSIONAL EXPERIENCE:

Apple Inc., Austin, TX May 2022-Present

Spark & Big Data Developer

• Implemented complex data transformations and processing logic using Apache Spark and Scala, ensuring efficient data processing and analysis.

• Collaborated with cross-functional teams to design, develop, and maintain scalable and high-performance big data pipelines, enabling data-driven decision-making processes.

• Utilized Hadoop ecosystem components, such as HDFS, YARN, and Hive, to store and manage vast amounts of structured and unstructured data.

• Designed and optimized data ingestion processes, enabling real-time and batch data loading from various sources into Hadoop clusters.

• Managed and executed data warehouse plans; implemented Snowflake for robust data warehousing, ensuring secure, scalable storage and retrieval of structured and semi-structured data and optimizing analytical processes.

• Leveraged HBase as a NoSQL database for storing and retrieving large-scale, semi-structured data, ensuring high availability and low-latency access.

• Developed custom Spark Scala applications to process and analyze large datasets, improving data quality and enhancing business intelligence capabilities.

• Designed and implemented Scala jobs, incorporating Spark queries and data modeling techniques for enhanced data processing and analysis.

• Employed Tableau for creating interactive and insightful data visualizations and dashboards, enabling stakeholders to gain valuable insights.

• Continuously monitored and optimized data processing performance, improving job execution times and resource utilization.

• Implemented version control and code management using GitHub, facilitating collaboration and code review processes.

• Extracted and transformed data from Oracle databases, ensuring data accuracy and integrity, and integrated it into data processing workflows, contributing to effective data analysis and reporting.

• Utilized Apache Airflow for orchestrating and scheduling data workflows, ensuring data pipelines run efficiently and reliably.

• Configured and managed Crontab schedules for periodic batch job execution and data processing tasks.

• Used Radar for Agile project management and tracking, ensuring streamlined workflows, efficient collaboration, and effective project tracking, resulting in improved project outcomes and team productivity.

Wayfair, Boston, MA Feb 2021-Apr 2022

Big Data Engineer

• Designed aggregate data models for complex data ingestion, consolidating multiple streams of credit card data.

• Orchestrated data ingestion using Apache Airflow and Directed Acyclic Graphs (DAGs).
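The dependency-ordering idea behind a DAG of ingestion tasks can be sketched with Python's standard-library graphlib (task names here are hypothetical; this illustrates the ordering guarantee a scheduler like Airflow relies on, not Airflow's own API):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks: each task maps to the set of tasks it
# depends on, mirroring how a DAG encodes upstream/downstream edges.
dag = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"validate"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields the tasks in an order that respects every
# dependency edge, so no task runs before its upstream tasks.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

Because the sample dependencies form a single chain, the resulting order is fully determined: extract, validate, transform, load, report.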

• Developed PySpark code to process raw data from various sources into S3 buckets.

• Implemented CI/CD pipelines using Jenkins and wrote unit tests.

• Designed, developed, and deployed data pipelines using AWS services (S3, Redshift, ECS, EMR, Glue) and integrated real-time monitoring with AWS CloudWatch and CloudWatch Alarms.

Fidelity Investments, Boston, MA Jun 2020-Feb 2021

Big Data Engineer

• Extracted and ingested data from Teradata to Azure Blob storage using Azure Data Factory.

• Managed Databricks clusters and set up secure Key Vaults.

• Designed and processed datasets, handling data transformations.

• Extracted data from various sources and developed Logic Apps for custom transformations.

• Created Azure Databricks notebooks for data analysis and reporting.

• Built complex stored procedures in Azure SQL DWH and led legacy application migrations to Azure.

• Participated in testing, defect resolution, and code review meetings.

Anthem, Mumbai, India

Data Engineer Jun 2017-May 2019

• Migrated legacy Hadoop MapReduce and HDFS-based data processing to a cloud-native AWS solution.

• Configured Hadoop nodes and used Kafka to architect and build high-performance streaming data pipelines, enabling real-time data ingestion and processing for mission-critical business applications.

• Improved data organization and query optimization with Apache Hive partitioning and bucketing.
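The partitioning-and-bucketing idea above can be sketched in plain Python (column names, sample rows, and the bucket count are hypothetical): partitioning groups rows by a column's value, while bucketing hashes a key modulo a fixed bucket count, as Hive does with CLUSTERED BY ... INTO n BUCKETS.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # illustrative bucket count

rows = [
    {"region": "east", "user_id": 101},
    {"region": "west", "user_id": 102},
    {"region": "east", "user_id": 103},
]

# Partitioning: one group per distinct value of the partition column,
# so queries filtering on that column scan only the matching group.
partitions = defaultdict(list)
for row in rows:
    partitions[row["region"]].append(row)

def bucket_of(user_id: int, n: int = NUM_BUCKETS) -> int:
    # Bucketing: a stable hash of the key modulo the bucket count picks
    # the bucket; for integer keys Hive's hash is the value itself.
    return user_id % n
```

Queries that filter on the partition column touch only its group, and bucketing spreads rows evenly across a fixed number of files, which helps joins and sampling.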

• Transitioned Hadoop MapReduce and HDFS to real-time processing using AWS S3, Lambda and SQS.

• Managed access with AWS IAM and SQS policies and performed real-time data processing and migration.

VOLUNTEERING:

BAPS Swaminarayan Organization, India & USA Jan 2007-Present

• Engaged in community outreach, including food drives, health camps, and anti-addiction campaigns.

• Supported cultural and educational initiatives to preserve Indian traditions.

• Assisted in event planning and management and participated in disaster relief and fundraising efforts.


