VAIBHAV DEORAJ
AWS Data Engineer
Toronto, Canada +1-647-***-**** **************@*****.*** https://www.linkedin.com/in/vaibhavdeoraj/
Professional Summary
8+ years of overall IT experience, including 5 years as an AWS Data Engineer designing, building, and maintaining data pipelines on the AWS platform
Demonstrated expertise in AWS services, building efficient big data solutions using EC2, S3, Redshift, Glue, Athena, EMR, and Lambda
Proficient in enhancing data processing and analysis tasks using advanced Python, Scala, and SQL programming skills
Orchestrated complex workflows with Apache Airflow and utilized Apache Spark to improve distributed data processing efficiency
Successfully architected data lakes and warehouses, improving data retrieval and analysis capabilities thereby enhancing business decision-making processes
Skilled in implementing NoSQL solutions using MongoDB, achieving an 80% improvement in data storage and retrieval efficiency for diverse applications
Orchestrated ETL processes using AWS Glue, enhancing data visualization efficiency with Amazon QuickSight and leveraging Amazon S3 for scalable data storage
Adept in utilizing Snowflake for data warehousing, improving data integration and analytics processes
Designed CI/CD pipelines, automating the software development lifecycle and improving code quality and release cycles
Strong collaborator with excellent people skills, initiative, and dedication; understands the demands of 24/7 system maintenance and has solid customer support experience
Worked in Agile environments with good insight into Agile methodologies and Lean working techniques; participated in Agile ceremonies and Scrum meetings
Work Experience
AWS Data Integration Engineer
RBC, Toronto, Canada
02/2022 to Current
Designed and implemented serverless microservices using AWS Lambda, enabling efficient and cost-effective event-driven architectures for real-time data processing and automation within a cloud-native environment, resulting in a reduction in infrastructure costs compared to traditional server-based approaches
Utilized AWS Glue for automated data discovery, cataloging, and ETL (Extract, Transform, Load) processes, enhancing data quality and accessibility while reducing manual effort; this resulted in a reduction in data processing time and streamlined data pipelines for analytics and reporting
Developed interactive and insightful data visualizations and dashboards using Amazon QuickSight, facilitating data-driven decision-making and providing actionable insights to stakeholders and business users, resulting in an increase in data-driven decision-making efficiency within the organization
Utilized SQL and Python extensively to streamline data workflows, perform extract, transform, and load (ETL) tasks, and generate actionable insights, including database querying, data cleansing, and automation; this resulted in reduced data processing time, improved data accuracy, and increased operational efficiency
Demonstrated expertise in PySpark for scalable data processing and Amazon Athena for efficient data querying in Amazon S3. Developed ETL pipelines and executed complex data analysis tasks
Utilized Amazon CloudWatch for comprehensive monitoring and observability of AWS resources, including setting up custom metrics, creating dashboards, configuring alarms, and analyzing logs; proactive monitoring and timely issue resolution improved system uptime and reliability
Designed and implemented a robust data lake and warehouse architecture, integrating diverse data sources and formats, enabling efficient data storage, retrieval, and analysis. This initiative resulted in improved data accessibility, reduced data silos, and enhanced reporting capabilities, driving data-driven decision-making within the organization
Used Tableau and Power BI to create visually compelling and interactive data dashboards, enabling stakeholders to gain actionable insights from complex datasets, resulting in improved business strategies and operational efficiencies
Implemented robust Continuous Integration and Continuous Deployment (CI/CD) pipelines for data engineering workflows, automating data pipeline deployments, version control, and ensuring data integrity. This streamlined the development and deployment process, reducing errors and enabling rapid iteration of data solutions
Managed data ETL workflows using AWS Step Functions and AWS Glue, enabling seamless data extraction, transformation, and loading processes, while ensuring data quality and reliability in large-scale data pipelines
AWS Data Platform Engineer
Osair Technologies, Hyderabad, India
01/2018 to 08/2021
Designed and implemented complex data pipelines on AWS, utilizing services like AWS Glue, AWS Data Pipeline, and AWS Lambda for seamless data extraction, transformation, and loading (ETL) processes
Managed and optimized data warehousing solutions using Amazon Redshift, creating efficient data models and optimizing query performance for analytical purposes
Implemented real-time data processing solutions with Amazon Kinesis, enabling the capture and analysis of streaming data for immediate insights and actions, resulting in a reduction in time-to-insight compared to traditional batch processing methods
Worked with big data technologies such as Apache Spark and Hadoop within the AWS ecosystem, harnessing their capabilities for distributed data processing and analysis
Designed and maintained data lakes on AWS S3, ensuring data accessibility, security, and scalability for diverse analytics and reporting needs, enabling the storage of terabytes of structured and unstructured data
Developed and automated data workflows using AWS Step Functions and Apache Airflow, reducing manual intervention, decreasing processing time, and increasing overall operational efficiency
Ensured data security and compliance with industry standards by implementing AWS Identity and Access Management (IAM) policies and encryption mechanisms
Established data governance best practices, including data cataloging and metadata management, to improve data discoverability and lineage tracking; these initiatives significantly improved data discoverability, empowering teams to access and leverage critical data assets with ease
Collaborated with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements and deliver actionable insights
Maintained detailed documentation of data architecture, pipeline configurations, and best practices to facilitate knowledge sharing and onboarding of team members
Monitored AWS resources using Amazon CloudWatch and implemented cost optimization strategies to maximize efficiency and reduce operational costs, continuously tracking performance metrics and promptly identifying areas for optimization
Network Engineer
Bhagwant University, Rajasthan, India
07/2012 to 10/2015
Designed, implemented, and maintained network infrastructure, including LAN/WAN configurations, VPNs, and routing protocols, to optimize data flow and ensure uninterrupted connectivity
Demonstrated strong troubleshooting skills by swiftly identifying and resolving network issues, minimizing downtime, and ensuring seamless network operations
Implemented robust security measures, including firewalls, intrusion detection systems (IDS), and access controls, to protect the network from unauthorized access and cyber threats
Collaborated with vendors and suppliers to procure network equipment, negotiate contracts, and manage vendor relationships, ensuring cost-effective solutions and timely hardware/software upgrades
Utilized monitoring tools such as SNMP, NetFlow, and packet analysis to proactively monitor network performance, identify bottlenecks, and optimize network resources
Maintained comprehensive network documentation, including diagrams, configurations, and change management logs, to ensure transparency and compliance with industry standards
Led network upgrade projects, including hardware refreshes and capacity planning, to accommodate business growth and ensure scalability
Implemented disaster recovery and redundancy solutions, including failover configurations and backup systems, to enhance network resilience and minimize data loss
Skills
Programming Languages
Python, SQL
AWS Services
S3, EC2, Lambda, Glue, EMR, Redshift
Database Technologies
MySQL, PostgreSQL
Data modeling Tools
ER/Studio, Visio
Data Visualization Tools
Tableau, Power BI
Data Integration
AWS Step Functions
Data Streaming
Kinesis
NoSQL Database
DynamoDB
Data Analytics
Athena
Monitoring
Amazon CloudWatch
Education
PG Diploma in Business Management 11/2016
UUNZ Auckland, New Zealand
B.Tech: Information Technology 06/2012
Bhagwant University Rajasthan, India