Jerline George

Big Data Engineer / Data Engineer / Cloud Engineer
Phone: +1-214-***-****

LinkedIn: https://www.linkedin.com/in/jerline-george

Authorized to work in the United States; no sponsorship required.
Email: *******.*@*****.***

Personal Profile

Dynamic, results-oriented Big Data Engineer / Data Engineer with over 6 years of experience in data engineering, analytics, and pipeline development in FinTech. Demonstrated expertise in leveraging Hadoop and AWS technologies to process and manage datasets of over 10 million records in fast-paced, Agile environments. AWS Certified Developer – Associate with hands-on experience implementing ETL pipelines and data migrations and optimizing data solutions for the banking and financial services sector. Proven ability to collaborate effectively with global teams and deliver large-scale projects with defect rates below 10%. Known for strong problem-solving skills, attention to detail, and a track record of improving data processing efficiency by up to 70%.

Key Skills

Big Data Technologies: Apache Hadoop, Apache Pig, Apache Hive, HBase, Sqoop, Apache Spark, Spark-SQL, Kudu, Kafka.

Programming Languages: Scala, Python, PySpark, Unix Shell Scripting.

Cloud Technologies: AWS (EC2, EBS, IAM, Athena, Glue, S3, Redshift), Azure (Data Lake Gen2, Databricks, Synapse Analytics).

ETL Tools: Talend, SnapLogic, Syncsort.

Data Integration: Developing data pipelines for millions of records, optimizing and processing large datasets.

Data Warehousing: AWS Redshift, Data Lake creation, Teradata.

Methodologies & Project Tools: Agile (Scrum), Waterfall, Jira, VersionOne.

Automation & Performance Tuning: Automated routine processes with Unix shell scripting and tuned HQL queries for better performance.

Team Collaboration: Experience working with global teams, mentoring offshore teams, and ensuring seamless project delivery with defect rates below 10%.

Customer Data Platform (CDP) Integration: Streamlined data integration processes and improved data quality using CDP.

DevOps Tools: Git, Jenkins, Autosys, Control-M, Apache Airflow.

Data Migration: Migrated datasets using AWS Glue, S3, and Talend.

Analytical & Problem-Solving Skills: Proven ability to handle complex data and implement innovative solutions to optimize processes.

AWS Certified Developer – Associate: Certified, with hands-on experience across AWS tools and services.

Education and Training

April-2018

Ryerson University, Toronto

Workplace Communication Program for IT Professionals

April-2017

AWS Certified Developer Associate

April-2015

Sathyabama University, Chennai

Bachelor of Engineering (Electronics and Communication Engineering)

Employment History

SickKids – Toronto, ON (Oct 2024 – Dec 2024)

Data Engineer

Developed and maintained robust data pipelines using NiFi and Apache Airflow to automate the ingestion, transformation, and orchestration of large datasets across distributed systems.

Leveraged Apache Hadoop, Ozone, and CDP for scalable storage and processing, optimizing cluster performance and ensuring data reliability.

Built and optimized distributed data processing workflows with Apache Spark, achieving up to a 30% improvement in ETL job execution times (a representative tuning sketch follows this role's bullets).

Implemented data analytics solutions in Python to extract actionable insights, supporting business decisions with advanced visualizations and statistical analyses using Power BI.
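
The Spark tuning above follows the standard filter-early / partition-on-write pattern; a minimal Spark/Scala sketch is below. Paths, column names, and the shuffle-partition setting are illustrative assumptions, not details of the actual SickKids pipelines.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object EtlTuningSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("etl-tuning-sketch")
          .config("spark.sql.shuffle.partitions", "200") // illustrative; tune to cluster size
          .getOrCreate()

        // Filter early so the Parquet reader can skip row groups (predicate pushdown).
        val events = spark.read.parquet("/data/events") // hypothetical path
          .filter(col("event_date") >= lit("2024-10-01"))

        // One wide aggregation; cache only if the result is reused downstream.
        val daily = events
          .groupBy(col("event_date"), col("source_system"))
          .agg(count(lit(1)).as("event_count"))

        // Partitioning output by date keeps later date-bounded reads cheap.
        daily.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("/data/curated/daily_events") // hypothetical path

        spark.stop()
      }
    }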

Citibank – Mississauga, ON (Aug 2023 – Aug 2024)

Big Data Engineer
Project: Olympus-Data

Developed Spark applications in Scala and Python to onboard reference data with Slowly Changing Dimension (SCD Type 2) logic; a sketch of this pattern follows the tech stack below.

Automated processes using Unix shell scripting.

Enhanced HQL query performance through optimization and tuning.

Built and maintained end-to-end big data pipelines using Spark, Hadoop, and Kafka, resulting in a 20% improvement in processing efficiency.

Integrated ETL pipelines using Talend to move data from S3 to Hive and Spark.

Created data pipelines from cloud to HDFS using the SF framework with Spark Scala.

Developed real-time streaming applications for party and product tables using Kafka.

Managed ETL processes using Azure Data Factory for efficient data transformation and loading into Azure SQL Database.

Led the migration of on-premise databases to Azure SQL Database, achieving a 30% reduction in operational costs and improved scalability.

Tech Stack: Apache Hive, Spark, Scala, Impala, Autosys, Jenkins, Git, Talend, Apache Airflow, Python, Azure Data Factory, Snowflake
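
A minimal Spark/Scala sketch of the SCD Type 2 onboarding pattern mentioned above. The dimension layout (business_key, one tracked attr column, effective_from / effective_to / is_current) and the run-date convention are assumptions for illustration, not the Olympus-Data schema.

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions._

    // Minimal batch SCD Type 2 sketch; all column names are illustrative.
    object Scd2Sketch {
      def applyScd2(current: DataFrame, incoming: DataFrame, runDate: String): DataFrame = {
        val cur = current.filter(col("is_current"))

        // Current rows whose tracked attribute changed: close them out as of runDate.
        val expired = cur.alias("c")
          .join(incoming.alias("i"), col("c.business_key") === col("i.business_key"))
          .where(col("c.attr") =!= col("i.attr"))
          .select(
            col("c.business_key").as("business_key"),
            col("c.attr").as("attr"),
            col("c.effective_from").as("effective_from"),
            lit(runDate).as("effective_to"),
            lit(false).as("is_current"))

        // New open versions: changed keys get a fresh row, brand-new keys their first.
        val opened = incoming.alias("i")
          .join(cur.alias("c"), col("i.business_key") === col("c.business_key"), "left")
          .where(col("c.business_key").isNull || col("c.attr") =!= col("i.attr"))
          .select(
            col("i.business_key").as("business_key"),
            col("i.attr").as("attr"),
            lit(runDate).as("effective_from"),
            lit(null).cast("string").as("effective_to"),
            lit(true).as("is_current"))

        // Keep every row we did not just expire (history plus unchanged current rows).
        val untouched = current.join(
          expired.select("business_key", "effective_from"),
          Seq("business_key", "effective_from"), "left_anti")

        untouched.unionByName(expired).unionByName(opened)
      }
    }

Changed keys get their current row closed out plus a fresh open row; unchanged history passes through untouched. The surrounding job would then write the result back over the dimension table.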

Interac Corp. – Toronto, ON (Mar 2019 – Dec 2022)

Data Integration Developer (Big Data & AWS)
Project: Data Engineering

Developed aggregation Spark applications for Interac e-Transfer using Scala and PySpark.

Integrated pipelines with Talend Studio, automating tasks through TAC Server.

Migrated data from S3 to Redshift using AWS Glue, boosting efficiency by 70%; the load pattern is sketched after the tech stack below.

Designed Spark-SQL and ETL processes, handling large debit card transaction datasets.

Tech Stack: Scala, Apache Spark, Spark-SQL, Talend, S3, Glue, Redshift, Impala, Git, Jenkins, Airflow
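
For the S3-to-Redshift migration above, the production path was AWS Glue; the sketch below shows the same load shape as a plain Spark/Scala stand-in with a JDBC sink. A Glue job would typically use Glue's Redshift connector, which bulk-loads via COPY through S3 rather than row-by-row JDBC inserts. Bucket, endpoint, and table names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object S3ToRedshiftSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("s3-to-redshift-sketch").getOrCreate()

        // Read the curated extract from S3 (hypothetical bucket/prefix).
        val txns = spark.read.parquet("s3a://example-bucket/etransfer/txns/")

        // Plain JDBC write, shown only to illustrate the S3 -> Redshift handoff.
        txns.write
          .format("jdbc")
          .option("url", "jdbc:redshift://example-cluster:5439/dev") // hypothetical endpoint
          .option("dbtable", "analytics.etransfer_txns")             // hypothetical table
          .option("user", sys.env("REDSHIFT_USER"))
          .option("password", sys.env("REDSHIFT_PASSWORD"))
          .option("driver", "com.amazon.redshift.jdbc42.Driver")
          .mode("append")
          .save()

        spark.stop()
      }
    }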

Tech Mahindra – Brampton, ON (Nov 2018 – Feb 2019)

Technical Lead (Big Data Developer)
Client: Rogers Communications

Developed Spark applications using Scala and Spark-SQL for audit frameworks supporting machine learning models.

Implemented Sqoop jobs to export data from Hive to Oracle for downstream processes (a Spark-based equivalent of this handoff is sketched after the tech stack below).

Automated Spark environment setups with Python wrapper scripts.

Tech Stack: Apache Hive, Spark, Scala, Sqoop, Python, Control-M, Git
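
The export above ran through Sqoop; as a stand-in, the sketch below shows the same Hive-to-Oracle handoff as a Spark/Scala JDBC write. The table names and connection string are hypothetical, not the Rogers setup.

    import org.apache.spark.sql.SparkSession

    object HiveToOracleSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-to-oracle-sketch")
          .enableHiveSupport() // read directly from the Hive metastore
          .getOrCreate()

        // Hypothetical audit-output table produced by the Spark audit framework.
        val auditRows = spark.table("audit_db.model_audit_results")

        // The production export used Sqoop; this JDBC write illustrates the same handoff.
        auditRows.write
          .format("jdbc")
          .option("url", "jdbc:oracle:thin:@//example-host:1521/ORCLPDB") // hypothetical
          .option("dbtable", "AUDIT.MODEL_AUDIT_RESULTS")                 // hypothetical
          .option("user", sys.env("ORACLE_USER"))
          .option("password", sys.env("ORACLE_PASSWORD"))
          .option("driver", "oracle.jdbc.OracleDriver")
          .mode("append")
          .save()

        spark.stop()
      }
    }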

Wipro Technologies – Chennai, India (Feb 2016 – Oct 2017)

Big Data Developer
Client: Capital One

Optimized processing of raw data using Pig scripting and ETL tools, reducing defects by 60%.

Built Spark applications for faster processing of account management data, achieving up to 100x speedups over the equivalent MapReduce jobs (see the sketch after this section).

Managed large datasets using AWS services like S3, EC2, and Redshift.

Tech Stack: Pig, Hive, Kafka, Spark, Spark-SQL, AWS (S3, EC2, Redshift), Git
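
A minimal Spark/Scala sketch of the kind of account-data aggregation described above, illustrating why the declarative, in-memory DataFrame route outperforms a hand-written MapReduce chain. File locations and column names are assumptions.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object AccountAggSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("account-agg-sketch").getOrCreate()
        import spark.implicits._

        // Hypothetical account-activity extract; the MapReduce version of this job
        // needed a hand-written mapper/reducer pair plus a driver class.
        val activity = spark.read
          .option("header", "true")
          .csv("s3a://example-bucket/accounts/activity/")

        // One declarative aggregation replaces the whole MapReduce pipeline; Spark
        // keeps intermediate data in memory instead of spilling between stages to disk.
        val perAccount = activity
          .groupBy($"account_id")
          .agg(
            count(lit(1)).as("txn_count"),
            sum($"amount".cast("double")).as("total_amount"))

        perAccount.write.mode("overwrite").parquet("s3a://example-bucket/accounts/summary/")
        spark.stop()
      }
    }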

Wipro Technologies – Chennai, India (Sep 2015 – Feb 2016)

Big Data Developer
Project: BFSI-Americas-Capital-One

Developed ETL pipelines using the Cascading framework, optimizing data processing efficiency.

Participated in Agile ceremonies and implemented Hive optimizations, increasing query performance by 50%.

Created secure S3 buckets and IAM roles for 100K datasets, improving security and data access.

Tech Stack: AWS S3, EC2, IAM, Teradata, HBase, Apache Pig, Syncsort


