Senior Data Engineer

Location: Los Angeles, CA

Salary: 130000

Posted: October 07, 2025


Resume:

Results-driven Senior Data Solution Architect with 10+ years of experience designing and optimizing scalable data platforms, ETL pipelines, and data warehouses. Proven expertise in big data technologies including Hadoop, Spark, Kafka, and real-time stream processing. Skilled in building cloud-native architectures on AWS, Azure, and GCP, integrating data solutions with tools like Talend, Apache NiFi, Informatica, and Airflow.

Experienced in deploying machine learning models using TensorFlow, Scikit-learn, and PyTorch within Databricks and MLOps frameworks. Strong background in data governance, quality management, and regulatory compliance (HIPAA, GDPR). Adept at translating business requirements into actionable insights using Tableau, Power BI, and QuickSight. Known for delivering secure, scalable, and high-performance solutions across finance, healthcare, and enterprise domains.

Rana Jalil

Senior Data Engineer, Big Data Engineer, Data Warehouse Engineer

Contact

Address: Los Angeles, CA 90001

Phone: 415-***-****

E-mail: ************@*****.***

Skills

• Code automation
• Python automation and API integration
• Data integration expertise
• Data quality management
• Data warehousing expertise
• Proficient in Azure data solutions
• Apache Kafka, Amazon Kinesis, Apache Flink, and Google Pub/Sub.
• CI/CD & DevOps
• Tableau, Power BI, Looker, and Apache Superset.
• Python programming

Professional Experience

Senior Data Engineer

CoreLogic

2021-07 - Current

• Designed enterprise-wide data pipelines using Spark, Airflow, and Python, processing 10+ TB of data daily.
• Migrated legacy data warehouse into Snowflake, reducing infrastructure costs by 25% and improving performance by 45%.
• Implemented real-time event-driven pipelines using Kafka + Flink to support real-time decision-making.
• Led a team of 6 engineers, establishing best practices for CI/CD, testing, and monitoring of data pipelines.
• Defined data governance policies (Apache Atlas + Collibra) to ensure compliance with GDPR and CCPA.
• Data pipeline design & optimization: Builds and maintains ETL/ELT pipelines to move data from multiple sources into warehouses, lakes, or analytics platforms.
• Architecture & scalability: Designs systems that can handle growing data volumes efficiently.
• Cloud & big data tools: Works with technologies like AWS, Azure, GCP, Spark, Kafka, Hadoop, Databricks, Snowflake, etc.
• Data modeling: Structures data for usability, performance, and accuracy.
• Collaboration: Works with data scientists, analysts, and business stakeholders to understand needs and deliver solutions.
• Mentorship: Provides guidance to junior engineers and ensures best practices in coding, security, and performance.

Big Data Engineer

Effectual

2017-06 - 2021-06

• Designed and implemented real-time data pipelines using Kafka, Spark Streaming, and AWS Kinesis to process 5+ TB of data daily.
• Migrated large on-premise Hadoop clusters to AWS EMR and Google Dataproc, reducing infrastructure costs by 30%.
• Built data lake architecture on AWS S3 with metadata management using Glue Catalog and Apache Hive.
• Partnered with machine learning teams to deliver feature engineering pipelines using PySpark and Delta Lake.
• Optimized Hive and Spark jobs, reducing query runtimes by 40% through partitioning, bucketing, and caching strategies.
• Established data governance, lineage tracking, and security policies with Apache Atlas and AWS Lake Formation.

• Data ingestion: Builds systems to capture and integrate data from multiple, high-speed sources.
• Storage management: Designs and manages large-scale data storage solutions such as Hadoop HDFS, Amazon S3, or distributed file systems.
• Processing frameworks: Works with big data tools like Apache Spark, Hadoop, Flink, Hive, or Kafka to process and transform huge datasets.
• Data pipeline development: Creates scalable ETL/ELT pipelines to move and prepare data for analytics and machine learning.
• Performance optimization: Ensures systems can handle petabytes of data efficiently with high availability and fault tolerance.
• Collaboration: Partners with data scientists, machine learning engineers, and analysts by preparing data that supports advanced analytics and AI.

Data Warehouse Engineer

Databridge Analytics

2014-07 - 2017-05

• Designed and implemented enterprise data warehouse in Snowflake, supporting 1,000+ business users.
• Developed ETL pipelines with dbt + Airflow, automating ingestion and transformation workflows.
• Optimized queries and models, improving dashboard load times by 60%.
• Collaborated with BI teams to deliver KPI dashboards in Power BI and Tableau.
• Defined data modeling standards (Kimball & Data Vault) to ensure scalability and maintainability.
• Data integration: Designs ETL/ELT processes to pull data from diverse sources and load it into the warehouse.
• Data modeling: Structures the warehouse using schemas (star, snowflake, or normalized) to make querying efficient and intuitive.
• Performance optimization: Ensures queries run fast and the warehouse can scale as data volume grows.
• Data quality & consistency: Applies rules for cleaning, validating, and standardizing data so reports are accurate.
• Tool expertise: Works with platforms like Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse, or on-premise solutions like Teradata and Oracle.
• Collaboration: Partners with BI developers, analysts, and data scientists to make sure the warehouse meets business requirements.
• Maintenance & governance: Ensures security, compliance, backup, and monitoring of warehouse systems.

Skills (continued)

• Data pipeline design
• ETL development
• Advanced SQL

Projects

• Healthcare Predictive Analytics Platform: Python, Apache Spark, AWS S3, Redshift, Scikit-learn
• Real-time Customer Behavior Tracking: Apache Kafka, Spark Streaming, AWS Kinesis, PostgreSQL, End-to-End Data
• Financial Data: Snowflake, AWS Glue, Apache Airflow, Python

Certificates

Databricks Certified Associate Developer for Apache Spark

Informatica Cloud Data Integration Specialist

Microsoft Certified: Azure AI Engineer Associate

Education

2014-01

Bachelor of Science: Computer Science

Punjab University
