
Associate Developer Data

Location:
Hawthorne, CA
Salary:
50000
Posted:
December 31, 2022


Resume:

Laila Atkins

Technical Skills

Python

SQL

AWS Cloud Services

Apache Spark

Certifications

AWS Certified Cloud Practitioner

Databricks Certified Associate Developer for Apache Spark 3.0

Professional Experience

SkillStorm June 2022 – Dec 2022

Data Engineer

Bastion Analytics

A data analytics firm was tasked with providing an Extract, Transform, and Load (ETL) pipeline for the U.S. Customs and Border Protection.

Utilized the Python library Pandas to perform Exploratory Data Analysis of the dataset to determine an appropriate data model and data cleaning strategy.

Developed SQL and Python scripts to create partitioned and indexed tables in Postgres and automate loading the transformed data into the tables.

Estimated the cost of the proposed solution in AWS using S3, Lambda, and Redshift.

Determined an appropriate Data Lake structure with bronze, silver, and gold layers.

Submitted a Proof of Concept (PoC) for the architecture of their proposed solution for the data lake, data warehouse, and ETL pipeline.

Environmental Protection Agency

The EPA received grant funding to develop a robust data engineering solution to process and analyze global air quality measurements.

Developed AWS Lambda functions to load historical data into the raw layer of an S3 data lake.

Created Kinesis data streams and delivery streams to ingest and format real-time data into S3.

Configured an AWS Glue Data Catalog on the S3 data lake to provide a centralized source of metadata for data governance and discovery.

Analyzed business and reporting requirements to create an optimized data model.

Integrated with Databricks and deployed a Spark cluster to transform and aggregate cleaned data into a snowflake schema for use in a data warehousing solution.

Designed, implemented, and tested the data warehousing solution on Amazon Redshift.

Orchestrated jobs across disparate systems using Apache Airflow.

Utilized Amazon QuickSight to develop a dashboard adhering to BI requirements.

Sparrow Insights

Sparrow Insights was migrating its data infrastructure from on-premises to the AWS cloud. The company specializes in analyzing version control data for software to provide insights into the habits of developers. It tasked the data engineering team with developing an ETL pipeline on Apache Spark on Databricks.

Consulted stakeholders to determine the data consumption patterns and reporting requirements.

Created and managed a data lake on AWS S3 using raw, conformance, and curated layers.

Analyzed business and data requirements to appropriately size and configure an Apache Spark cluster on the Databricks platform.

Developed multiple PySpark scripts for each stage of the ETL pipeline.

Utilized the Databricks Jobs API to manage orchestration of Spark applications.

Deconstructed Spark applications into jobs, stages, and tasks to identify bottlenecks and optimize performance.

Separated data into fact and dimension tables using a star schema to optimize OLAP workloads.

Aggregated data using business reporting requirements for use in visualization and BI tools.

UCLA Student Accounts, Los Angeles, CA February 2019 – June 2022

Refund/AR Specialist Assistant

Oversaw the Stale-Dated Check Project.

Worked with BAR/OASIS software to gather financial information on student accounts, such as financial transactions and refunds.

Communicated professionally with the Financial Aid Office, Accounts Payable Office, and Registrar’s Office.

Translated financial jargon into everyday language that parents and students could understand.

Learned how 1098-T tax forms and TL11A Canadian tax forms are generated.

Worked with over 300 accounts each week.

Education

University of California, Los Angeles (UCLA), Los Angeles, CA

Bachelor of Science in Financial Actuarial Mathematics

Skilled in Statistics, Probability, Game Theory, Numerical Analysis, Network Theory, and Calculus


