Job Description
Requirements:
• Design and optimize data pipelines using Spark, Hudi, EMR cloud services, and Kubernetes containers
• Maintain the pedigree and provenance of the data so that access to it remains protected
• Clean and preprocess data to enable analytic access
• Collaborate with enterprise working groups to advance the state of data standards
• Collaborate with the engineering team, data stewards, and mission partners to help derive actionable value from the data holdings
• Architect complex, repeatable ETL processes
• Provide advanced database administration support for Oracle, MySQL, MariaDB, MongoDB, Elastic, and others
• Support targeting efforts using sponsor tools and reverse engineering
• Develop API connectors to enable ingestion of new data catalog entries from databases and files
• Ensure that data mappings deliver the performance required for the expected user experience
Full-time