Job Title: Palantir Data Engineer (On-Site, Irvine, CA)
Reports To: Principal Engineering Manager, R&D
Location: Irvine, CA, On-Site
Department: Engineering and Product Development
Salary: $100,000 to $130,000 a year
Shift: Day, Monday to Friday, 8 AM to 5 PM
SUMMARY
Data engineers work closely with Subject Matter Experts (SMEs) to design the ontology (data model), develop data pipelines, and integrate Foundry with the external systems that hold the data. Data engineers also provide guidance and support on accessing and leveraging the data foundation to build new workflows or analyze data.
ESSENTIAL DUTIES AND RESPONSIBILITIES:
• Integrate new data sources into Foundry using Data Connection
• Implement two-way integrations between Foundry and external systems
• Develop pipelines transforming tabular or unstructured data
• Implement data transformations in PySpark or Pipeline Builder
to derive new datasets or create ontology objects.
• Set up support structures for pipelines running in production
• Monitor and debug critical issues such as data staleness or data quality
• Improve performance of data pipelines (latency, resource usage)
• Design and implement an ontology based on business requirements and available data
• Provide data engineering context for application development
• Identify opportunities for turning exploratory or analytical
applications into interactive operational workflows to drive business value.
• Maintain applications as usage grows and requirements change
PREFERRED QUALIFICATIONS:
• Between 1 and 3 years of experience, ideally in a customer-facing role
• Experience in Python/PySpark, or experience in another programming
language and a willingness to learn Python and PySpark independently
• Experience in TypeScript, or experience in another programming
language and a willingness to learn TypeScript independently
• Data engineering experience preferred over data science
• Programming experience requiring collaborative software development
• Python – complete language proficiency
• SQL – proficiency in querying (join types, filtering,
aggregation) and data modeling (relationship types, constraints)
• PySpark – basic familiarity (DataFrame operations, PySpark SQL
functions) and awareness of differences from other DataFrame
implementations (e.g., pandas)
• TypeScript – experience in TypeScript, or experience in another
programming language and a willingness to learn TypeScript independently
• Distributed computing – conceptual knowledge of Hadoop and Spark (driver,
executors, partitions)
• Databases – general familiarity with common relational database models
and proprietary systems, such as SAP and Salesforce
• Git – knowledge of version control/collaboration workflows and
best practices
• Iterative working – familiarity with agile, iterative development
methodologies and rapid user-feedback gathering
• UX design – knowledge of best practices and their application
• Data quality – knowledge of best practices
• Data literacy – data analysis and statistical basics to ensure correctness in
data aggregation and visualization
Benefits for all full-time employees include:
Medical (HMO/PPO Plan Options)
Dental
Vision
Group Term Life Insurance (CTC pays 100% of the premium)
Short-Term Disability and Long-Term Disability (CTC pays 100% of the premium)
Flexible Spending Account
401(k)
15 paid vacation days (more after 5 years)
9 paid holidays
3 paid sick leave days