
Data Engineer (Python, Spark, AWS - AI Exposure)

Company:
Wise Skulls
Location:
Clinton Township, OH, 43224
Posted:
February 12, 2026
Apply

Description:

Title: Data Engineer (Python, Spark, AWS - AI Exposure)

Location: Columbus, OH - Hybrid (3 Days Onsite / 2 Days Remote)

Duration: 6+ months (possibility of an extension)

Implementation Partner: Tekwings

End Client: To be disclosed

JD:

We are seeking a Data Engineer with strong expertise in Python, Apache Spark, and AWS, along with exposure to AI/ML data pipelines, to support scalable data processing and analytics initiatives.

The ideal candidate will work closely with data scientists, AI engineers, and business stakeholders to design, build, and optimize high-performance data pipelines that enable analytics and AI-driven use cases.
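As a rough sketch of the dataset preparation and validation work described above (shown in plain Python rather than PySpark so it runs standalone; the record fields and quality rules are hypothetical, not taken from the posting):

```python
# Hypothetical raw records as they might arrive from an upstream API or landing zone.
raw_records = [
    {"user_id": "u1", "amount": "42.50", "country": "US"},
    {"user_id": "u2", "amount": "-3.00", "country": "US"},   # fails a quality rule
    {"user_id": None, "amount": "10.00", "country": "DE"},   # missing a key field
]

def transform(record):
    """Normalize types, e.g. cast the amount string to a float."""
    return {**record, "amount": float(record["amount"])}

def is_valid(record):
    """Example quality checks: required field present, amount non-negative."""
    return record["user_id"] is not None and record["amount"] >= 0

# Transform, then split into clean rows and rejects for monitoring.
transformed = [transform(r) for r in raw_records]
clean = [r for r in transformed if is_valid(r)]
rejects = [r for r in transformed if not is_valid(r)]

print(len(clean), len(rejects))  # 1 clean row, 2 rejected rows
```

In an actual Spark pipeline the same logic would typically be expressed as DataFrame transformations (e.g. `withColumn` and `filter` in PySpark) so it scales across a cluster, with the reject counts feeding the monitoring processes the role calls for.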

Key Responsibilities

Design, develop, and maintain scalable data pipelines using Python and Apache Spark

Build and optimize ETL/ELT workflows for structured and semi-structured data

Develop data processing jobs for batch and near real-time workloads

Integrate data from multiple sources including APIs, databases, and cloud storage

Support AI/ML workflows by preparing, transforming, and validating datasets

Collaborate with Data Scientists to enable feature engineering and model training pipelines

Implement data quality checks, validation rules, and monitoring processes

Optimize Spark jobs for performance, scalability, and cost efficiency

Deploy and manage data solutions on AWS cloud infrastructure

Participate in code reviews, documentation, and best engineering practices

Work in an Agile environment and support production data issues as needed

Required Skills & Experience

Strong experience with Python for data engineering

Hands-on experience with Apache Spark (PySpark)

Solid experience working with AWS services, including:

S3

EC2

Glue

Lambda

EMR (preferred)

Experience with SQL and relational databases

Strong understanding of data modeling, data warehousing, and analytics concepts

Experience building and maintaining large-scale data pipelines

Familiarity with CI/CD and version control (Git)

AI / ML Exposure (Preferred)

Exposure to AI/ML data pipelines and workflows

Experience supporting feature engineering for ML models

Understanding of how data is prepared for model training and inference

Familiarity with ML tools or frameworks is a plus (not mandatory)

Nice-to-Have Skills

Experience with streaming technologies (Kafka, Spark Streaming)

Experience with Airflow or similar orchestration tools

Knowledge of data lakes and lakehouse architectures

Exposure to Docker or containerized environments

Experience working in regulated or enterprise-scale environments

Ideal Candidate Profile

Strong problem-solving and analytical skills

Comfortable working in a hybrid onsite environment

Able to collaborate effectively with cross-functional teams

Proactive, detail-oriented, and delivery-focused

Clear communicator with strong documentation skills
