CornerStone TTS is looking for a part-time engineer to own the end-to-end design, implementation, and maintenance of mixed-media data pipelines. You will collaborate closely with an AI/ML team and client stakeholders to ingest, process, and manage image, video, and tabular datasets, delivering production-ready systems that support both training and inference workloads.
Key Responsibilities
Pipeline Architecture & Ownership
Architect, build, and optimize ETL/ELT workflows to ingest .jpeg, .gif, .csv, and .mov assets.
Ensure pipelines are modular, version-controlled, and easily maintainable.
Storage & Database Management
Design and administer NAS/local storage solutions: setup, access control, backup, and performance tuning.
Deploy, tune, and maintain SQL (PostgreSQL/MySQL) and NoSQL (MongoDB/Cassandra) databases for high-volume ingestion and querying.
Media Processing & Transformation
Leverage Python with Pandas and NumPy for tabular data transformations.
Use OpenCV for image preprocessing (resizing, normalization, feature extraction).
Utilize FFmpeg for video ingestion, transcoding, and key-frame extraction.
Monitoring, Logging & Reliability
Implement comprehensive logging, monitoring, and alerting for all pipelines.
Proactively troubleshoot throughput or data-quality issues; optimize for reliability and scalability.
Collaboration & Documentation
Partner with ML engineers to integrate data pipelines into model training and serving workflows.
Produce clear, reproducible documentation (Jupyter notebooks, Markdown guides, Confluence pages) covering architecture, pipeline usage, and troubleshooting.
Required Qualifications
Experience: 5+ years in data engineering or similar roles, with a track record of owning production pipelines.
Databases: Expert in SQL database design/tuning (PostgreSQL/MySQL) and at least one NoSQL system (MongoDB, Cassandra).
Storage: Hands-on experience with NAS or equivalent local storage architectures and protocols.
Media Handling: Deep expertise in processing .jpeg, .gif, .csv, and .mov files at scale.
Programming: Advanced Python scripting; proficient with Bash; comfortable working on Ubuntu/Linux servers.
Libraries/Tools: Pandas, NumPy, OpenCV, FFmpeg.
Communication: Strong written and verbal skills; demonstrated ability to produce clear technical documentation.
Preferred Qualifications
Familiarity with AI/ML frameworks (PyTorch, TensorFlow, open-source GPT/RAG toolkits).
Experience in GPU-accelerated environments (NVIDIA stack).
Proficiency with Git/GitHub workflows and CI/CD pipelines.
Prior leadership or mentorship experience with junior engineers.
Background in oil & gas or other regulated industries is a plus, but not required.
What You’ll Gain
Ownership & Impact: Lead the data foundation for cutting-edge AI imaging and chat products.
Collaboration: Work side-by-side with leadership and AI/ML experts.
Flexibility: Part-time schedule tailored for high productivity and work-life balance.
Growth: Hands-on exposure to both data engineering and AI/ML integration in a production setting.