Python Developer - AI Document Platform

Company:
Collins Consulting
Location:
North Chicago, IL, 60086
Posted:
April 25, 2026

Description:

Onsite required Tues/Wed/Thurs

We are looking for a Software Development Engineer to build and scale an AI-powered document parsing platform that extracts structured data from complex PDFs (pharmaceutical batch records, certificates, regulatory documents) using OCR, LLMs, and RAG. You will work across the full stack - backend AI pipelines, frontend chat interface, and cloud infrastructure.

Roles & Responsibilities

Design and develop production-grade RAG (Retrieval-Augmented Generation) pipelines for domain-specific document querying with hybrid search, reranking, and multi-agent answer synthesis

Build and optimize document processing pipelines using AWS Textract for OCR extraction from tables, handwritten content, and structured forms

Integrate and orchestrate multiple LLMs (Claude, Gemini) for intent classification, data extraction, validation, and conversational AI

Develop and maintain the FastAPI backend - REST APIs, streaming endpoints (SSE), authentication, and background task processing

Build responsive frontend features using Next.js, React, and TypeScript - chat interface, PDF viewer with highlights, real-time progress tracking

Manage cloud infrastructure on AWS - EC2 deployment, S3 storage, RDS (PostgreSQL), and IAM configuration

Work with vector databases (Weaviate) and graph databases (Neo4j) for semantic search and structural document querying

Implement chunking strategies, embedding generation, cross-encoder reranking, and semantic caching for accurate document retrieval

Deploy and monitor AI models and services in production - model fallback chains, retry mechanisms, error handling

Write clean, maintainable code with proper logging, error handling, and documentation

Required Skills

Python (FastAPI, async programming, pandas)

TypeScript / React (Next.js)

RAG systems - vector search, embeddings, chunking, reranking (production-grade)

LLM integration - prompt engineering, structured output, multi-model orchestration

AWS - EC2, S3, Textract, RDS

PostgreSQL

REST API design with streaming (SSE)

Git, basic CI/CD, Linux server management

Good to Have

Weaviate, Neo4j, or similar vector/graph databases

Gemini Vision or GPT-4V for document image analysis

LangChain / LangGraph

Docker, nginx

Pharmaceutical/regulated document experience

Experience:

3-6 years

The benefits that you are eligible for with Collins Consulting, Inc.:

401(k)

Medical, Dental and Vision Insurance

Term Life Insurance

Accidental Death and Dismemberment

Long Term Disability
