AI Architect/Insurance/12 months (Extendable/Convert to Perm)
Argyll Scott · www.argyllscott.com
Job Description: Technical AI Architect
Role Overview
We are seeking a Technical AI Architect to lead the design, scaling, and governance of our Enterprise Agentic RAG platform. You will move beyond basic semantic search to architect production-grade, end-to-end multi-agent products and high-performance retrieval systems.
This role demands deep technical mastery in Agentic RAG and LangGraph, strict attention to cost/token optimization, and the ability to ship resilient, production-grade products that enforce robust enterprise guardrails and security compliance.
Key Responsibilities
- Production-Grade Agentic Architecture: Design and build end-to-end Agentic RAG products utilizing state-driven, multi-agent systems and cyclic workflows via LangGraph. Move from sequential pipelines to iterative, self-correcting reasoning loops (e.g., query decomposition, self-reflection, and dynamic context validation).
- Enterprise-Scale Retrieval Systems: Architect high-precision, layout-aware semantic chunking pipelines. Implement enterprise hybrid search (combining dense vectors, sparse BM25 keyword matching, and Reciprocal Rank Fusion) backed by two-stage cross-encoder reranking layers.
- Cost & Token Optimization: Drive LLM unit economics at scale. Implement advanced strategies for token optimization, context-window compression, semantic caching, and dynamic cost-based model routing (e.g., routing lookups to lightweight models and deep reasoning to frontier models).
- AI Governance, Security & Guardrails: Deploy production-ready enterprise safety nets. Enforce secure tool execution environments, Source Access Control Lists (ACLs), data privacy/PII redaction, and automated LLM-as-a-judge evaluation frameworks (e.g., Ragas, TruLens) tracking faithfulness, context precision, and latency SLAs.
- Technical Leadership & DevOps: Lead and mentor a dedicated team of AI/ML engineers and establish best practices for it. Oversee containerization (Docker, Kubernetes) and inference server optimization (e.g., vLLM, PagedAttention) to meet low-latency SLAs.
Technical Stack & Requirements
- Orchestration & Agents: Expert-level mastery of LangGraph (critical), LangChain, or LlamaIndex for state tracking and tool use.
- Data & Vector Infrastructure: Deep experience with enterprise vector databases (Pinecone, Milvus, Qdrant, pgvector) and robust extraction pipelines for complex enterprise documents (PDFs, financial tables).
- Models & Deployment: Hands-on experience with commercial APIs (OpenAI, Anthropic) and with deploying, fine-tuning, or quantizing open-source models (Llama, Mistral) via production engines like vLLM.
- Core Engineering: Strong Python foundation, asynchronous programming, microservices (FastAPI), and observability infrastructure (LangSmith, Weights & Biases).
- Experience: 10 years of software/data experience and a minimum of 3 years in AI enterprise architecture, with a proven track record of shipping end-to-end, production-ready enterprise GenAI products.
Argyll Scott Asia is acting as an Employment Agency in relation to this vacancy.