Case Study
AI Personal Assistant Platform
A production-grade AI assistant that integrates with core business systems to support decision-making, operations, and day-to-day execution.
The challenge
Many AI assistants and chatbots fail to deliver real value because they operate in isolation from the systems where work actually happens. They can answer questions, but they cannot take action. They lack context about the business, the user, and what matters.
The goal was to create something different: an AI assistant that genuinely supports business operations by integrating deeply with existing systems, understanding context, and helping with real tasks rather than just answering generic questions.
The approach
We designed and built a production-grade AI Personal Assistant from the ground up, focusing on practical utility over impressive demos.
Deep system integration: The assistant connects directly to core business platforms including CRM, email, calendars, internal databases, and operational systems. This gives it the context needed to provide genuinely useful support.
Structured workflows with guardrails: Rather than allowing open-ended AI behaviour, we implemented structured workflows for common tasks. The system knows what it can and cannot do, with clear boundaries and appropriate human oversight for sensitive operations.
Trusted assistant, not a chatbot: The design philosophy centres on being a trusted assistant rather than a generic chatbot. It understands business context, remembers relevant history, and can take meaningful action within defined parameters.
LLM integration done right: We use large language models for what they are good at (reasoning, summarisation, drafting, and insight generation) while building robust systems around them to handle reliability, cost, and auditability.
Technical considerations
Building AI systems that work reliably in production requires attention to aspects that demos typically ignore:
- Cost awareness: LLM calls are not free. The architecture balances capability with cost, using appropriate models for different tasks and caching where sensible.
- Auditability: Every action and decision can be traced. This matters for trust, debugging, and compliance.
- Maintainability: The system is designed for long-term operation, not just initial deployment, with clear architecture, good documentation, and straightforward update paths.
- Human oversight: For sensitive operations, the system requests confirmation or flags decisions for human review. Automation supports humans rather than replacing judgement.
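The oversight principle above can be sketched as a simple dispatch gate. This is an illustrative sketch, not the production system's API: the action names, the `SENSITIVE_ACTIONS` set, and the `dispatch` function are all hypothetical.

```python
# Illustrative human-oversight gate. Action names and the dispatch()
# function are hypothetical, not the production system's interface.

SENSITIVE_ACTIONS = {"send_external_email", "update_crm_record", "delete_file"}

def dispatch(action: str) -> str:
    """Execute low-risk actions directly; hold sensitive ones for review."""
    if action in SENSITIVE_ACTIONS:
        # Queue for human confirmation instead of executing automatically.
        return "pending_review"
    return "executed"

print(dispatch("lookup_calendar"))      # low-risk: runs automatically
print(dispatch("send_external_email"))  # sensitive: held for review
```

The useful property of this shape is that the boundary is declarative: expanding or shrinking what the assistant may do autonomously is a one-line change to the set, not a change to workflow logic.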
Technical Implementation
The system leverages Anthropic's Claude API as the primary LLM layer, with careful architectural decisions to optimise for cost, reliability, and speed.
Multi-Model Architecture
Different tasks require different capabilities. We implemented a three-tier model routing system:
- Haiku layer: Fast classification, triage, and simple operations. Used for initial request parsing, email categorisation, and quick lookups. Response times under 500ms.
- Sonnet layer: Multi-step reasoning, response generation, and complex task execution. Handles the bulk of assistant interactions where quality and nuance matter.
- Opus layer: Strategic analysis and complex reasoning tasks. Reserved for situations requiring deep understanding or multi-document synthesis.
Tasks are scored 1-10 on complexity and routed to the appropriate tier automatically.
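The routing logic can be sketched as a threshold table over the complexity score. The specific cut-offs and model identifiers below are illustrative placeholders, not the production configuration.

```python
# Sketch of complexity-based model routing. Thresholds and model
# identifiers are illustrative placeholders, not production values.

MODEL_TIERS = [
    (3, "claude-haiku"),    # complexity 1-3: classification, triage, lookups
    (7, "claude-sonnet"),   # complexity 4-7: reasoning, drafting, execution
    (10, "claude-opus"),    # complexity 8-10: deep analysis, synthesis
]

def route(complexity: int) -> str:
    """Map a 1-10 complexity score to a model tier."""
    if not 1 <= complexity <= 10:
        raise ValueError("complexity must be between 1 and 10")
    for ceiling, model in MODEL_TIERS:
        if complexity <= ceiling:
            return model
    raise AssertionError("unreachable")

print(route(2))   # claude-haiku
print(route(6))   # claude-sonnet
print(route(9))   # claude-opus
```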
Batch API for Cost Efficiency
Email processing and case management run through Anthropic's Batch API, achieving a 50% cost reduction compared to synchronous calls. The system batches non-urgent operations and processes them in optimised windows, making high-volume LLM processing commercially viable.
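A sketch of the batching pattern, assembling queued email jobs into the request-list shape Anthropic's Message Batches API accepts (each entry pairs a `custom_id` with message parameters). The model name and prompt are illustrative; the submission call needs an API key, so it is shown commented out rather than executed.

```python
# Sketch of batching non-urgent email jobs for Anthropic's Message
# Batches API. Model id and prompt text are illustrative placeholders.

def build_batch(emails: list[dict]) -> list[dict]:
    """Turn queued emails into batch request entries."""
    return [
        {
            "custom_id": f"email-{e['id']}",
            "params": {
                "model": "claude-haiku",  # placeholder model id
                "max_tokens": 512,
                "messages": [
                    {"role": "user",
                     "content": f"Categorise this email:\n{e['body']}"},
                ],
            },
        }
        for e in emails
    ]

requests = build_batch([{"id": 1, "body": "Invoice attached..."}])
# Submission (requires an API key):
# from anthropic import Anthropic
# batch = Anthropic().messages.batches.create(requests=requests)

print(requests[0]["custom_id"])  # email-1
```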
Context Management
Using Anthropic's Files API for context caching, the system maintains consistent context across sessions while optimising token usage. Company context, communication preferences, and historical patterns are cached and injected efficiently.
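A related Anthropic mechanism, prompt caching, marks stable system-prompt blocks as cacheable so they are not re-billed at full price on every call. The sketch below shows that block format; the context strings are illustrative stand-ins for the real company profile and preferences, and this is one caching pattern rather than the exact production setup.

```python
# Sketch of injecting cached, stable context into each request using
# Anthropic's prompt-caching block format. The context text is an
# illustrative stand-in for the real company profile and preferences.

COMPANY_CONTEXT = "Company profile, tone-of-voice and escalation rules..."

def build_system_blocks(user_prefs: str) -> list[dict]:
    """Stable context first (marked cacheable), volatile preferences last."""
    return [
        {"type": "text", "text": COMPANY_CONTEXT,
         "cache_control": {"type": "ephemeral"}},  # reused across calls
        {"type": "text", "text": user_prefs},      # changes per user/session
    ]

blocks = build_system_blocks("Prefers concise replies before 9am.")
print(len(blocks))  # 2
```

Ordering matters here: putting the large, stable block first maximises the prefix that can be cached, which is where the token savings come from.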
Operational Metrics
The production system processes 150+ emails daily with an operational cost of approximately $0.32/day, representing a 75% reduction from a naive LLM implementation through careful architecture and API feature usage.
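As a back-of-envelope check on those figures: a 75% reduction to $0.32/day implies a naive baseline of $0.32 / (1 − 0.75) = $1.28/day, and roughly $0.002 per email processed.

```python
# Back-of-envelope check on the quoted operational figures.

optimised_daily = 0.32   # $/day, quoted
reduction = 0.75         # 75% reduction from the naive implementation
emails_per_day = 150     # quoted daily volume

naive_daily = optimised_daily / (1 - reduction)
cost_per_email = optimised_daily / emails_per_day

print(round(naive_daily, 2))     # 1.28
print(round(cost_per_email, 4))  # 0.0021
```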
The outcome
The result is a production system in active daily use. It handles tasks that previously required manual effort or context-switching between multiple systems, while maintaining appropriate controls and transparency.
This project demonstrates our approach: taking an idea from strategy and architecture through to hands-on engineering and a working system that delivers ongoing value.
Interested in a similar solution?
We would be glad to discuss how AI-assisted systems could support your operations.