Models

The model is the reasoning engine behind every agent. Choosing the right model for each task — and knowing when to switch — is one of the highest-leverage decisions in an AI system.

The Model Landscape

Three categories, each with distinct trade-offs.

Frontier Models

The most capable commercial models, such as Claude, GPT-4, and Gemini. They typically offer the strongest reasoning and the longest context windows, at the highest per-token price. They are accessed via API, so data leaves your infrastructure.

Highest capability
Kept current by the provider
Pay-per-token pricing

Open-Source Models

Llama, Mistral, Qwen, DeepSeek: openly available weights (licence terms vary by model), self-hostable, and increasingly competitive. You get full control over deployment and pay no per-token fees, though infrastructure and operating costs remain.

Full data control
No vendor lock-in
Requires GPU infrastructure

Fine-Tuned Models

Base models trained further on your own data and tasks. The result is higher accuracy on domain-specific work, often from a smaller model than a general-purpose approach would need.

Domain-specific accuracy
Smaller, faster, cheaper
Requires training data & expertise

Value Pathways

Strategic value from understanding the model layer.

Task-Specific Routing

Not every task needs a frontier model. Route simple extraction to a fast model, complex reasoning to a capable one, and domain-specific tasks to a fine-tuned specialist. Done well, this keeps quality comparable at a fraction of the cost.
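A minimal sketch of the routing idea, assuming three hypothetical model tiers and a hand-written task taxonomy. The tier names, task kinds, and the mapping between them are illustrative assumptions, not part of any specific product or library.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # All three identifiers are hypothetical placeholders.
    FAST = "fast-small-model"        # cheap, low-latency model
    FRONTIER = "frontier-model"      # high-capability API model
    SPECIALIST = "fine-tuned-model"  # domain-specific fine-tune


@dataclass(frozen=True)
class Task:
    kind: str    # e.g. "extraction", "reasoning", "domain"
    prompt: str


# Assumed mapping from task kind to tier; a real system would tune this
# against its own evaluation data.
ROUTES = {
    "extraction": Tier.FAST,
    "reasoning": Tier.FRONTIER,
    "domain": Tier.SPECIALIST,
}


def route(task: Task) -> Tier:
    """Pick a tier for a task, falling back to the frontier tier when unsure."""
    return ROUTES.get(task.kind, Tier.FRONTIER)
```

Falling back to the most capable tier for unknown task kinds trades cost for safety; a cost-sensitive system might choose the opposite default.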

Fine-Tuning

When a general model is almost good enough but not quite, fine-tuning can bridge the gap. A smaller model trained on your specific task can outperform a general model twice its size on that task, at a fraction of the cost.

Cost Structure

Understanding the model layer lets you design cost-predictable systems. Use fixed-cost self-hosted models for high-volume work and pay-per-token APIs for occasional complex tasks.
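The trade-off between fixed and per-token cost can be sketched as a break-even calculation. This is a simplified model under stated assumptions: a single blended token price, a flat monthly infrastructure cost, and no allowance for staffing or idle capacity. The function names are illustrative.

```python
def api_monthly_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Pay-per-token spend for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(fixed_monthly_cost: float, price_per_million: float) -> float:
    """Monthly token volume at which self-hosting matches API spend."""
    return fixed_monthly_cost / price_per_million * 1_000_000
```

With placeholder figures of 3,000 per month for infrastructure and 5 per million tokens, `break_even_tokens(3000, 5)` gives 600 million tokens per month. Below that volume the API is cheaper under this model, and above it self-hosting is.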

Vendor Independence

Building on open-source models and abstracted inference layers reduces lock-in to any single provider. If pricing changes or a better model appears, you can switch without rebuilding the application.
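One way to picture an abstracted inference layer is a small interface that application code depends on, with each provider hidden behind it. The sketch below is an assumption about how such a layer might look: `ChatModel`, `EchoBackend`, and `summarise` are hypothetical names, and the backend is a stand-in that makes no network call.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only surface application code is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend so the sketch runs offline; a real one would wrap
    a provider SDK or a local inference server."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def summarise(model: ChatModel, text: str) -> str:
    """Application logic written against the interface, not a vendor."""
    return model.complete(f"Summarise: {text}")
```

Swapping providers then means passing a different object to `summarise`, for example `summarise(EchoBackend("local"), "quarterly report")`, while the calling code stays unchanged.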

Security Postures

How this works across different deployment models and security requirements.

SaaS / API: Standard security

Access frontier models via API: the provider handles updates, and you run no model infrastructure. Best for general-purpose tasks where data sensitivity allows it.

Typical tools: Claude, GPT-4, Gemini, Command R+

Self-Hosted: High security

Deploy open-source models on your own infrastructure. Choose models optimised for your specific tasks and fine-tune on your data.

Typical tools: Llama, Mistral, Qwen, DeepSeek

Air-Gapped: Maximum security

Run models in fully isolated environments, with no telemetry and no external dependencies. Typically required for classified workloads.

Typical tools: Quantized open-source models, GGUF format
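Quantization matters in isolated environments because it shrinks the hardware a model needs. As a rough sketch, weight memory scales with parameter count times bits per weight. The function below is an illustrative estimate only: it covers weights alone and ignores the KV cache, activations, and format overhead, so real footprints are higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9
```

Under this estimate a 7-billion-parameter model needs about 14 GB at 16 bits per weight and about 3.5 GB at 4 bits per weight.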

Hybrid: Configurable security

Use frontier APIs for non-sensitive tasks, self-hosted models for proprietary data. Route based on classification rules.

Typical tools: a mix of commercial APIs and local open-source models
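Routing on classification rules can be as simple as a pattern check that runs before any request leaves your network. The sketch below assumes two hypothetical rules and two backend labels. Production systems would usually rely on a proper data-classification policy rather than a pair of regular expressions.

```python
import re

# Hypothetical rules marking a prompt as sensitive; both are illustrative.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # shaped like a US SSN
    re.compile(r"\bconfidential\b", re.IGNORECASE),  # explicit marking
]


def is_sensitive(text: str) -> bool:
    """Return True if any rule matches the text."""
    return any(pattern.search(text) for pattern in SENSITIVE_PATTERNS)


def choose_backend(text: str) -> str:
    """Keep sensitive prompts on self-hosted models; send the rest to an API."""
    return "self-hosted" if is_sensitive(text) else "frontier-api"
```

Rules like these miss anything they do not explicitly match, so a stricter design would send prompts to the self-hosted backend by default and allow API use only for content positively classified as non-sensitive.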

Need help choosing the right models?

I evaluate models against your specific requirements — not benchmarks — and design systems that can adapt as the landscape evolves.
