Models

The model is the reasoning engine behind every agent. Choosing the right model for each task — and knowing when to switch — is one of the highest-leverage decisions in an AI system.

The Model Landscape

Three categories, each with distinct trade-offs.

Frontier Models

The most capable commercial models: Claude, GPT-4, Gemini. Typically the strongest reasoning and the longest context windows, at the highest price. Accessed via API, so your data leaves your infrastructure.

  • Highest capability
  • Upgraded by the provider, no maintenance on your side
  • Pay-per-token pricing

Open-Source Models

Llama, Mistral, Qwen, DeepSeek: openly available weights (license terms vary by model), self-hostable, increasingly competitive. Full control over deployment, and no per-token fees once the infrastructure is in place, though you still pay to run it.

  • Full data control
  • No vendor lock-in
  • Requires GPU infrastructure

Fine-Tuned Models

Base models trained further on your specific data and tasks. The result is higher accuracy on domain-specific work, often from a smaller model that punches above its weight.

  • Domain-specific accuracy
  • Smaller, faster, cheaper
  • Requires training data & expertise

Value Pathways

Strategic value from understanding the model layer.

Task-Specific Routing

Not every task needs a frontier model. Route simple extraction to a fast model, complex reasoning to a capable one, and domain-specific tasks to a fine-tuned specialist. Done well, this gives comparable quality at a fraction of the cost.
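
A minimal sketch of what such routing can look like, assuming tasks arrive already labeled with a kind. The model names and the `pick_model` helper are hypothetical placeholders, not real endpoints; anything unrecognized falls back to the most capable tier.

```python
# Hypothetical model identifiers; substitute whatever your stack actually serves.
ROUTES = {
    "extraction": "small-fast-model",      # simple extraction -> fast, cheap model
    "reasoning": "frontier-model",         # complex reasoning -> most capable model
    "domain": "finetuned-specialist",      # domain-specific work -> fine-tuned model
}

# Assumption: when the task kind is unknown, prefer capability over cost.
DEFAULT_MODEL = "frontier-model"


def pick_model(task_kind: str) -> str:
    """Return the model assigned to a task kind, or the default for unknown kinds."""
    return ROUTES.get(task_kind, DEFAULT_MODEL)


if __name__ == "__main__":
    for kind in ("extraction", "reasoning", "domain", "something-else"):
        print(kind, "->", pick_model(kind))
```

In practice the hard part is classifying the task, not the lookup. A static table like this is a reasonable starting point before adding a learned or rule-based classifier in front of it.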

Fine-Tuning

When a general model is almost good enough but not quite, fine-tuning can bridge the gap. Train a smaller model on your specific task and it can match or outperform a much larger general model on that task, at a fraction of the inference cost.

Cost Structure

Understanding the model layer lets you design systems with predictable costs: fixed-cost self-hosted models for high-volume work, pay-per-token APIs for occasional complex tasks.
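
The trade-off comes down to a break-even volume. The sketch below assumes a flat monthly cost for self-hosting and a single blended per-token API price. Both figures in the usage example are made-up placeholders, not real quotes, and the function names are illustrative.

```python
def api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Monthly API spend for a given token volume."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(fixed_monthly_cost: float, price_per_million: float) -> float:
    """Token volume at which API spend equals the fixed self-hosting cost."""
    return fixed_monthly_cost / price_per_million * 1_000_000


def cheaper_option(tokens_per_month: float,
                   fixed_monthly_cost: float,
                   price_per_million: float) -> str:
    """Pick the cheaper deployment for a given monthly volume."""
    if api_cost(tokens_per_month, price_per_million) > fixed_monthly_cost:
        return "self-hosted"
    return "api"


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    fixed, price = 2000.0, 10.0
    print(break_even_tokens(fixed, price))            # break-even volume in tokens
    print(cheaper_option(50_000_000, fixed, price))   # low volume
    print(cheaper_option(500_000_000, fixed, price))  # high volume
```

A real comparison would also account for engineering time, GPU utilization, and quality differences between the models. This only captures the raw cost crossover.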

Vendor Independence

Building on open-source models and abstracted inference layers means you are not locked into a single provider. If pricing changes or a better model appears, you can switch without rebuilding the system around it.
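
One way to picture an abstracted inference layer: application code depends on a small interface, and each provider sits behind an adapter. The sketch below is illustrative only. `ChatModel`, `HostedStub`, and `LocalStub` are hypothetical names, and the stubs return canned strings instead of calling any real API.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only surface the application is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class HostedStub:
    """Stand-in for an adapter around a hosted API (no network call here)."""

    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt}"


class LocalStub:
    """Stand-in for an adapter around a self-hosted model."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


def summarize(model: ChatModel, text: str) -> str:
    """Application logic: knows the interface, not the provider."""
    return model.complete(f"Summarize: {text}")


if __name__ == "__main__":
    # Swapping providers is a one-line change at the call site.
    print(summarize(HostedStub(), "quarterly report"))
    print(summarize(LocalStub(), "quarterly report"))
```

The interface is deliberately narrow. Provider-specific features such as tool calling or structured output tend to leak through abstractions like this, so deciding which ones to expose is part of the design work.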

Need help choosing the right models?

I evaluate models against your specific requirements — not benchmarks — and design systems that can adapt as the landscape evolves.