NOMOS supports multiple LLM providers, allowing you to choose the best model for your use case.

Supported Providers

OpenAI

GPT-4o, GPT-4o-mini, and more

Anthropic

Claude 3.5 Sonnet, Haiku, and Opus

Google Gemini

Gemini 2.0 Flash, Pro, and more

Mistral AI

Mistral Large, Medium, and Small

Ollama

Local models including Llama, Qwen, and more

HuggingFace

Open source models via HuggingFace

Cohere

Command R+, Command R, and more

Custom

Use the BaseLLM class to implement your own provider

More Coming Soon

NOMOS is continuously expanding support for new LLM providers

OpenAI

from nomos.llms import OpenAI

llm = OpenAI(model="gpt-4o-mini")

Anthropic

from nomos.llms import Anthropic

llm = Anthropic(model="claude-3-5-sonnet-20241022")

Google Gemini

from nomos.llms import Gemini

llm = Gemini(model="gemini-2.0-flash-exp")

Mistral AI

from nomos.llms import Mistral

llm = Mistral(model="ministral-8b-latest")

Ollama (Local Models)

from nomos.llms import Ollama

llm = Ollama(model="llama3.3")

HuggingFace

from nomos.llms import HuggingFace

llm = HuggingFace(model="meta-llama/Meta-Llama-3-8B-Instruct")

Cohere

from nomos.llms import Cohere

llm = Cohere(model="command-a-03-2025")
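Custom

Any backend can be plugged in by subclassing BaseLLM. The exact abstract interface depends on your NOMOS version, so the sketch below uses a stand-in base class and an assumed `generate`-style method; treat the method name and message shape as illustrative, and check the BaseLLM source for the real contract.

```python
class BaseLLM:  # stand-in for nomos.llms.BaseLLM; the real interface may differ
    def generate(self, messages: list[dict]) -> str:
        raise NotImplementedError


class EchoLLM(BaseLLM):
    """Toy custom provider that echoes the last user message."""

    def generate(self, messages: list[dict]) -> str:
        last = messages[-1]["content"]
        return f"echo: {last}"


llm = EchoLLM()
print(llm.generate([{"role": "user", "content": "hello"}]))  # echo: hello
```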

YAML Configuration

You can specify LLM configuration in your YAML config file:
llm:
  provider: openai
  model: gpt-4o-mini

Advanced Configuration

Custom Parameters

You can pass additional parameters to LLM providers:
llm = OpenAI(
    model="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9
)

YAML Advanced Configuration

llm:
  provider: openai
  model: gpt-4o-mini
  temperature: 0.7
  max_tokens: 1000
  top_p: 0.9

Multiple LLMs

Multiple LLMs can be defined so that different models handle different tasks. For example, you might configure one model for coding tasks and another for general conversation, matching each model's strengths and cost to the job:
llm:
  global:
    provider: anthropic
    model: claude-3-5-sonnet-20241022
  coding:
    provider: anthropic
    model: claude-opus-4-20250514

Troubleshooting

API key errors: ensure environment variables are set correctly in your shell profile or .env.local file
Model not found: check that the model name is correct and available in your region
Rate limiting: implement retry logic or switch to a model with higher rate limits
Ollama connection issues: ensure Ollama is running (ollama serve) and the model is pulled (ollama pull model-name)
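For the API-key case, a quick pre-flight check can save a confusing stack trace. The variable names below are the conventional ones for each vendor, not something NOMOS defines; confirm them against each provider's documentation (Cohere in particular has used both CO_API_KEY and COHERE_API_KEY).

```python
import os

# Conventional API-key environment variables; verify against each provider's docs.
REQUIRED_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "cohere": "CO_API_KEY",
}


def missing_keys(providers):
    """Return the env-var names that are unset for the given providers."""
    return [REQUIRED_KEYS[p] for p in providers if not os.environ.get(REQUIRED_KEYS[p])]


# Run this before constructing your LLM to fail fast with a clear message.
print(missing_keys(["openai", "anthropic"]))
```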

Error Handling

NOMOS includes built-in error handling and retry mechanisms:
name: my-agent
llm:
  provider: openai
  model: gpt-4o-mini
max_errors: 3  # Retry up to 3 times on LLM errors
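Conceptually, max_errors caps how many times a failed LLM call is reattempted before the error is surfaced. The generic wrapper below is a sketch of that behavior, not NOMOS's actual implementation:

```python
def call_with_retries(fn, max_errors=3):
    """Call fn(), retrying until it succeeds or max_errors attempts fail."""
    last_exc = None
    for _attempt in range(max_errors):
        try:
            return fn()
        except Exception as exc:  # in practice, catch provider-specific errors
            last_exc = exc
    raise last_exc


# Example: a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient LLM error")
    return "ok"


print(call_with_retries(flaky, max_errors=3))  # ok
```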

Performance Tips

Choose the Right Model

Use smaller models for simple tasks to reduce latency and costs

Configure Temperature

Lower values (0.1-0.3) for consistent responses

Set Max Tokens

Limit response length to control costs and latency

Use Local Models

Ollama for development or when data privacy is important

Model Documentation

For the most up-to-date list of available models, refer to the official documentation:

Anthropic Claude Models

Official Claude models documentation

OpenAI Models

Complete OpenAI models reference

Google Gemini Models

Vertex AI Generative AI models

Mistral Models

Mistral AI models overview

Ollama Model Library

Browse available local models

HuggingFace Models

Explore HuggingFace model hub

Cohere Models

Cohere models documentation