Model Registry
Platform models and bring-your-own-key (BYOK) custom model configuration.
The Model Registry manages the LLMs available to your organization. Pipelines ships a set of platform models, served through our OpenRouter integration with no keys to manage, and lets your org add custom models that use its own credentials.
Open Models in the left sidebar to see every model available to your org.
Platform models
| Display name | Model ID |
|---|---|
| GPT 5.5 | openai/gpt-5.5 |
| GPT 5.4 | openai/gpt-5.4 |
| GPT 5.4 Mini | openai/gpt-5.4-mini |
| Claude Opus 4.8 | anthropic/claude-opus-4.8 |
| Claude Sonnet 4.6 | anthropic/claude-sonnet-4.6 |
| Gemini 3.1 Pro | google/gemini-3.1-pro-preview |
| Gemini 3.5 Flash | google/gemini-3.5-flash |
| Gemini 3.1 Flash Image | google/gemini-3.1-flash-image-preview |
| Grok 4.3 | x-ai/grok-4.3 |
| Grok 4.1 Fast | x-ai/grok-4.1-fast |
The default model is Gemini 3.1 Pro. The catalog is updated as providers release new models.
Custom models (BYOK)
Custom models call a provider directly with your own credentials.
Supported providers
| Provider | What you enter |
|---|---|
| OpenAI | API key + provider model ID (e.g. gpt-4o) |
| Anthropic | API key + provider model ID (e.g. claude-sonnet-4-20250514) |
| Google | API key + provider model ID (e.g. gemini-2.5-flash) |
| OpenRouter | API key (sk-or-v1-...) + provider model ID in OpenRouter slug format (e.g. anthropic/claude-opus-4.7). Recommended over OpenAI-compatible for OpenRouter — uses LiteLLM's native OpenRouter integration, which correctly aggregates streaming tool calls. |
| Fireworks | API key + provider model ID (e.g. accounts/acme/models/llama-ft-v2) |
| Together AI | API key + provider model ID (e.g. meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo) |
| Bedrock | AWS region + either an IAM key pair (Access Key ID + Secret Access Key) or a Bedrock API key; provider model ID (e.g. anthropic.claude-3-5-sonnet-20240620-v1:0) |
| HuggingFace | API token + Inference Endpoint URL + provider model ID |
| OpenAI-compatible | Base URL (required) + optional API key + provider model ID. Use this for any OpenAI-style endpoint — self-hosted vLLM, Ollama, or a hosted OpenAI/Anthropic-compatible gateway. |
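The per-provider requirements in the table can be read as a small validation rule. The sketch below is illustrative only: the provider keys and field names (`api_key`, `base_url`, `bedrock_api_key`, and so on) are hypothetical, not the identifiers Pipelines uses, and only a subset of providers is shown.

```python
from typing import Dict, List

# Hypothetical field names; the real dialog's identifiers are not documented here.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "openai": ["api_key", "provider_model_id"],
    "anthropic": ["api_key", "provider_model_id"],
    "openrouter": ["api_key", "provider_model_id"],
    "huggingface": ["api_token", "endpoint_url", "provider_model_id"],
    # The API key is optional for OpenAI-compatible endpoints, so it is not listed.
    "openai_compatible": ["base_url", "provider_model_id"],
}


def _is_set(form: Dict[str, str], name: str) -> bool:
    """True when the field is present and not blank."""
    return bool(form.get(name, "").strip())


def missing_fields(provider: str, form: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank for this provider."""
    if provider == "bedrock":
        # Bedrock needs a region and model ID plus ONE of two credential styles.
        missing = [n for n in ("aws_region", "provider_model_id") if not _is_set(form, n)]
        has_key_pair = _is_set(form, "access_key_id") and _is_set(form, "secret_access_key")
        if not (has_key_pair or _is_set(form, "bedrock_api_key")):
            missing.append("access_key_id + secret_access_key, or bedrock_api_key")
        return missing
    return [name for name in REQUIRED_FIELDS[provider] if not _is_set(form, name)]
```

For example, `missing_fields("openai_compatible", {"provider_model_id": "llama"})` reports only `base_url`, because the API key is optional for that provider.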
Adding a custom model
The Add Custom Model dialog collects:
- Provider — the form adapts to the provider you pick.
- Credentials — the fields required by the selected provider (see above).
- Provider Model ID — the ID the upstream provider expects.
- Display Name — the Model Slug is auto-derived from this and is what workflow configurations reference.
- Max Tokens and the capability switches (see below), matching what the underlying model supports.
- Input / Output cost per token (optional) — used for cost tracking. If left blank, Test Connection may suggest values based on known pricing.
Test Connection makes a small live call with the entered credentials; saving is gated on it succeeding. If another model already uses the same credentials, the dialog offers Reuse saved credentials.
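As a rough illustration of how per-token costs feed cost tracking, the sketch below multiplies token counts by the configured rates. The function name and the choice to return `None` when a rate is missing are assumptions; the source does not describe how Pipelines computes or stores cost.

```python
from typing import Optional


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_token: Optional[float],
    output_cost_per_token: Optional[float],
) -> Optional[float]:
    """Estimate the cost of one call; None when either rate was left blank."""
    if input_cost_per_token is None or output_cost_per_token is None:
        return None
    return input_tokens * input_cost_per_token + output_tokens * output_cost_per_token
```

With 1,000 input tokens at 0.000003 per token and 500 output tokens at 0.000015 per token, the estimate is about 0.0105 in whatever currency the rates use.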
Custom model capabilities
Capability switches drive what the model is offered for in the Pipeline Builder and how Pipelines forwards requests. Set them based on what the upstream provider documents — incorrect flags can cause runtime errors or silently ignored parameters.
| Capability | Meaning |
|---|---|
| JSON mode | Provider natively supports structured JSON responses. |
| Vision | Model can process image inputs. |
| Tool use | Model supports function / tool calling. |
| Extended reasoning | Model exposes a thinking / reasoning step. When enabled, also pick the reasoning parameter shape: effort (the provider accepts reasoning.effort) or max tokens (the provider accepts reasoning.max_tokens). Choose effort only if the provider documents an effort-style control; otherwise use reasoning.max_tokens. |
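The two reasoning parameter shapes can be pictured as two alternative request fragments. This is a minimal sketch: the nesting under a `reasoning` key is inferred from the names reasoning.effort and reasoning.max_tokens, and the function name, the shape labels, and the default values are placeholders rather than Pipelines behavior.

```python
from typing import Any, Dict


def build_reasoning_param(
    shape: str, effort: str = "medium", max_tokens: int = 2000
) -> Dict[str, Any]:
    """Build the request fragment for the chosen reasoning parameter shape."""
    if shape == "effort":
        # For providers that document an effort-style control.
        return {"reasoning": {"effort": effort}}
    if shape == "max_tokens":
        # For providers that take a token budget for the reasoning step.
        return {"reasoning": {"max_tokens": max_tokens}}
    raise ValueError(f"unknown reasoning shape: {shape!r}")
```

Sending the wrong shape is the kind of mismatch the capability flags are meant to prevent, since a provider may reject or silently ignore a parameter it does not recognize.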
Choosing Max Tokens for tool-use workloads
max_tokens caps the entire model response, including any tool_calls blocks. When a model is wired to file tools (or any other multi-call tool set) and emits multiple parallel calls in one round — common with Anthropic Claude generating an HTML / CSS / JS site in three write_file calls — the cumulative tool-call payload competes for the same budget as ordinary text output.
If max_tokens is too low, the response is truncated mid-argument and the partial JSON is rejected by the tool with a misleading validation error (e.g. PATH_INVALID). Symptoms include the first tool call succeeding while later parallel calls in the same round fail.
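The failure mode described here can be checked for directly: a length stop combined with tool-call arguments that no longer parse as JSON. The sketch below assumes an OpenAI-style response shape, where each tool call carries `function.name` and a JSON string in `function.arguments`; the helper name is hypothetical and this is not the check Pipelines itself runs.

```python
import json
from typing import Any, Dict, List, Tuple


def truncated_tool_calls(
    finish_reason: str, tool_calls: List[Dict[str, Any]]
) -> List[Tuple[int, str]]:
    """Return (index, tool name) for calls whose arguments were cut off."""
    if finish_reason != "length":
        return []  # the response was not stopped by the token cap
    broken = []
    for index, call in enumerate(tool_calls):
        try:
            json.loads(call["function"]["arguments"])
        except json.JSONDecodeError:
            broken.append((index, call["function"]["name"]))
    return broken


calls = [
    {"function": {"name": "write_file",
                  "arguments": '{"path": "index.html", "content": "<html></html>"}'}},
    {"function": {"name": "write_file",
                  "arguments": '{"path": "style.css", "content": "body {'}},
]
print(truncated_tool_calls("length", calls))
```

In this example the first call parses and the second does not, which matches the symptom of an early call succeeding while a later parallel call fails.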
Recommended starting points:
- General chat: 4096 is fine.
- Tool-use + interactive prompts (e.g. file generation): at least 16000; bump to 32000 for site/multi-file generation.
- Reasoning-capable models: match the upstream provider's documented output ceiling (e.g. 32K for Claude Opus 4.7, 100K+ for GPT-5 reasoning).
When a truncation is detected, worker logs surface a TOOL-LOOP round N: model=… hit max_tokens (finish_reason=length) warning that names the affected tool calls. When this fires, increase max_tokens on the model record (Models → Edit).
Credential security
API credentials are encrypted at rest, never returned in API responses (the UI only sees a masked preview), and never written to logs.
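A masked preview typically shows just enough of a key to tell two credentials apart. The sketch below is one possible scheme; the number of visible characters and the output format are assumptions, since the source does not specify how Pipelines renders the preview.

```python
def mask_credential(secret: str, visible: int = 4) -> str:
    """Show only the first and last few characters of a credential."""
    if len(secret) <= visible * 2:
        # Too short to reveal anything safely; mask every character.
        return "*" * len(secret)
    return f"{secret[:visible]}...{secret[-visible:]}"
```

For instance, `mask_credential("sk-or-v1-abcdef123456")` keeps `sk-o` and `3456` and hides everything in between.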
Model selection in pipelines
In the Pipeline Builder, the model dropdown groups models under Platform and Custom. Each model's capability flags gate features:
- Tool-calling controls are disabled unless the selected model has Tool use on.
- Extended reasoning controls only appear for reasoning-capable models.
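The gating rules above amount to a mapping from capability flags to builder controls. The sketch below uses hypothetical names (`Capabilities`, `builder_controls`, and the returned keys) to make that mapping concrete; it is not the Pipeline Builder's actual data model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Capabilities:
    """Capability switches set on a model record."""
    json_mode: bool = False
    vision: bool = False
    tool_use: bool = False
    extended_reasoning: bool = False


def builder_controls(caps: Capabilities) -> Dict[str, bool]:
    """Derive which builder controls are available for the selected model."""
    return {
        # Tool-calling controls stay disabled unless Tool use is on.
        "tool_calling_enabled": caps.tool_use,
        # Reasoning controls only appear for reasoning-capable models.
        "reasoning_controls_visible": caps.extended_reasoning,
    }
```

A model saved with only Tool use switched on would get tool-calling controls but no reasoning controls.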
Managing custom models
Open a model from the registry for its detail page:
- Edit — update display name, provider model ID, credentials, endpoint URL, max tokens, capabilities, reasoning parameter, and cost overrides. Unlike the Add dialog, Save is not gated on Test Connection — run it yourself after changing credentials or the provider model ID. Leaving the credential field blank during a test reuses the stored credential.
- Delete — deactivates the model. If active workflows still reference it, you'll be warned with a count; those workflows fail to generate LLM responses until repointed.