Inference How-Tos

Generated on 3 Jul 2026

Inference provides a single control plane for managing inference workflows. Its Model Catalog lets you view available foundation models, both DigitalOcean-hosted and third-party commercial, and compare their capabilities and pricing. You can also use routing to match inference requests to the best-fit model and run inference using serverless or dedicated deployments.

Serverless Inference

Serverless Inference Overview

What serverless inference is and how it differs from dedicated inference.

How to Manage Serverless Inference Prepayment

Add a prepaid account balance for Serverless Inference and enable auto-reload to automatically replenish your balance when it runs low.

Serverless Inference API Endpoints

Synchronous and asynchronous API endpoints for serverless inference.

Use Serverless Inference

Send API requests directly to foundation models without creating an AI agent or managing infrastructure.

How to Retrieve Available Models

How to retrieve models available for serverless inference.

How to Send Prompts to a Model Using the Chat Completions API

Send prompts and use reasoning with the Chat Completions API.

How to Send Prompts to a Model Using the Responses API

Send prompts with the Responses API.

How to Use Prompt Caching in Chat Completions and Responses API

Use prompt caching with the Chat Completions and Responses APIs.

How to Use Reasoning with the Chat Completions and Responses API

Use reasoning with the Chat Completions and Responses APIs.

How to Generate Images from Text Prompts

Generate or edit images from text prompts.

How to Use Multimodal Inference

Process and generate content across multiple data types, including images, audio, video, and text, using multimodal models.

How to Use fal Models to Generate Image, Audio, or Text-to-Speech

Generate image, audio, or text-to-speech using fal models.

How to Convert Text Into Dense Vector Representations

Convert text into dense vector representations for use in semantic search, retrieval-augmented generation (RAG), clustering, classification, and similarity matching.

How to View Serverless Inference Metrics

View metrics such as latency, throughput, error rates, token consumption, cost attribution, and rate limiting.

How to Use Serverless Inference After Updating to Another Model

How to use serverless inference after updating a model.

Manage Model Catalog

How to Browse Models in Model Catalog

Identify the right model for your use case by filtering available foundation models by capabilities and price.

How to Import Your Own Models (BYOM)

Import your own models (BYOM) into Model Catalog from Hugging Face or Spaces buckets and folders.

Test and Compare Models Using the Model Playground

Test and compare foundation models in the Model Playground.

Use Dedicated Inference

How to Use Dedicated Inference

Deploy open-source and commercial LLMs on dedicated GPUs as an inference endpoint.

Use Batch Inference

How to Use Batch Inference

Batch inference runs text jobs asynchronously through batch APIs compatible with OpenAI and Anthropic using your serverless inference model access key.

Use Inference Router

How to Use Inference Router

Create and configure an Inference Router to route inference requests to foundation models.

Agentic Workflows

Use Messages API

Use the Messages API with Claude Code and similar agentic workflows.

Evaluate Models

How to Evaluate Models

Determine which model best fits your specific use case.

Use Server-Side Tools

How to Use Server-Side Tools

Extend model capabilities with server-side tools such as web search, knowledge base retrieval, MCP, and Anthropic and OpenAI passthrough tools, available for serverless inference, dedicated inference, and inference routers.

Use Agent Platform

Use Agent Platform

Build fully managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more.

How to Create Agents on DigitalOcean Inference

Create an agent with domain-specific knowledge to provide information or take action.

How to Build, Test, and Deploy Agents Using Agent Development Kit

Use Agent Development Kit to create and manage agents.

How to Add, Edit, or Delete Partner Provider Keys on DigitalOcean Inference

Add, edit, or delete partner provider API keys so you can use the providers' models with your agents.

How to Create and Manage Workspaces on DigitalOcean Inference

Create workspaces to group agents together and move agents between workspaces as needed.

How to Use Agents in Your Applications on DigitalOcean Inference

Use your agent in an application or through a chatbot interface.

How to Test Agents Using the Agent Playground on DigitalOcean Inference

Test the full agent experience with the directions and features you’ve configured using the Agent Playground.

How to Evaluate Agent Performance on DigitalOcean Inference

Create test cases and measure how well your agents perform across criteria like factual accuracy, instruction following, and context relevance.

How to Create an Evaluation Dataset on DigitalOcean Inference

Create evaluation datasets to test agent or model performance, improve accuracy, and measure qualities like factual correctness, safety, and instruction following.

How to View Agent Metrics and Logs on DigitalOcean Inference

View metrics and runtime logs for your agents to troubleshoot issues.

How to Route to Multiple Agents on DigitalOcean Inference

Integrate multiple generative AI agents.

How to Route Functions in Agents on DigitalOcean Inference

Enable the foundation model in your agent to access external data sources using functions.

How to Rollback to a Previous Version of Agents on DigitalOcean Inference

Roll back to a previous version of an agent to undo changes made to it.

Create and Manage Knowledge Bases

Create, edit, verify, and permanently destroy knowledge bases, and manage their data sources.

Attach and Detach Agent Knowledge Bases

Attach or detach a knowledge base from your agents.

Manage Agent Guardrails

Create, manage, edit, duplicate, or delete guardrails to control how your agents respond to sensitive or inappropriate content.

How to Destroy Agents Using the Control Panel on DigitalOcean Inference

Destroying an agent permanently and irreversibly deletes it and removes all of its endpoints.

Test DigitalOcean Knowledge Base Retrieval Using RAG Playground

Test how foundation models answer questions using content retrieved from a knowledge base.

Manage Model Access Keys

How to Create and Manage Model Access Keys

Create, scope, and manage model access keys for foundation models, inference routers, and batch inference, with VPC restrictions and team-owner visibility.

Use Coding Agents

How to Use Coding Agents With DigitalOcean

Configure Codex CLI, Claude Code, Cline, OpenCode, Cursor, and OpenClaw to use inference with your model access key.
