Use Serverless Inference

Last verified 29 Jun 2026

Inference provides a single control plane for managing inference workflows. It includes a Model Catalog where you can view available foundation models, both DigitalOcean-hosted and third-party commercial models, and compare their capabilities and pricing. You can also use routing to match inference requests to the best-fit model, and run inference using serverless or dedicated deployments.

Get Started

Serverless Inference Overview

What serverless inference is and how it differs from dedicated inference.

How to Manage Serverless Inference Prepayment

Add a prepaid account balance for Serverless Inference and enable auto-reload to automatically replenish your balance when it runs low.

Serverless Inference API Endpoints

Synchronous and asynchronous API endpoints for serverless inference.

How to Create and Manage Model Access Keys

Create, scope, and manage model access keys for foundation models, inference routers, and batch inference, with VPC restrictions and team-owner visibility.

How to Retrieve Available Models

How to retrieve models available for serverless inference.

Generate Chat Completions

How to Send Prompts to a Model Using the Chat Completions API

Send prompts and use reasoning with the Chat Completions API.

How to Send Prompts to a Model Using the Responses API

Send prompts with the Responses API.

How to Use Prompt Caching in Chat Completions and Responses API

Use prompt caching with the Chat Completions and Responses APIs.

How to Use Reasoning with the Chat Completions and Responses API

Use reasoning with the Chat Completions and Responses APIs.

Generate Images, Audio, Videos, and Text-to-Speech

How to Generate Images from Text Prompts

Generate or edit images from text prompts.

How to Use Multimodal Inference

Process and generate content across multiple data types, including images, audio, video, and text, using multimodal models.

How to Use fal Models to Generate Image, Audio, or Text-to-Speech

Generate images, audio, or text-to-speech output using fal models.

Update Model

How to Use Serverless Inference After Updating to Another Model

How to use serverless inference after updating a model.
