Use Serverless Inference

Last verified 29 Jun 2026

Inference provides a single control plane for managing inference workflows. In its Model Catalog, you can view the available foundation models, both DigitalOcean-hosted and third-party commercial, and compare their capabilities and pricing. You can also use routing to match inference requests to the best-fit model, and run inference on serverless or dedicated deployments.

Get Started

Serverless Inference Overview

What serverless inference is and how it differs from dedicated inference.

How to Manage Serverless Inference Prepayment

Add a prepaid account balance for Serverless Inference and enable auto-reload to automatically replenish your balance when it runs low.

Serverless Inference API Endpoints

Synchronous and asynchronous API endpoints for serverless inference.

How to Create and Manage Model Access Keys

Create, scope, and manage model access keys for foundation models, inference routers, and batch inference, with VPC restrictions and team-owner visibility.

How to Retrieve Available Models

Retrieve the list of models available for serverless inference.

Generate Chat Completions

How to Send Prompts to a Model Using the Chat Completions API

Send prompts and use reasoning with the Chat Completions API.

How to Send Prompts to a Model Using the Responses API

Send prompts with the Responses API.

How to Use Prompt Caching in Chat Completions and Responses API

Use prompt caching with the Chat Completions and Responses APIs.

How to Use Reasoning with the Chat Completions and Responses API

Use reasoning with the Chat Completions and Responses APIs.

Generate Images, Audio, Videos, and Text-to-Speech

How to Generate Images from Text Prompts

Generate or edit images from text prompts.

How to Use Multimodal Inference

Use multimodal models to process and generate content across multiple data types, including images, audio, video, and text.

How to Use fal Models to Generate Image, Audio, or Text-to-Speech

Generate images, audio, or text-to-speech output using fal models.

Update Model

How to Use Serverless Inference After Updating to Another Model

Use serverless inference after updating to another model.
