Assistant API

Creates a model response for an existing assistant, identified by its ID, or for an assistant configuration supplied inline with the request.

To share an assistant with an API key, follow this guide: https://docs.langdock.com/api-endpoints/assistant/assistant-api-guide

Endpoint: POST https://api.langdock.com/assistant/v1/chat/completions

cURL example

curl --request POST \
  --url https://api.langdock.com/assistant/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '
{
  "assistantId": "asst_123",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how can you help me?"
    }
  ],
  "stream": false,
  "maxSteps": 10
}
'


Request Parameters

Parameter    Type     Required                               Description
-----------  -------  -------------------------------------  -----------------------------------------------------
assistantId  string   One of assistantId/assistant required  ID of an existing assistant to use
assistant    object   One of assistantId/assistant required  Configuration for a new assistant
messages     array    Yes                                    Array of message objects with role and content
stream       boolean  No                                     Enable streaming responses (default: false)
output       object   No                                     Structured output format specification
maxSteps     integer  No (default: 10)                       Maximum number of steps the assistant can take (1–20)

Message Format

Each message in the messages array should contain:

  • role (required) — One of: user, assistant, or tool

  • content (required) — The message content as a string

  • attachmentIds (optional) — Array of UUID strings identifying attachments for this message
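
The message fields above can be assembled with a small helper. This is a minimal Python sketch: `make_message` and `ALLOWED_ROLES` are hypothetical names, not part of the API, and the UUID check only mirrors the stated format of `attachmentIds`.

```python
import uuid

# Roles listed in the Message Format section.
ALLOWED_ROLES = {"user", "assistant", "tool"}


def make_message(role, content, attachment_ids=None):
    """Build one entry for the `messages` array (hypothetical helper)."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}")
    if not isinstance(content, str):
        raise TypeError("content must be a string")
    message = {"role": role, "content": content}
    if attachment_ids:
        # uuid.UUID raises ValueError for anything that is not a UUID string.
        message["attachmentIds"] = [str(uuid.UUID(a)) for a in attachment_ids]
    return message


messages = [make_message("user", "Hello, how can you help me?")]
```

Validating on the client is optional; it mainly surfaces malformed roles or IDs before a request is sent.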

Assistant Configuration

To use a temporary assistant, pass assistant instead of assistantId. The configuration accepts:

  • name (required) — Name of the assistant (max 64 chars)

  • instructions (required) — System instructions (max 16384 chars)

  • description — Optional description (max 256 chars)

  • temperature — Sampling temperature between 0 and 1

  • model — Model ID to use (see Available Models: https://docs.langdock.com/api-endpoints/assistant/assistant-models)

  • capabilities — Enable features like web search, data analysis, image generation

  • actions — Custom API integrations

  • vectorDb — Vector database connections

  • knowledgeFolderIds — IDs of knowledge folders to use

  • attachmentIds — Array of UUID strings identifying attachments to use

You can retrieve a list of available models using the Models API: https://docs.langdock.com/api-endpoints/assistant/assistant-models
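
The documented limits on the configuration fields can be checked before sending. The sketch below assumes a hypothetical helper, `make_assistant_config`, and covers only the scalar fields; `capabilities`, `actions`, `vectorDb`, `knowledgeFolderIds`, and `attachmentIds` are left out because their shapes are not described here.

```python
def make_assistant_config(name, instructions, description=None,
                          temperature=None, model=None):
    """Build the `assistant` object, enforcing the documented limits (hypothetical helper)."""
    if not name or len(name) > 64:
        raise ValueError("name is required (max 64 chars)")
    if not instructions or len(instructions) > 16384:
        raise ValueError("instructions are required (max 16384 chars)")
    if description is not None and len(description) > 256:
        raise ValueError("description allows at most 256 chars")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    config = {"name": name, "instructions": instructions}
    optional = {"description": description, "temperature": temperature, "model": model}
    # Only include optional fields that were actually provided.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config


config = make_assistant_config("Helper", "Be brief.", temperature=0.2)
```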


Structured Output

You can specify a structured output format using the optional output parameter.

Field   Type                          Description
------  ----------------------------  --------------------------------------------------------------
type    "object" | "array" | "enum"   The type of structured output
schema  object                        JSON Schema definition for the output (for object/array types)
enum    string[]                      Array of allowed values (for enum type)

Behavior:

  • type: "object" with no schema: Forces the response to be a single JSON object (no specific structure)

  • type: "object" with schema: Forces the response to match the provided JSON Schema

  • type: "array" with schema: Forces the response to be an array of objects matching the provided schema

  • type: "enum": Forces the response to be one of the values in enum

Tools such as easy-json-schema can help generate JSON Schema: https://easy-json-schema.github.io/
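
The rules above can be captured in a small builder for the output parameter. This is a sketch under stated assumptions: `make_output_spec` is a hypothetical name, and it enforces only the one constraint spelled out in this document (an enum list when type is "enum").

```python
def make_output_spec(kind, schema=None, enum=None):
    """Build the `output` object for a request (hypothetical helper)."""
    if kind not in ("object", "array", "enum"):
        raise ValueError("type must be 'object', 'array', or 'enum'")
    if kind == "enum":
        if not enum:
            raise ValueError("type 'enum' requires a list of allowed values")
        return {"type": "enum", "enum": list(enum)}
    spec = {"type": kind}
    if schema is not None:
        # JSON Schema applies to the object/array types only.
        spec["schema"] = schema
    return spec


free_form = make_output_spec("object")
choice = make_output_spec("enum", enum=["yes", "no"])
```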


Streaming Responses

When stream is true, the API returns server-sent events (SSE) allowing progressive display of generated responses.
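
Server-sent events use a line-based framing: `data:` lines carry the payload, a blank line ends an event, and lines starting with `:` are comments. The Python sketch below handles only that generic framing. The content of each event for this endpoint is not specified here, so the sample lines are invented and `iter_sse_data` is a hypothetical helper.

```python
def iter_sse_data(lines):
    """Yield the data payload of each server-sent event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: also emit an event the stream did not terminate.
        yield "\n".join(buffer)


sample = ['data: {"a": 1}\n', "\n", ": keep-alive\n", "data: part1\n", "data: part2\n", "\n"]
events = list(iter_sse_data(sample))
```

In practice the lines would come from the decoded HTTP response body, read incrementally so that each event can be displayed as it arrives.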



Obtaining Attachment IDs

To use attachments in conversations, first upload files using the Upload Attachment API: https://docs.langdock.com/api-endpoints/assistant/upload-attachments. The upload returns attachmentId values which you can include in attachmentIds.


Examples

Using an existing assistant
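
A minimal Python sketch using only the standard library. It prepares the request without sending it; `build_request` is a hypothetical helper, and `asst_123` and `YOUR_API_KEY` are placeholders.

```python
import json
import urllib.request

API_URL = "https://api.langdock.com/assistant/v1/chat/completions"


def build_request(api_key, body):
    """Prepare (but do not send) a POST request for the Assistant API."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "assistantId": "asst_123",  # placeholder ID, as in the cURL example
    "messages": [{"role": "user", "content": "Hello, how can you help me?"}],
}
request = build_request("YOUR_API_KEY", body)

# To send it (needs a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```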

Using a temporary assistant configuration
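
A sketch of a request body that defines the assistant inline instead of referencing an ID. The name, instructions, and question are invented for illustration, and `model` is omitted because valid model IDs are listed elsewhere.

```python
import json

body = {
    # `assistant` replaces `assistantId` for a temporary assistant.
    "assistant": {
        "name": "Docs Helper",  # hypothetical, max 64 chars
        "instructions": "Answer briefly and name the relevant documentation section.",
        "temperature": 0.3,  # must stay between 0 and 1
    },
    "messages": [{"role": "user", "content": "How do I rotate an API key?"}],
    "maxSteps": 5,  # allowed range is 1 to 20
}
payload = json.dumps(body)
```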

Using Structured Output with Schema (array)
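
A sketch of a request body asking for an array result. It assumes that `schema` describes a single element of the array, which is one reading of "an array of objects matching the provided schema". The field names `city` and `country` are invented.

```python
body = {
    "assistantId": "asst_123",  # placeholder ID
    "messages": [
        {"role": "user", "content": "List three European capitals with their countries."}
    ],
    "output": {
        "type": "array",
        # Assumption: this schema applies to each array element.
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["city", "country"],
        },
    },
}
```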

Using Structured Output with Object
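
A sketch of a request body asking for a single JSON object. The prompt and the `name` and `email` fields are invented for illustration.

```python
body = {
    "assistantId": "asst_123",  # placeholder ID
    "messages": [
        {"role": "user", "content": "Extract the name and email from: Jane Doe <jane@example.com>"}
    ],
    "output": {
        "type": "object",
        # Dropping "schema" would request a free-form JSON object instead.
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name", "email"],
        },
    },
}
```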

Using Structured Output with Enum
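
A sketch of a request body that restricts the answer to one of a fixed set of strings. The sentiment labels and the prompt are invented for illustration.

```python
body = {
    "assistantId": "asst_123",  # placeholder ID
    "messages": [
        {"role": "user", "content": "Classify the sentiment: 'The update fixed every bug I reported.'"}
    ],
    "output": {
        "type": "enum",
        # The response is forced to be exactly one of these values.
        "enum": ["positive", "neutral", "negative"],
    },
}
```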


Rate limits

The rate limit for the Assistant Completion endpoint is:

  • 500 RPM (requests per minute)

  • 60,000 TPM (tokens per minute)

Rate limits are defined at the workspace level (not at an API key level). Each model has its own rate limit. If you exceed your rate limit, you will receive a 429 Too Many Requests response.
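
One common way to cope with 429 responses is to retry with exponential backoff. The sketch below is a generic client-side strategy, not something prescribed by the API. `call_with_backoff` and the `send` callable, which returns a `(status_code, body)` tuple, are hypothetical, and the delay values are arbitrary.

```python
import time


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or retries run out."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            # Wait 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return status, body


# Simulated run: two rate-limited responses, then success.
responses = iter([(429, None), (429, None), (200, {"result": []})])
delays = []
status, body = call_with_backoff(lambda: next(responses), sleep=delays.append)
```

Because limits apply per workspace, concurrent clients sharing a workspace may benefit from adding random jitter to the delays.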

For higher rate limits, contact Langdock support.


Response Format

The API returns an object containing:

  • result: Array containing the full conversation and any tool calls (always present).

  • output: Present when output parameter was specified in the request. Contains formatted structured data (object, array, or enum string).

Example (weather structured output)
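
A minimal Python sketch of reading such a response, assuming a request whose output schema described a weather report. The sample response is invented: the field names `city`, `temperature_c`, and `condition` and their values are placeholders, and the items inside `result` are left empty because their shape is not shown here.

```python
# Invented sample response for illustration only.
response = {
    "result": [],  # conversation messages and tool calls
    "output": {"city": "Berlin", "temperature_c": 18, "condition": "cloudy"},
}


def get_structured_output(response):
    """Return the structured `output`, or None if the request did not ask for one."""
    if "result" not in response:
        raise ValueError("unexpected response: missing 'result'")
    return response.get("output")


weather = get_structured_output(response)
```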

The output field is populated automatically and can be consumed directly by your application.


Error Handling

Example (axios):


Headers

Authorization (required)

  • Type: string

  • Format: "Bearer YOUR_API_KEY"

Content-Type: application/json


Body (application/json)

Option 1: Provide assistantId (use existing assistant)

  • assistantId (string, required when using this option)

  • messages (array, required)

Option 2: Provide assistant object (temporary assistant configuration)

  • assistant (object, required when using this option)

  • messages (array, required)

Common body fields

  • messages (array of objects) — each item:

    • role (enum: user, assistant, tool) — required

    • content (string) — required

    • attachmentIds (string[uuid]) — optional

  • stream (boolean, default: false) — when true returns SSE

  • output (object) — structured output spec:

    • type (enum: object | array | enum)

    • schema (object) — JSON Schema for object/array types

    • enum (string[]) — required when type: "enum"
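
The body rules can be checked on the client before a request is sent. In this sketch, `validate_body` is a hypothetical helper, and treating `assistantId` and `assistant` as mutually exclusive is an interpretation of "one of assistantId/assistant required".

```python
def validate_body(body):
    """Return a list of problems found in a request body (empty if none)."""
    errors = []
    # Assumption: exactly one of the two options may be present.
    if ("assistantId" in body) == ("assistant" in body):
        errors.append("provide exactly one of assistantId or assistant")
    if not isinstance(body.get("messages"), list):
        errors.append("messages must be an array")
    steps = body.get("maxSteps", 10)  # documented default is 10
    if not isinstance(steps, int) or not 1 <= steps <= 20:
        errors.append("maxSteps must be an integer from 1 to 20")
    output = body.get("output")
    if output and output.get("type") == "enum" and not output.get("enum"):
        errors.append("output.enum is required when output.type is 'enum'")
    return errors


ok = validate_body({
    "assistantId": "asst_123",
    "messages": [{"role": "user", "content": "hi"}],
})
```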


Response Codes

  • 200 — Successful chat completion (application/json)

  • 400 — Invalid parameters

  • 429 — Rate limit exceeded

  • 500 — Server error
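
The status codes above can be mapped to a single exception type on the client. In this sketch, `AssistantApiError` and `check_status` are hypothetical names, and marking 429 and 500 as retryable is a judgment call rather than documented behavior.

```python
# Meanings taken from the Response Codes list.
STATUS_MEANINGS = {
    400: "Invalid parameters",
    429: "Rate limit exceeded",
    500: "Server error",
}


class AssistantApiError(Exception):
    """Raised for any non-200 response (hypothetical client-side type)."""

    def __init__(self, status):
        self.status = status
        self.meaning = STATUS_MEANINGS.get(status, "Unexpected status")
        # Assumption: a 400 needs a corrected request, so only 429/500 are retried.
        self.retryable = status in (429, 500)
        super().__init__(f"{status}: {self.meaning}")


def check_status(status):
    """Raise AssistantApiError unless the status is 200."""
    if status != 200:
        raise AssistantApiError(status)


check_status(200)  # passes silently
```

With `urllib.request`, the status of a failed call is available as the `code` attribute of `urllib.error.HTTPError`, which can be passed to `check_status`.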


Related

  • Sharing Assistants with API Keys: https://docs.langdock.com/api-endpoints/assistant/assistant-api-guide

  • Models for Assistant API: https://docs.langdock.com/api-endpoints/assistant/assistant-models
