OpenAI Chat Completion

Creates a model response for the given chat conversation. This endpoint follows the OpenAI Chat Completion specification and forwards requests to the Azure OpenAI endpoint.

Endpoint: POST https://api.langdock.com/openai/{region}/v1/chat/completions

In dedicated deployments, api.langdock.com maps to /api/public

Authentication

  • Header: Authorization

  • Value: Bearer YOUR_API_KEY

Supported models

Currently supported models include: gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o4-mini, o3, o3-mini, o1, o1-mini, o1-preview, gpt-4o, gpt-4o-mini

Note: If you use your own API keys in Langdock (BYOK), available models may differ — contact your admin.

Limits and unsupported parameters

  • Not supported: n, service_tier, parallel_tool_calls, stream_options

  • Each model has its own rate limit (workspace-level)

  • Default rate limit for this Chat Completion endpoint: 500 RPM (requests per minute) and 60,000 TPM (tokens per minute)

  • Exceeding limits returns 429 Too Many Requests

  • For higher limits contact [email protected]
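
When a request is rejected with 429, a common client-side pattern is to back off and retry. Below is a minimal sketch in Python; the requests library is this example's choice, and honoring a Retry-After header is an assumption (the endpoint is not documented to send one):

import time
import requests

URL = "https://api.langdock.com/openai/eu/v1/chat/completions"  # region: eu or us
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def post_with_backoff(payload, max_retries=5):
    """POST the payload, backing off exponentially on HTTP 429."""
    for attempt in range(max_retries):
        resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()  # surface errors other than rate limiting
            return resp.json()
        # Use Retry-After if the server sends it (an assumption here),
        # otherwise fall back to exponential backoff: 1s, 2s, 4s, ...
        time.sleep(float(resp.headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError("still rate-limited after retries")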

Example request

curl --request POST \
  --url https://api.langdock.com/openai/{region}/v1/chat/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Write a short poem about cats."
    }
  ]
}
'

Response example (200)
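
An illustrative response body for the request above. The values are placeholders, but the shape follows the response fields documented further down:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1719000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Soft paws patrol the windowsill,\nmorning sun, and all is still."
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42
  },
  "system_fingerprint": "fp_abc123"
}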

Parameters

Headers

  • Authorization (string, required): API key as Bearer token. Format "Bearer YOUR_API_KEY"

Path parameters

  • region (string, required): The region of the API to use. Available: eu, us

Body (application/json)

  • model (string, required): ID of the model to use.

  • messages (array, required): A list of messages comprising the conversation so far. Minimum length: 1. Message roles: system, user, assistant, tool, function.

    • message fields:

      • role (enum, required): e.g., system, user, assistant

      • content (string, required for messages that contain text)

      • name (string, optional): an optional name for the participant

  • max_tokens (integer, optional): Maximum number of tokens to generate.

  • temperature (number, optional, default 1): 0–2

  • top_p (number, optional, default 1): 0–1

  • frequency_penalty (number, default 0): -2.0 to 2.0

  • presence_penalty (number, default 0): -2.0 to 2.0

  • logit_bias (object): Map of token IDs to bias values (-100 to 100)

  • stop (string or array, optional): Up to 4 sequences where generation will stop

  • stream (boolean, optional, default false): If true, partial tokens are sent as server-sent events terminated by data: [DONE] (a streaming sketch appears in the library example near the end of this page)

  • response_format (object, optional): { "type": "text" } or { "type": "json_object" } — JSON mode requires you to instruct the model to output JSON

  • seed (integer, optional, Beta): For best-effort deterministic sampling

  • user (string, optional): Unique identifier representing your end-user

  • tools (array of objects, optional): List of tools (functions) the model may call (max 128). Each tool: { type: "function", function: { ... } } (see the sketch after this parameter list)

  • tool_choice (enum or object, optional): Controls tool-calling behavior. Options: none, auto, required, or specify a particular tool. Default: none when no tools present; auto if tools are present.
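
A minimal tools request, sketched in Python with the requests library. The get_weather function and its JSON Schema are hypothetical examples, not part of the API:

import requests

payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is the weather in Berlin?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example function
                "description": "Get the current weather for a city",
                "parameters": {  # JSON Schema describing the arguments
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}

resp = requests.post(
    "https://api.langdock.com/openai/eu/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
message = resp.json()["choices"][0]["message"]
# If the model decided to call the tool, finish_reason is "tool_calls" and
# message["tool_calls"][0]["function"]["arguments"] holds the argument JSON.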

Deprecated / replaced fields

  • function_call (deprecated): Replaced by tool_choice

  • functions (deprecated): Replaced by tools

Details and notable behaviors

  • response_format.type: "text" or "json_object". When using "json_object", you must instruct the model to produce JSON in the conversation messages to avoid problematic behavior (e.g., streaming whitespace); see the sketch after this list.

  • logprobs and top_logprobs: If logprobs is true, you can request top_logprobs (0–20) to get token probability info.

  • Tools/functions: Provide a JSON Schema in parameters for functions. Omitting parameters defines an empty parameter list.
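
A JSON-mode sketch in Python. Note that the messages themselves ask for JSON output, as required above; the prompt and the expected output key are illustrative:

import json
import requests

payload = {
    "model": "gpt-4o-mini",
    "response_format": {"type": "json_object"},
    "messages": [
        # JSON mode requires the conversation itself to ask for JSON output.
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": "List three cat breeds under the key \"breeds\"."},
    ],
}

resp = requests.post(
    "https://api.langdock.com/openai/eu/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
breeds = json.loads(resp.json()["choices"][0]["message"]["content"])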

Full schema and field details

(Fields described above, plus nested attributes such as logit_bias.{key}, functions[].parameters as JSON Schema, usage.* fields in the response, system_fingerprint, finish_reason values, etc.)

finish_reason possible values:

  • stop

  • length

  • tool_calls

  • content_filter

  • function_call (deprecated)
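
A small sketch of how a client might branch on these values (the returned strings are illustrative):

def handle_choice(choice: dict) -> str:
    """Inspect finish_reason on one element of choices (illustrative handling)."""
    reason = choice["finish_reason"]
    if reason == "length":
        return "truncated: raise max_tokens or continue the conversation"
    if reason == "tool_calls":
        return "model requested a tool call (see the tools sketch above)"
    if reason == "content_filter":
        return "content was removed by a filter"
    return choice["message"]["content"]  # "stop": normal completion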

Response fields:

  • id (string): Unique identifier for the chat completion

  • object (string): "chat.completion"

  • created (integer): Unix timestamp (seconds)

  • model (string)

  • choices (array): One or more choice objects, each with index, message, finish_reason, logprobs

  • usage (object): { completion_tokens, prompt_tokens, total_tokens }

  • system_fingerprint (string): Fingerprint of backend config (use with seed to monitor determinism)

Using OpenAI-compatible libraries

Because the request and response formats match OpenAI's API, you can use OpenAI-compatible libraries such as:

  • OpenAI Python library (openai-python)

  • Vercel AI SDK
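
For example, a sketch with the official OpenAI Python client, assuming its base_url can simply be pointed at the Langdock endpoint (region eu shown; use us if that is where your workspace runs). Setting stream=True also demonstrates the stream parameter from the body section above:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langdock.com/openai/eu/v1",
)

# stream=True delivers partial tokens as server-sent events; drop it (or set
# it to False) to get a single chat.completion object instead.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short poem about cats."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)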

Notes

  • Admins can create API keys in workspace settings.

Relevant links

  • OpenAI Chat Completion spec: https://platform.openai.com/docs/api-reference/chat/create

  • OpenAI models compatibility table: https://platform.openai.com/docs/models/model-endpoint-compatibility

  • Function calling guide: https://platform.openai.com/docs/guides/function-calling

  • Token counting example: https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
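
Because the TPM limit counts tokens, it can help to estimate usage client-side. A quick sketch with tiktoken, assuming the o200k_base encoding used by the gpt-4o family (see the cookbook link above for per-model encodings):

import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # encoding used by gpt-4o models

prompt = "Write a short poem about cats."
print(len(enc.encode(prompt)))  # tokens this text consumes from the TPM budget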