OpenAI Chat completion
Creates a model response for the given chat conversation. This endpoint follows the OpenAI Chat Completion specification and forwards requests to the Azure OpenAI endpoint.
Endpoint POST https://api.langdock.com/openai/{region}/v1/chat/completions
In dedicated deployments, api.langdock.com maps to /api/public
Authentication
Header: Authorization
Value: Bearer YOUR_API_KEY
Supported Models Currently supported models include: gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o4-mini, o3, o3-mini, o1, o1-mini, o1-preview, gpt-4o, gpt-4o-mini
Note: If you use your own API keys in Langdock (BYOK), available models may differ — contact your admin.
Limits and unsupported parameters
Not supported: n, service_tier, parallel_tool_calls, stream_options
Each model has its own rate limit (workspace-level)
Default rate limit for this Chat Completion endpoint: 500 RPM (requests per minute) and 60,000 TPM (tokens per minute)
Exceeding limits returns 429 Too Many Requests (a retry sketch follows this list)
For higher limits contact [email protected]
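If a request is rejected with 429, a simple client-side retry with exponential backoff is usually enough. A minimal sketch using the openai Python library, configured as in the examples below (the retry count and wait times are illustrative, not Langdock recommendations):
import time

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.langdock.com/openai/eu/v1",  # or /openai/us/v1
    api_key="<YOUR_LANGDOCK_API_KEY>"
)

def create_with_backoff(messages, model="gpt-4o-mini", max_retries=5):
    # Retry on HTTP 429 with exponential backoff; values are illustrative.
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ...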
Try it — Examples
curl --request POST \
--url https://api.langdock.com/openai/eu/v1/chat/completions \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '
{
"model": "gpt-4o-mini",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Write a short poem about cats."
}
]
}
'
from openai import OpenAI
client = OpenAI(
base_url="https://api.langdock.com/openai/eu/v1",
api_key="<YOUR_LANGDOCK_API_KEY>"
)
completion = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": "Write a short poem about cats."}
]
)
print(completion.choices[0].message.content)
Parameters
Headers
Authorization (string, required): API key as Bearer token. Format "Bearer YOUR_API_KEY"
Path parameters
region (string, required): The region of the API to use. Available values: eu, us
Body (application/json)
model (string, required): ID of the model to use.
messages (array, required): A list of messages comprising the conversation so far. Minimum length: 1. Message roles: system, user, assistant, tool, function.
message fields:
role (enum, required): e.g., system, user, assistant
content (string, required for messages that contain text)
name (string, optional): an optional name for the participant
max_tokens (integer, optional): Maximum number of tokens to generate.
temperature (number, optional, default 1): 0–2
top_p (number, optional, default 1): 0–1
frequency_penalty (number, default 0): -2.0 to 2.0
presence_penalty (number, default 0): -2.0 to 2.0
logit_bias (object): Map of token IDs to bias values (-100 to 100)
stop (string or array, optional): Up to 4 sequences where generation will stop
stream (boolean, optional, default false): If true, partial tokens are sent as server-sent events terminated by data: [DONE] (a streaming sketch follows this parameter list)
response_format (object, optional): { "type": "text" } or { "type": "json_object" } — JSON mode requires you to instruct the model to output JSON
seed (integer, optional, Beta): For best-effort deterministic sampling
user (string, optional): Unique identifier representing your end-user
tools (array of objects, optional): List of tools (functions) the model may call (max 128). Each tool: { type: "function", function: { ... } }
tool_choice (enum or object, optional): Controls tool-calling behavior. Options: none, auto, required, or an object specifying a particular tool. Default: none when no tools are present; auto when tools are present.
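The streaming sketch referenced above, reusing the openai Python client from the Try it example; the chunk handling follows the OpenAI streaming format this endpoint mirrors:
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short poem about cats."}],
    stream=True
)
for chunk in stream:
    # Each server-sent event carries a delta; the final chunk has no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()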
Deprecated / replaced fields
function_call (deprecated): Replaced by tool_choice
functions (deprecated): Replaced by tools (a tools sketch follows below)
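A minimal sketch in the current tools/tool_choice format, reusing the client from the Python example above; the get_weather function and its JSON Schema are hypothetical and only illustrate the shape of a tool definition:
# The get_weather tool is hypothetical, for illustration only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
            }
        }
    }
]

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
    tool_choice="auto"
)

# If the model decides to call the tool, the call arrives as a tool_call, not as text.
tool_calls = completion.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)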
Details and notable behaviors
response_format.type: "text" or "json_object". When using "json_object", you must instruct the model to produce JSON in the conversation messages; otherwise it may emit an unending stream of whitespace until the token limit is reached (a JSON-mode sketch follows this list).
logprobs and top_logprobs: If logprobs is true, you can request top_logprobs (0–20) to get token probability info.
Tools/functions: Provide a JSON Schema in parameters for functions. Omitting parameters defines an empty parameter list.
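A minimal JSON-mode sketch, reusing the client from the Python example above; note the explicit instruction to answer in JSON, which json_object mode requires:
import json

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    # The system message must explicitly ask for JSON when using json_object.
    messages=[
        {"role": "system", "content": "Reply with a JSON object with the keys 'title' and 'lines'."},
        {"role": "user", "content": "Write a short poem about cats."}
    ],
    response_format={"type": "json_object"}
)
print(json.loads(completion.choices[0].message.content))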
Using OpenAI-compatible libraries Because the request and response formats match OpenAI's API, you can use OpenAI-compatible libraries such as:
OpenAI Python library (openai-python)
Vercel AI SDK
Notes
Admins can create API keys in workspace settings.
If you use BYOK (bring-your-own-keys), model availability may differ — contact your admin.
Relevant links
OpenAI Chat Completion spec: https://platform.openai.com/docs/api-reference/chat/create
OpenAI models compatibility table: https://platform.openai.com/docs/models/model-endpoint-compatibility
Function calling guide: https://platform.openai.com/docs/guides/function-calling
Token counting example: https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken
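To estimate how a request counts against the TPM limit, a rough tiktoken sketch (the o200k_base fallback is an assumption for model names tiktoken does not recognize, and per-message overhead is not included):
import tiktoken

def count_tokens(text, model="gpt-4o-mini"):
    # Rough token count for a single string; chat message framing adds a few extra tokens.
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Fallback if tiktoken does not know the model name (assumption: o200k_base).
        encoding = tiktoken.get_encoding("o200k_base")
    return len(encoding.encode(text))

print(count_tokens("Write a short poem about cats."))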

