Codestral
Creates a code completion using the Codestral model from Mistral. All parameters of the Mistral fill-in-the-middle (FIM) completion endpoint are supported according to the Mistral specification.
Example — cURL
curl --request POST \
  --url https://api.langdock.com/mistral/{region}/v1/fim/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "codestral-2501",
    "prompt": "function removeSpecialCharactersWithRegex(str: string) {",
    "max_tokens": 64
  }'
Example response (200)
{
"data": "asd",
"id": "245c52bc936f53ba90327800c73d1c3e",
"object": "chat.completion",
"model": "codestral",
"usage": {
"prompt_tokens": 16,
"completion_tokens": 102,
"total_tokens": 118
},
"created": 1732902806,
"choices": [
{
"index": 0,
"message": {
"content": "\n // Use a regular expression to match any non-alphanumeric character and replace it with an empty string\n return str.replace(/[^a-zA-Z0-9]/g, '');\n}\n\n// Test the function\nconst inputString = \"Hello, World! 123\";\nconst outputString = removeSpecialCharactersWithRegex(inputString);\nconsole.log(outputString); // Output: \"HelloWorld123\"",
"prefix": false,
"role": "assistant"
},
"finish_reason": "stop"
}
]
}Rate limits
The rate limit for the FIM Completion endpoint is 500 RPM (requests per minute) and 60,000 TPM (tokens per minute). Rate limits are defined at the workspace level, not per API key. Each model has its own rate limit. If you exceed your rate limit, you will receive a 429 Too Many Requests response.
Please note that the rate limits are subject to change; refer to this documentation for the most up-to-date information. If you need a higher rate limit, please contact us at [email protected].
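If your client does hit the limit, one simple way to recover is to retry the request after a short pause. Below is a minimal shell sketch; the request body, retry count, and back-off times are illustrative choices, not values prescribed by the API:
# Retry up to three times, backing off when the API returns 429 Too Many Requests.
for attempt in 1 2 3; do
  status=$(curl --silent --output response.json --write-out "%{http_code}" \
    --request POST \
    --url https://api.langdock.com/mistral/eu/v1/fim/completions \
    --header 'Authorization: Bearer YOUR_API_KEY' \
    --header 'Content-Type: application/json' \
    --data '{"model": "codestral-2501", "prompt": "function add(a, b) {", "max_tokens": 32}')
  [ "$status" != "429" ] && break   # stop retrying on any non-429 status
  sleep $((attempt * 5))            # wait 5, 10, 15 seconds between attempts
done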
Using the Continue AI Code Assistant
Combining the Codestral model with the chat completion models from the Langdock API makes it possible to run the open-source AI code assistant Continue (continue.dev) entirely through the Langdock API. Continue is available as a VS Code extension and as a JetBrains extension.
To customize the models used by Continue, edit the configuration file at ~/.continue/config.json (macOS / Linux) or %USERPROFILE%\.continue\config.json (Windows). Example setup using Codestral for autocomplete and other models for chats/edits:
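A minimal sketch of such a config.json is shown below. It uses Continue's "models" and "tabAutocompleteModel" fields; the chat model entry (gpt-4o via the Langdock OpenAI-compatible endpoint) and the apiBase URLs are illustrative assumptions, so check the Langdock chat completion documentation and your workspace region before adopting them:
{
  "models": [
    {
      "title": "Langdock Chat",
      "provider": "openai",
      "model": "gpt-4o",
      "apiBase": "https://api.langdock.com/openai/eu/v1",
      "apiKey": "YOUR_API_KEY"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral",
    "provider": "mistral",
    "model": "codestral-2501",
    "apiBase": "https://api.langdock.com/mistral/eu/v1",
    "apiKey": "YOUR_API_KEY"
  }
}
With a setup along these lines, Continue sends autocomplete (fill-in-the-middle) requests to the Codestral endpoint described on this page and routes chat and edit requests to the chat completion model.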
Endpoint
POST /mistral/{region}/v1/fim/completions
Try it with the example cURL shown above.
Headers
Authorization (string) — required
API key as a Bearer token. Format: "Bearer YOUR_API_KEY"
Path parameters
region (string, required) The region of the API to use.
Available options:
eu
Body (application/json)
model (string) — required, default: codestral-2501
ID of the model to use. Currently only compatible with:
codestral-2501
prompt (string) — required
The text/code to complete.
temperature (number)
What sampling temperature to use; recommended between 0.0 and 0.7. Higher values (e.g., 0.7) make output more random; lower values (e.g., 0.2) make it more focused/deterministic. We generally recommend altering this or top_p, but not both. The default value varies by model. Call the /models endpoint to retrieve the appropriate default.
Required range: 0 <= x <= 1.5
top_p (number) — default: 1
Nucleus sampling: the model considers tokens comprising the top top_p probability mass. We generally recommend altering this or temperature, but not both.
Required range: 0 <= x <= 1
max_tokens (integer)
Maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length.
Required range: x >= 0
stream (boolean) — default: false
Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Otherwise, the server returns the full result as a single JSON response when complete.
stop (string | string[]) Stop generation if this token is detected; an array of tokens can also be provided.
random_seed (integer) The seed to use for random sampling. If set, different calls will generate deterministic results.
Required range: x >= 0
suffix (string) — default: ""
Optional text/code that adds more context for the model. When both a prompt and a suffix are given, the model fills in what lies between them (see the example after this parameter list). When suffix is not provided, the model simply completes the text starting from prompt.
min_tokens (integer) The minimum number of tokens to generate in the completion.
Required range: x >= 0
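As an illustration of the suffix parameter, the following request asks the model to fill in a function body between a prompt and a suffix; the code fragment and stop sequence are only examples:
curl --request POST \
  --url https://api.langdock.com/mistral/eu/v1/fim/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "codestral-2501",
    "prompt": "function isEven(n: number): boolean {\n  ",
    "suffix": "\n}\n\nconsole.log(isEven(4));",
    "max_tokens": 64,
    "stop": ["\n}"]
  }'
The completion returned in choices[0].message.content is the text that belongs between prompt and suffix.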
Response (200 — application/json)
Successful response fields:
model (string) — Example: "mistral-small-latest"
id (string) — Example: "cmpl-e5cc70bb28c444948073e77776eb30ef"
object (string) — Example: "chat.completion"
usage (object) — required
usage.prompt_tokens (integer) — Example: 16
usage.completion_tokens (integer) — Example: 34
usage.total_tokens (integer) — Example: 50
choices (array of ChatCompletionChoice objects)
choices[].index (integer) — Example: 0
choices[].message (object) — contains the assistant's generated content
choices[].finish_reason (string enum) — Available options: stop, length, model_length, error, tool_calls. Example: "stop"
created (integer) — Example: 1702256327
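To extract only the generated code from a response, you can pipe it through a JSON processor; the example below uses jq purely as an illustration, and any JSON parser works the same way:
# Print just the completion text from choices[0].message.content.
curl --silent --request POST \
  --url https://api.langdock.com/mistral/eu/v1/fim/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"model": "codestral-2501", "prompt": "function add(a: number, b: number) {", "max_tokens": 32}' \
  | jq -r '.choices[0].message.content'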