Codestral

Creates a code completion using the Codestral model from Mistral. All parameters from the Mistral fill-in-the-middle Completion endpoint are supported according to the Mistral specifications.

Example — cURL

curl --request POST \
  --url https://api.langdock.com/mistral/{region}/v1/fim/completions \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "codestral-2501",
  "prompt": "function removeSpecialCharactersWithRegex(str: string) {",
  "max_tokens": 64
}
'

Example response (200)

{
  "data": "asd",
  "id": "245c52bc936f53ba90327800c73d1c3e",
  "object": "chat.completion",
  "model": "codestral",
  "usage": {
    "prompt_tokens": 16,
    "completion_tokens": 102,
    "total_tokens": 118
  },
  "created": 1732902806,
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "\n  // Use a regular expression to match any non-alphanumeric character and replace it with an empty string\n  return str.replace(/[^a-zA-Z0-9]/g, '');\n}\n\n// Test the function\nconst inputString = \"Hello, World! 123\";\nconst outputString = removeSpecialCharactersWithRegex(inputString);\nconsole.log(outputString); // Output: \"HelloWorld123\"",
        "prefix": false,
        "role": "assistant"
      },
      "finish_reason": "stop"
    }
  ]
}

Rate limits

The rate limit for the FIM Completion endpoint is 500 RPM (requests per minute) and 60,000 TPM (tokens per minute). Rate limits are defined at the workspace level — not at an API key level. Each model has its own rate limit. If you exceed your rate limit, you will receive a 429 Too Many Requests response.
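
If you want your client to back off automatically, a simple retry loop around the cURL call is enough. The sketch below is illustrative only: the response file name, the backoff intervals, and the request body are arbitrary choices, not part of the API.

# Illustrative retry loop: back off and retry while the endpoint returns 429.
for attempt in 1 2 3 4 5; do
  status=$(curl -s -o response.json -w "%{http_code}" \
    --request POST \
    --url https://api.langdock.com/mistral/eu/v1/fim/completions \
    --header "Authorization: Bearer YOUR_API_KEY" \
    --header 'Content-Type: application/json' \
    --data '{"model": "codestral-2501", "prompt": "def add(a, b):", "max_tokens": 32}')
  [ "$status" != "429" ] && break
  sleep $((attempt * 2))  # simple linear backoff before the next attempt
done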

Please note that the rate limits are subject to change; refer to this documentation for the most up-to-date information. In case you need a higher rate limit, please contact us at [email protected].


Using the Continue AI Code Assistant

Using the Codestral model, combined with chat completion models from the Langdock API, makes it possible to use the open-source AI code assistant Continue (continue.dev) fully via the Langdock API. Continue is available as a VS Code extension and as a JetBrains extension.

To customize the models used by Continue, edit the configuration file at ~/.continue/config.json (macOS / Linux) or %USERPROFILE%\.continue\config.json (Windows). Example setup using Codestral for autocomplete and other models for chats/edits:
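
The snippet below is a sketch rather than an official configuration: the fields follow Continue's config.json schema ("models", "tabAutocompleteModel", "provider", "apiBase", "apiKey"), while the apiBase values (a Langdock Mistral base URL for autocomplete and an OpenAI-compatible Langdock base URL for chat) and the chat model ID are assumptions you should adapt to your workspace and API key.

{
  "models": [
    {
      "title": "Langdock chat model",
      "provider": "openai",
      "model": "gpt-4o",
      "apiBase": "https://api.langdock.com/openai/eu/v1",
      "apiKey": "YOUR_API_KEY"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral (Langdock)",
    "provider": "mistral",
    "model": "codestral-2501",
    "apiBase": "https://api.langdock.com/mistral/eu/v1",
    "apiKey": "YOUR_API_KEY"
  }
}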


Endpoint

POST /mistral/{region}/v1/fim/completions

Try it with the example cURL shown above.


Headers

Authorization (string) — required. API key as a Bearer token. Format: "Bearer YOUR_API_KEY"


Path parameters

region (string) — required The region of the API to use.

Available options:

  • eu


Body (application/json)

model (string) — required, default: codestral-2501 ID of the model to use. Currently only compatible with:

  • codestral-2501

prompt (string) — required The text/code to complete.

temperature (number) What sampling temperature to use; recommended between 0.0 and 0.7. Higher values (e.g., 0.7) make output more random; lower values (e.g., 0.2) make it more focused/deterministic. We generally recommend altering this or top_p, but not both. The default value varies by model. Call the /models endpoint to retrieve the appropriate default.

Required range: 0 <= x <= 1.5

top_p (number) — default: 1 Nucleus sampling: the model considers tokens comprising the top top_p probability mass. We generally recommend altering this or temperature, but not both.

Required range: 0 <= x <= 1

max_tokens (integer) Maximum number of tokens to generate in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length.

Required range: x >= 0

stream (boolean) — default: false Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Otherwise, the server returns the full result as JSON when complete.
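
For example, a streamed variant of the request shown at the top of this page could look like the sketch below; the -N flag simply disables cURL's output buffering so events print as they arrive.

curl -N --request POST \
  --url https://api.langdock.com/mistral/eu/v1/fim/completions \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "codestral-2501",
  "prompt": "function removeSpecialCharactersWithRegex(str: string) {",
  "max_tokens": 64,
  "stream": true
}
'
# Partial completions arrive as "data: {...}" lines; the stream ends with "data: [DONE]".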

stop (string | string[]) Stop generation if this token is detected, or provide an array of tokens to stop on.

random_seed (integer) The seed to use for random sampling. If set, repeated calls with the same parameters will produce deterministic results.

Required range: x >= 0

suffix (string) — default: "" Optional text/code that adds more context for the model. When given both a prompt and a suffix, the model will fill what is between them. When suffix is not provided, the model will simply execute completion starting with prompt.
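
For example, a fill-in-the-middle request with both prompt and suffix might look like the sketch below; the code fragments used for prompt and suffix are arbitrary placeholders.

curl --request POST \
  --url https://api.langdock.com/mistral/eu/v1/fim/completions \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "codestral-2501",
  "prompt": "def is_even(n: int) -> bool:",
  "suffix": "print(is_even(4))",
  "max_tokens": 64
}
'

Given this request, the model should generate the code that belongs between the two fragments, in this case the body of is_even.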

min_tokens (integer) The minimum number of tokens to generate in the completion.

Required range: x >= 0


Response (200 — application/json)

Successful response fields:

  • model (string) — Example: "mistral-small-latest"

  • id (string) — Example: "cmpl-e5cc70bb28c444948073e77776eb30ef"

  • object (string) — Example: "chat.completion"

  • usage (object) — required

    • usage.prompt_tokens (integer) — Example: 16

    • usage.completion_tokens (integer) — Example: 34

    • usage.total_tokens (integer) — Example: 50

  • choices (array of ChatCompletionChoice objects)

    • index (integer) — Example: 0

    • message (object) — contains the assistant's generated content

    • finish_reason (string enum) — Available: stop, length, model_length, error, tool_calls. Example: "stop"

  • created (integer) — Example: 1702256327
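
To extract only the generated code from such a response, read the message content of the first choice (choices[0].message.content). A sketch, assuming jq is installed:

curl -s --request POST \
  --url https://api.langdock.com/mistral/eu/v1/fim/completions \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"model": "codestral-2501", "prompt": "function removeSpecialCharactersWithRegex(str: string) {", "max_tokens": 64}' \
  | jq -r '.choices[0].message.content'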

