Use POST /v1/chat/completions to run model inference with an OpenAI Chat Completions-style request body.
This is APIFree’s unified synchronous text generation endpoint. You choose the target model with the model field, and APIFree validates the request against that model’s schema, selects an available upstream provider, and returns either a standard JSON response or an SSE stream.
Authentication
Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Endpoint
POST /v1/chat/completions
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY
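As a rough sketch, the method, path, and headers above map onto a request object like the one below, using only Python's standard library. The helper name `build_chat_request` is hypothetical, and the host is taken from the curl example on this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.apifree.ai"  # host taken from this page's curl example


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request(
    "YOUR_API_KEY",
    {"model": "gpt-5-mini", "messages": [{"role": "user", "content": "Hi"}]},
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without network access or a real key.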
Request body
The request body follows an OpenAI Chat Completions-style shape, but the exact supported fields depend on the selected model.
Required fields
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | APIFree model identifier. Must be a non-empty string. |
| messages | array | Usually | Conversation input for chat-style models. Most LLM chat models require this field. |
Common fields
These fields are commonly supported for chat models, but availability and constraints are model-specific.
| Name | Type | Required | Description |
|---|---|---|---|
| stream | boolean | No | When true, returns Server-Sent Events. |
| temperature | number | No | Sampling temperature. |
| top_p | number | No | Nucleus sampling parameter. |
| max_tokens | integer | No | Maximum number of generated tokens. |
| stop | string or array | No | Stop sequences. |
| presence_penalty | number | No | Penalizes repeated topics. |
| frequency_penalty | number | No | Penalizes repeated tokens. |
| tools | array | No | Tool definitions for tool/function calling, if the model supports them. |
| tool_choice | string or object | No | Tool selection strategy, if supported by the model. |
| user | string | No | End-user identifier for your own tracking or abuse monitoring. |
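Because these fields are optional and model-specific, one possible approach is to assemble the body from only the fields you actually set. The sketch below assumes a hypothetical helper named `build_payload`; it checks only the one rule this page states (a non-empty `model` string) and leaves all other validation to APIFree.

```python
def build_payload(model, messages, **optional):
    """Assemble a request body, leaving out optional fields that are None."""
    if not isinstance(model, str) or not model:
        raise ValueError("model must be a non-empty string")
    payload = {"model": model, "messages": messages}
    # Unset fields are simply absent from the body, so the selected model's
    # own defaults (whatever they are) stay in effect.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_payload(
    "gpt-5-mini",
    [{"role": "user", "content": "Hi"}],
    temperature=0.7,
    max_tokens=None,   # not set, so it is dropped
    stop=["\n\n"],
)
print(sorted(payload))
```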
messages format
Each item in messages is an object in conversation order.
Typical example:
{
  "role": "user",
  "content": "Write a short product introduction for APIFree."
}
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | Message role such as system, user, assistant, or tool. |
| content | string or array | Yes | Message content. |
| name | string | No | Optional participant or tool name. |
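A small constructor can catch malformed messages before they reach the API. This is a minimal sketch: `make_message` and `KNOWN_ROLES` are hypothetical names, and the role set is limited to the four roles listed above even though the table says "such as", so a given model may accept others.

```python
KNOWN_ROLES = {"system", "user", "assistant", "tool"}  # roles named in the table


def make_message(role, content, name=None):
    """Return one messages item with the fields described in the table."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unexpected role: {role!r}")
    if not isinstance(content, (str, list)):
        raise TypeError("content must be a string or an array")
    message = {"role": role, "content": content}
    if name is not None:
        message["name"] = name  # optional, so only included when provided
    return message


messages = [
    make_message("system", "You are a concise assistant."),
    make_message("user", "Introduce APIFree in 3 sentences."),
]
print(messages)
```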
Example request
curl https://api.apifree.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-mini",
    "messages": [
      {
        "role": "system",
        "content": "You are a concise assistant."
      },
      {
        "role": "user",
        "content": "Introduce APIFree in 3 sentences."
      }
    ],
    "temperature": 0.7,
    "stream": false
  }'

from openai import OpenAI
client = OpenAI(
    base_url="https://api.apifree.ai/v1",
    api_key="YOUR_API_KEY",
)
resp = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Introduce APIFree in 3 sentences."},
    ],
)
print(resp.choices[0].message.content)

Non-streaming response
When stream is omitted or false, the API returns a JSON object.
Example:
{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1767699130,
  "model": "gpt-5-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "APIFree is a unified AI API platform that gives developers access to multiple model providers through a consistent interface. You can use one API key and one integration pattern to call different model families. It is designed to simplify production access, billing, and model switching."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 45,
    "total_tokens": 77,
    "cost": 0.00014
  }
}
Common top-level fields:
| Field | Type | Description |
|---|---|---|
| id | string | Completion identifier. |
| object | string | Usually chat.completion. |
| created | integer | Unix timestamp. |
| model | string | Model actually used for the request. |
| choices | array | Generated outputs. |
| usage | object | Token and billing usage when available. |
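To illustrate how these fields might be read, the sketch below pulls the generated text and usage figures out of a response body. The function name `summarize_completion` is hypothetical, the sample is a shortened version of the example response, and `usage` is treated as optional because the table says it is only present when available.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract the first choice's text and the usage figures from a response."""
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data.get("usage") or {}  # usage may be missing
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
    }


sample = json.dumps({
    "id": "chatcmpl_abc123",
    "object": "chat.completion",
    "model": "gpt-5-mini",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 32, "completion_tokens": 45,
              "total_tokens": 77, "cost": 0.00014},
})
print(summarize_completion(sample))
```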
Streaming response
When stream=true, the API returns SSE with Content-Type: text/event-stream.
Example request:
curl -N https://api.apifree.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-mini",
    "messages": [
      {
        "role": "user",
        "content": "Write a short welcome message."
      }
    ],
    "stream": true
  }'
Example stream:
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1767699506,"model":"gpt-5-mini","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1767699506,"model":"gpt-5-mini","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1767699506,"model":"gpt-5-mini","choices":[{"index":0,"delta":{"content":" and welcome to APIFree."},"finish_reason":null}]}
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1767699506,"model":"gpt-5-mini","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
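A client can reassemble the message by concatenating each chunk's `delta.content` until it sees `data: [DONE]`. The following is a minimal sketch under that reading of the example stream: `iter_content_deltas` is a hypothetical name, the input is any iterable of already-decoded text lines, and error events arriving mid-stream are not handled.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by SSE data lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:  # the first and last chunks carry no text
                yield piece


stream = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" and welcome to APIFree."},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_content_deltas(stream))
print(text)
```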
Model-specific parameters
/v1/chat/completions is a unified protocol entrypoint, but APIFree validates and transforms requests according to the selected model’s schema.
That means:
- Different models may support different optional fields.
- Field types, defaults, enum values, and numeric ranges can vary by model.
- Tool calling or advanced parameters may only be available for some models.
To inspect the exact contract for a model, use:
GET /v1/models
GET /v1/model/openapi.json?model=YOUR_MODEL_ID
GET /v1/model/api.md?model=YOUR_MODEL_ID
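When building these URLs in code, the model identifier should be URL-encoded in case it contains characters such as `/`. A small sketch, assuming the same host as the curl examples on this page and a hypothetical helper name `model_contract_urls`. Whether these endpoints require the Authorization header is not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.apifree.ai"  # host taken from this page's curl examples


def model_contract_urls(model_id: str) -> dict:
    """Return the three discovery URLs listed above for one model ID."""
    query = urlencode({"model": model_id})  # percent-encodes the model ID
    return {
        "models": f"{BASE_URL}/v1/models",
        "openapi": f"{BASE_URL}/v1/model/openapi.json?{query}",
        "markdown": f"{BASE_URL}/v1/model/api.md?{query}",
    }


urls = model_contract_urls("gpt-5-mini")
print(urls["openapi"])
```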
Error behavior
APIFree returns an OpenAI-style error object. Many request failures are still returned with HTTP status 200, so check the response body for an error field rather than relying on the HTTP status alone.
Example:
{
  "code": 400,
  "error": {
    "code": "invalid_request_error",
    "message": "model field is required in request body",
    "type": "invalid_request_error"
  }
}
Typical cases include:
- Missing or invalid model
- Invalid JSON request body
- Selected model not found
- No valid upstream provider available for the request
- Model temporarily unavailable due to platform-side traffic control
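Since a failure can arrive with HTTP status 200, a client may want to inspect every parsed body before using it. The sketch below assumes the error shape shown in the example above; `APIFreeError` and `raise_for_error` are hypothetical names, not part of any SDK.

```python
class APIFreeError(Exception):
    """Raised when a response body carries an error object."""

    def __init__(self, status, code, message, error_type):
        super().__init__(f"{status} {code}: {message}")
        self.status = status          # top-level "code", e.g. 400
        self.code = code              # error.code
        self.message = message        # error.message
        self.error_type = error_type  # error.type


def raise_for_error(body: dict) -> dict:
    """Return the body unchanged, or raise if it contains an error object."""
    error = body.get("error")
    if error is None:
        return body
    raise APIFreeError(
        status=body.get("code"),
        code=error.get("code"),
        message=error.get("message"),
        error_type=error.get("type"),
    )


failure = {
    "code": 400,
    "error": {
        "code": "invalid_request_error",
        "message": "model field is required in request body",
        "type": "invalid_request_error",
    },
}
try:
    raise_for_error(failure)
except APIFreeError as exc:
    print(exc)
```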
Concurrency limit errors may return:
{
  "code": 429,
  "error": {
    "code": "max_concurrency_exceeded",
    "message": "too many concurrent requests",
    "type": "rate_limit_error"
  }
}
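One common way to handle a concurrency limit is to retry with exponential backoff. This page does not prescribe a retry policy, so the attempt count and delays below are assumptions, and `is_concurrency_error` and `call_with_retry` are hypothetical helpers. `send` is any zero-argument callable that performs the request and returns the parsed JSON body.

```python
import time


def is_concurrency_error(body: dict) -> bool:
    """True when the body matches the concurrency-limit error shown above."""
    error = body.get("error") or {}
    return body.get("code") == 429 or error.get("code") == "max_concurrency_exceeded"


def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send(), backing off and retrying while the limit error is returned."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        body = send()
        if not is_concurrency_error(body):
            return body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return body  # still rate limited after the final attempt


limit_error = {
    "code": 429,
    "error": {
        "code": "max_concurrency_exceeded",
        "message": "too many concurrent requests",
        "type": "rate_limit_error",
    },
}
# Simulated transport: two limit errors, then a successful body.
responses = [limit_error, limit_error, {"choices": []}]
delays = []
result = call_with_retry(lambda: responses.pop(0), sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example fast to run; in real use the default `time.sleep` applies, and adding random jitter to the delay is a common refinement.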