Unified chat completions endpoint for generating conversational responses with OpenAI chat models.

Description

This endpoint interacts with OpenAI chat models, taking a series of messages as input and returning model-generated responses.
It is suitable for chatbots, customer support, assistant tools, natural language interfaces, and many other scenarios.

Request body parameters

Name | Type | Required | Description
model | string | Yes | Model ID to use, for example gpt-4o-mini.
messages | array | Yes | Array of messages representing the conversation history in order.
temperature | number | No | Sampling temperature in the range 0–2; higher values make output more creative. Default is 1.
top_p | number | No | Nucleus sampling parameter; usually adjust either temperature or top_p, not both. Default is 1.
max_tokens | integer | No | Maximum number of tokens to generate in the response.
n | integer | No | Number of responses to generate for each input. Default is 1.
stream | boolean | No | Whether to return results as a streaming server-sent events (SSE) response.
stop | string / array | No | Up to four sequences where the API will stop generating further tokens.
presence_penalty | number | No | Penalizes new tokens based on whether they appear in the text so far, reducing repetition of topics.
frequency_penalty | number | No | Penalizes new tokens based on their existing frequency in the text, reducing token repetition.
logit_bias | object | No | Modifies the likelihood of specific tokens appearing in the completion.
user | string | No | Unique identifier for the end user, to help OpenAI monitor and detect abuse.
tools | array | No | Array of tool definitions used to enable function / tool calling.
tool_choice | string / object | No | Tool-calling strategy, such as "auto", "none", or forcing a specific tool.
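As a rough sketch, a request body combining several of these parameters can be assembled and serialized like this (the model name, message strings, and stop sequence are illustrative values, not requirements):

```python
import json

# Sketch: assembling a chat completions request body from the parameters above.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize SSE in one sentence."},
    ],
    "temperature": 0.7,   # in [0, 2]; higher values are more creative
    "max_tokens": 128,    # cap on generated tokens
    "n": 1,               # number of completions to generate
    "stop": ["\n\n"],     # up to four stop sequences
}

# The serialized JSON string is what is sent as the HTTP request body.
body = json.dumps(payload)
```

Only model and messages are required; every other key can be omitted to fall back to its default.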

messages array

Each message is an object. A typical example:
{
  "role": "user",
  "content": "Write a haiku about recursion in programming."
}
Field | Type | Required | Description
role | string | Yes | Role in the conversation: system, user, assistant, or tool.
content | string / array | Yes | Message content, usually a string. For multimodal or rich content, this can be an array of segments.
name | string | No | Optional name for the entity, often used for tools or specific personas.
Common role usage:
  • system: Defines global behavior and instructions, such as style and persona.
  • user: Content provided by the end user.
  • assistant: Previous model responses, used as context.
  • tool: Data returned from executing tools / functions.
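The role conventions above can be sketched as a short conversation history; the content strings here are purely illustrative:

```python
# Sketch: a conversation history exercising the common roles.
# An earlier assistant turn is included as context for the follow-up question.
messages = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},  # prior model response
    {"role": "user", "content": "And what is that times 3?"},
]

roles = [m["role"] for m in messages]
```

Because the API is stateless, the full history, including previous assistant turns, is resent with every request.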

Tool calling (tools and tool_choice)

Tool calling lets the model request that specific functions be executed; your application runs them and returns each result to the model as a tool message. Example tools value:
[
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given city",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"]
          }
        },
        "required": ["location"]
      }
    }
  }
]
tool_choice example:
"auto"
Or forcing a specific function call:
{
  "type": "function",
  "function": { "name": "get_current_weather" }
}
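To sketch the round trip your application performs when the model requests a tool call: parse the arguments, run the function locally, and wrap the result as a tool message. The tool-call id and the local get_current_weather implementation below are hypothetical:

```python
import json

# Hypothetical local implementation of the tool declared above.
def get_current_weather(location, unit="celsius"):
    return {"location": location, "temperature": 22, "unit": unit}

# Shape of one tool call as it appears in an assistant message
# (id and arguments are illustrative).
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": '{"location": "San Francisco, CA", "unit": "celsius"}',
    },
}

# Arguments arrive as a JSON string; decode, execute, and wrap the result.
args = json.loads(tool_call["function"]["arguments"])
result = get_current_weather(**args)
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],  # must match the id of the call it answers
    "content": json.dumps(result),
}
```

The tool message is appended to the messages array and the request is resent so the model can incorporate the result into its final answer.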

Response schema

A successful response includes the following top-level fields:
Field | Type | Description
id | string | Unique ID for this completion.
object | string | Object type, typically chat.completion or chat.completion.chunk (for streaming).
created | integer | Unix timestamp of when the completion was created.
model | string | ID of the model used.
choices | array | List of generated completion choices.
usage | object | Token usage statistics.
system_fingerprint | string | System fingerprint for debugging (optional).
Each item in choices looks like:
{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": "Hello there, how can I help you today?"
  },
  "finish_reason": "stop"
}
  • index: Position of this choice in the choices array.
  • message: The assistant message returned by the model.
  • finish_reason: Why generation stopped, such as stop, length, or tool_calls.
Example usage field:
{
  "prompt_tokens": 9,
  "completion_tokens": 12,
  "total_tokens": 21
}
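Putting the schema together, extracting the reply text and token counts from a sample response might look like this (the response values are illustrative):

```python
import json

# A sample successful response matching the schema above.
raw = '''
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "Hello there, how can I help you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
'''

response = json.loads(raw)
reply = response["choices"][0]["message"]["content"]
finish = response["choices"][0]["finish_reason"]
usage = response["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```

When n > 1, iterate over choices rather than reading only index 0, and always check finish_reason: a value of length means the reply was cut off by max_tokens.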

Errors

Like other OpenAI APIs, this endpoint returns non-2xx HTTP codes with a standard error object when something goes wrong:
{
  "error": {
    "message": "You exceeded your current quota, please check your plan and billing details.",
    "type": "insufficient_quota",
    "param": null,
    "code": "insufficient_quota"
  }
}
Common error types include:
  • invalid_request_error: The request parameters are invalid or missing.
  • authentication_error: Authentication failed (missing or invalid API key).
  • rate_limit_error: You have hit a rate limit.
  • insufficient_quota: Your billing plan or credit is insufficient.
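One defensive way to branch on these error types when a non-2xx body comes back is sketched below; the action names and retry policy are illustrative, not prescribed behavior:

```python
import json

def classify_error(body_text):
    """Map a standard error body to a coarse follow-up action (illustrative)."""
    err = json.loads(body_text).get("error", {})
    etype = err.get("type", "")
    if etype == "rate_limit_error":
        return "retry_with_backoff"       # transient; safe to retry later
    if etype in ("insufficient_quota", "authentication_error"):
        return "fail_and_alert"           # needs billing/credential changes
    if etype == "invalid_request_error":
        return "fix_request"              # inspect err.get("param") for the bad field
    return "unknown"

quota_body = (
    '{"error": {"message": "You exceeded your current quota, '
    'please check your plan and billing details.", '
    '"type": "insufficient_quota", "param": null, "code": "insufficient_quota"}}'
)
action = classify_error(quota_body)
```

Note that param identifies which request field was invalid when the type is invalid_request_error; for the other types it is typically null.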