REST API reference

The API is OpenAI-compatible: use any HTTP client, or point the official OpenAI SDK at the base URL below.

Base URL

https://api.zyoralabs.com/v1

For local development: http://127.0.0.1:8011/v1.

Authentication

Every request needs an Authorization header carrying a Z-AI API key:

Authorization: Bearer zk_live_...

Keys can be issued, restricted (by spend cap or model allowlist), and revoked from the console.
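As a rough illustration, the header can be attached with Python's standard library alone. The helper name `build_request` is hypothetical and not part of any Z-AI SDK; the sketch only constructs the request and leaves sending it (for example with `urllib.request.urlopen`) to the caller.

```python
import urllib.request

# Base URL from this reference; swap in http://127.0.0.1:8011/v1 for local development.
BASE_URL = "https://api.zyoralabs.com/v1"


def build_request(path: str, api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Return an unsent request for `path` that carries the Bearer header.

    `build_request` is an illustrative helper, not an official client function.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
```

Reading the key from an environment variable rather than hard-coding it keeps it out of source control.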

POST /v1/chat/completions

OpenAI-compatible chat completions. Set "stream": true to receive the response as standard server-sent events (SSE).

Request body

{
  "model": "openai/gpt-5.4-mini",           // provider/model id
  "messages": [
    { "role": "system",    "content": "..." },
    { "role": "user",      "content": "..." },
    { "role": "assistant", "content": "..." }
  ],
  "stream": false,            // true → SSE
  "temperature": 0.7,         // optional
  "max_tokens": 512,          // optional
  "stop": ["\n###"],          // optional
  "top_p": 0.9                // optional
}
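The `//` annotations above are documentation only; the body actually sent must be plain JSON, which has no comment syntax. A minimal Python sketch of assembling such a body follows. The function name `chat_payload` is hypothetical, and dropping unset optional fields (rather than sending nulls) is an assumption about what the server prefers.

```python
import json
from typing import Optional


def chat_payload(
    model: str,
    messages: list[dict],
    *,
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stop: Optional[list[str]] = None,
    top_p: Optional[float] = None,
) -> bytes:
    """Encode a chat completions body, leaving out optional fields that are unset."""
    body = {"model": model, "messages": messages, "stream": stream}
    optional = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stop": stop,
        "top_p": top_p,
    }
    # Assumption: omitted keys fall back to server defaults.
    body.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(body).encode("utf-8")
```

The returned bytes can be used as the POST body, together with a `Content-Type: application/json` header.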

Response (non-stream)

{
  "id": "cmpl_…",
  "object": "chat.completion",
  "model": "openai/gpt-5.4-mini",
  "choices": [
    { "index": 0,
      "message": { "role": "assistant", "content": "Hello!" },
      "finish_reason": "stop" }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 4,
    "total_tokens": 16
  }
}
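Reading the reply out of this structure could look like the sketch below. The function name `first_reply` is illustrative, and the code assumes exactly the fields shown in the sample response.

```python
import json


def first_reply(response_body: str) -> tuple[str, str, int]:
    """Return (content, finish_reason, total_tokens) for the first choice."""
    data = json.loads(response_body)
    choice = data["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        data["usage"]["total_tokens"],
    )
```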

Response (stream — text/event-stream)

data: {"choices":[{"delta":{"content":"Hel"}}]}

data: {"choices":[{"delta":{"content":"lo!"}}]}

data: {"choices":[],"usage":{"total_tokens":16},"model":"openai/gpt-5.4-mini"}

data: [DONE]
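One way to consume such a stream is sketched below in Python. It is a simplified reader that assumes every event fits on a single `data:` line, as in the sample above; it does not implement the full SSE specification (multi-line data fields, `event:` or `id:` fields). The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)


def collect(lines: Iterable[str]) -> tuple[str, Optional[dict]]:
    """Join all content deltas and return them with the usage object, if one was sent."""
    parts: list[str] = []
    usage: Optional[dict] = None
    for event in iter_events(lines):
        for choice in event.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
        usage = event.get("usage") or usage
    return "".join(parts), usage
```

In practice `lines` would come from iterating over the HTTP response body decoded as UTF-8; a UI would typically render each delta as it arrives instead of collecting them.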

GET /v1/models

Returns every model your key is permitted to call: the intersection of the models available through the provider keys you've added and any model allowlist set on the key.

curl https://api.zyoralabs.com/v1/models \
  -H "Authorization: Bearer $ZAI_KEY"

# {
#   "object": "list",
#   "data": [
#     { "id": "openai/gpt-5.4-mini",          "owned_by": "openai" },
#     { "id": "openai/gpt-5.4",               "owned_by": "openai" },
#     { "id": "anthropic/claude-4.5-sonnet",  "owned_by": "anthropic" },
#     ...
#   ]
# }
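Because model ids follow the provider/model pattern and each entry carries `owned_by`, a client can group the list by provider, for instance to populate a model picker. A small Python sketch, with the hypothetical helper name `models_by_provider`:

```python
import json
from collections import defaultdict


def models_by_provider(response_body: str) -> dict[str, list[str]]:
    """Group model ids from a /v1/models response by their `owned_by` field."""
    data = json.loads(response_body)
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for model in data["data"]:
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)
```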

GET /v1/me

Identity and quota for the calling key — used by the CLI whoami and the VS Code extension sign-in flow.

{
  "auth": "api_key",
  "org": { "id": "...", "name": "My Workspace", "slug": "my-ws" },
  "api_key": {
    "id": "...",
    "name": "production-key",
    "prefix": "zk_live_0PkU",
    "spend_micro_usd": 18234,
    "spend_cap_micro_usd": 5000000,
    "fallback_model": "openai/gpt-5.4-mini"
  }
}
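A client can derive the remaining budget from the two spend fields. The sketch below is illustrative: the function name is hypothetical, and treating a missing or null `spend_cap_micro_usd` as "no cap" is an assumption that this reference does not confirm.

```python
from typing import Optional


def remaining_micro_usd(me: dict) -> Optional[int]:
    """Return the budget left on the key in micro-USD, or None if no cap is set."""
    key = me["api_key"]
    cap = key.get("spend_cap_micro_usd")
    if cap is None:
        return None  # assumption: an absent or null cap means the key is uncapped
    return max(cap - key["spend_micro_usd"], 0)
```

With the sample response above, this gives 5000000 - 18234 = 4981766 micro-USD.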

Error format

Errors use standard HTTP status codes. The response body is a JSON object with a detail field.

HTTP/1.1 401 Unauthorized
{ "detail": "Invalid API key" }

HTTP/1.1 402 Payment Required
{ "detail": "Spend cap reached: 5.00 USD" }

HTTP/1.1 429 Too Many Requests
{ "detail": "Rate limit exceeded" }
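Client code often maps these statuses onto distinct exception types so callers can react differently to each. The class and function names below are hypothetical, and only the three statuses listed above are mapped; any other status falls back to the base class.

```python
import json


class ZAIError(Exception):
    """Base error carrying the HTTP status and the `detail` message."""

    def __init__(self, status: int, detail: str):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


class AuthError(ZAIError): ...
class SpendCapError(ZAIError): ...
class RateLimitError(ZAIError): ...


_BY_STATUS = {401: AuthError, 402: SpendCapError, 429: RateLimitError}


def error_from_response(status: int, body: str) -> ZAIError:
    """Build the matching exception from a status code and raw response body."""
    try:
        detail = json.loads(body).get("detail", body)
    except (ValueError, AttributeError):
        detail = body  # body was not a JSON object; keep the raw text
    return _BY_STATUS.get(status, ZAIError)(status, str(detail))
```

A caller might retry with backoff on `RateLimitError` while surfacing `AuthError` and `SpendCapError` to the user immediately.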

Spend, costs and units

All spend values are reported in micro-USD (1 USD = 1,000,000 micro-USD), so 18234 means $0.018234. Values are integers throughout and can be handled as BigInt, which avoids floating-point rounding error.
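For display, micro-USD can be converted without passing through binary floating point. The Python sketch below uses `Decimal` and integer division; both helper names are illustrative.

```python
from decimal import Decimal

MICRO_PER_USD = 1_000_000


def micro_to_usd(micro: int) -> Decimal:
    """Convert micro-USD to an exact decimal dollar amount."""
    return Decimal(micro) / MICRO_PER_USD


def format_usd(micro: int) -> str:
    """Render micro-USD as a dollar string with six decimal places, using integer math only."""
    sign = "-" if micro < 0 else ""
    dollars, frac = divmod(abs(micro), MICRO_PER_USD)
    return f"{sign}${dollars}.{frac:06d}"
```

For example, `format_usd(18234)` returns `"$0.018234"`.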
