One API key. Every LLM.
Z-AI is an OpenAI-compatible gateway in front of every major model provider. Stop juggling SDKs, keys and dashboards — point your app at one endpoint and switch models with a string.
Built for teams that ship to production, not just prototype.
OpenAI, Anthropic, Gemini, Groq, Mistral, Together, Fireworks and Nexula. 60+ models.
Drop-in. Works with the official openai SDK, langchain, llamaindex, anything that speaks /v1/chat/completions.
Issue, revoke, cap spend and pin allowed models per key. BYOK your provider credentials, encrypted at rest.
SSE streaming on every model. Configure a fallback model per key when the primary is down.
A CLI and a VS Code extension shipped on day one — chat, models, logs and spend from your terminal and editor.
Every request logged with tokens, latency, cost, fallback hits. No black box.
OpenAI-compatible chat completions. Stream or non-stream. Bring your favorite client.
One-shot chat, model arena, key management and live logs from your terminal.
Sidebar chat, code actions, ghost-text completions. Works in VS Code, Cursor, VSCodium, Windsurf.
- Open Z-AI → API keys and create a key (starts with zk_live_…).
- In Providers, paste a provider API key (e.g. OpenAI). It is encrypted at rest with Fernet and never exposed in plaintext.
- Make your first call:
curl https://api.zyoralabs.com/v1/chat/completions \
-H "Authorization: Bearer zk_live_..." \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-5.4-mini",
"messages": [{"role":"user","content":"Hello"}]
}'

That request hits Z-AI, which calls OpenAI using your stored provider key, logs the request, deducts the cost, and returns the response.
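
The same call can be sketched in Python using only the standard library. The endpoint, the Authorization header and the model string come from the curl example above. The helper names (`build_chat_request`, `delta_from_sse_line`, `send`) are hypothetical. The `"stream": true` flag and the `data: ...` / `data: [DONE]` line format are assumptions based on the OpenAI streaming convention that Z-AI says it is compatible with. Treat this as a minimal sketch, not the official client.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ZAI_URL = "https://api.zyoralabs.com/v1/chat/completions"


def build_chat_request(api_key, model, user_text, stream=False):
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    if stream:
        # Assumption: streaming is requested with the OpenAI-style "stream" flag.
        payload["stream"] = True
    return urllib.request.Request(
        ZAI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def delta_from_sse_line(line):
    """Return the text carried by one SSE line, or None if it carries none.

    Assumes OpenAI-style chunks: 'data: {json}' lines, ended by 'data: [DONE]'.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines, comments, other SSE fields
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def send(request):
    """Send a non-streaming request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request("zk_live_...", "openai/gpt-5.4-mini", "Hello"))` performs the same request as the curl command. For a streamed response, read the body line by line and pass each decoded line to `delta_from_sse_line`, joining the non-None results.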