API Documentation

TokenFlow is fully compatible with the OpenAI API format. Just change your base URL.

⚡ Quick Start

TokenFlow uses the same request and response format as OpenAI, so the official OpenAI SDK works unchanged once it points at the TokenFlow base URL.

bash

# Install the OpenAI SDK
pip install openai

python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenflow.ai/v1",
    api_key="tf-your-api-key"
)

response = client.chat.completions.create(
    model="deepseek-v3",  # or "auto" for smart routing
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in 3 sentences."}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)

📦 Node.js

javascript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.tokenflow.ai/v1',
  apiKey: 'tf-your-api-key',
});

const response = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(response.choices[0].message.content);

🔗 cURL

bash

curl -N https://api.tokenflow.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tf-your-api-key" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "What is model routing?"}
    ],
    "stream": true
  }'
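
The cURL request above sets stream to true. A minimal Python sketch of reading the same kind of streamed response with the OpenAI SDK follows. The helper names collect_stream_text and stream_demo are illustrative, not part of TokenFlow or the SDK, and the sketch assumes TokenFlow streams chunks in the same shape as OpenAI.

```python
from typing import Iterable


def collect_stream_text(chunks: Iterable) -> str:
    """Print text deltas as they arrive and return the full reply."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices or an empty delta (for example the last one).
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)


def stream_demo() -> str:
    """Needs the openai package, network access, and a real API key."""
    # Imported lazily so the helper above has no third-party dependency.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.tokenflow.ai/v1",
        api_key="tf-your-api-key",
    )
    stream = client.chat.completions.create(
        model="auto",
        messages=[{"role": "user", "content": "What is model routing?"}],
        stream=True,
    )
    return collect_stream_text(stream)
```

Calling stream_demo() prints the reply incrementally and returns the assembled text.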

🧠 Smart Model Routing

Set model to "auto" to let TokenFlow pick a model for each request based on:

Complexity analysis — Simple tasks → cheap models, complex tasks → premium models
Cost optimization — Always picks the most cost-effective option
Latency targets — Routes to the fastest available endpoint
Fallback chains — Automatic retry with alternative models on failure

To pin a specific model instead, pass its ID directly, for example deepseek-v3, kimi-k2, or gpt-4o.
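
When you pin models yourself, a fallback chain can also be approximated on the client side. The sketch below is only an illustration of the idea, not TokenFlow's routing logic. complete_with_fallback and the call_model callback are hypothetical names. In practice call_model would wrap client.chat.completions.create and return the reply text.

```python
from typing import Callable, Optional, Sequence, Tuple


def complete_with_fallback(
    call_model: Callable[[str], str],
    models: Sequence[str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model, call_model(model)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    # Every model failed (or the list was empty); surface the last error as the cause.
    raise RuntimeError(f"all models failed: {list(models)}") from last_error
```

Ordering the list from cheapest to most expensive, such as ["deepseek-v3", "kimi-k2", "gpt-4o"], keeps the common case inexpensive while still falling back to a premium model on failure.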

📋 Supported Models

Model ID	Provider	Input ($/1M tokens)	Output ($/1M tokens)
`deepseek-v3`	DeepSeek	$0.14	$0.28
`deepseek-r1`	DeepSeek	$0.55	$2.19
`kimi-k2`	Moonshot	$0.20	$0.60
`glm-4-plus`	Zhipu	$0.15	$0.50
`qwen-max`	Alibaba	$0.16	$0.64
`gpt-4o`	OpenAI	$2.50	$10.00
`claude-sonnet`	Anthropic	$3.00	$15.00
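
To turn the prices above into a per-request estimate, multiply each token count by its per-million price. The following sketch copies the prices from the table. The function name estimate_cost is illustrative, and actual billing may differ (for example through rounding or minimum charges, which this page does not describe).

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "deepseek-v3": (0.14, 0.28),
    "deepseek-r1": (0.55, 2.19),
    "kimi-k2": (0.20, 0.60),
    "glm-4-plus": (0.15, 0.50),
    "qwen-max": (0.16, 0.64),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet": (3.00, 15.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

A request with 1,200 input tokens and 500 output tokens comes to roughly $0.0003 on deepseek-v3 and $0.008 on gpt-4o. With the OpenAI SDK, the token counts are available as response.usage.prompt_tokens and response.usage.completion_tokens.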

⚠️ Error Handling

TokenFlow returns standard OpenAI-format errors:

json

{
  "error": {
    "message": "Rate limit exceeded. Try again in 1.5s",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}

Status	Meaning
401	Invalid API key
429	Rate limit exceeded
500	Upstream model error (auto-retried)
503	Service temporarily unavailable
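
Of the statuses above, 429 and 503 are usually worth retrying after a pause, while 401 is not. A minimal sketch of exponential backoff with jitter follows. The helper call_with_backoff and its parameters are hypothetical. It relies only on the exception exposing a status_code attribute, which the OpenAI Python SDK's API status errors do.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Statuses from the table that signal a temporary condition.
RETRYABLE_STATUSES = {429, 503}


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run request(), retrying on 429/503 with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            is_last_attempt = attempt == max_attempts - 1
            if status not in RETRYABLE_STATUSES or is_last_attempt:
                raise  # not retryable (e.g. 401) or out of attempts
            # Delays start near 0.5s, 1s, 2s; jitter adds up to 25% on top.
            delay = base_delay * (2 ** attempt) * (1 + 0.25 * random.random())
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Typical usage wraps the SDK call in a lambda, for example call_with_backoff(lambda: client.chat.completions.create(model="auto", messages=messages)). The OpenAI Python SDK also retries some failures on its own, configurable through the client's max_retries option, so a helper like this is mainly useful when you want control over the delays.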

🔧 Architecture: Powered by One-API

TokenFlow is built on top of the open-source One-API project, which provides:

Unified API gateway — All models accessed through one OpenAI-compatible endpoint
Load balancing — Automatic distribution across multiple upstream API keys
Token accounting — Per-user quota and billing at the token level
Rate limiting — Configurable per-user, per-model request throttling
Channel management — Hot-swap upstream providers without downtime

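To self-host the underlying One-API gateway, deploy it with Docker. The command below publishes the gateway on port 3000 and stores its data in /home/ubuntu/data/one-api: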

bash

docker run -d --restart always \
  --name one-api \
  -p 3000:3000 \
  -e TZ=Asia/Shanghai \
  -v /home/ubuntu/data/one-api:/data \
  justsong/one-api
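
Once the container is running, the same OpenAI SDK code can be pointed at the local gateway. The sketch below assumes the gateway serves its OpenAI-compatible API under /v1 on the published port, and that api_key is a token you created in the gateway's admin interface. The function names gateway_base_url and make_local_client are illustrative.

```python
def gateway_base_url(host: str = "localhost", port: int = 3000) -> str:
    """Build the base URL for a gateway started with -p 3000:3000."""
    return f"http://{host}:{port}/v1"


def make_local_client(api_key: str):
    """Return an OpenAI client bound to the self-hosted gateway."""
    # Imported lazily so gateway_base_url works without the openai package.
    from openai import OpenAI

    return OpenAI(base_url=gateway_base_url(), api_key=api_key)
```

Which model IDs the local gateway accepts depends on the upstream channels you configure in it.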