Install
First, install the AIsa CLI if you have not already:What can agents do with it?
Model matching
Pick a model based on task type and constraints.
Provider coverage
Route across GPT, Claude, Gemini, Qwen, DeepSeek, Grok, and more.
Cost-aware routing
Choose cheaper models when quality needs allow it.
Fallback planning
Suggest alternates when a model is unavailable.
🔥 What Can You Do?
Multi-Model Chat
Model Comparison
Vision Analysis
Cost Optimization
Fallback Strategy
Why LLM Router?
| Feature | LLM Router | Direct APIs |
|---|---|---|
| API Keys | 1 | 10+ |
| SDK Compatibility | OpenAI SDK | Multiple SDKs |
| Billing | Unified | Per-provider |
| Model Switching | Change string | Code rewrite |
| Fallback Routing | Built-in | DIY |
| Cost Tracking | Unified | Fragmented |
Supported Model Families
| Family | Developer | Example Models |
|---|---|---|
| GPT | OpenAI | gpt-4.1, gpt-4o, gpt-4o-mini, o1, o1-mini, o3-mini |
| Claude | Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-sonnet |
| Gemini | gemini-3-pro-preview, gemini-3.5-flash | |
| Qwen | Alibaba | qwen-max, qwen-plus, qwen2.5-72b-instruct |
| Deepseek | Deepseek | deepseek-chat, deepseek-coder, deepseek-v3, deepseek-r1 |
| Grok | xAI | grok-2, grok-beta |
Note: Model availability may vary. Check console.aisa.one/pricing for the full list of currently available models and pricing.
Quick Start
API Endpoints
OpenAI-Compatible Chat Completions
Request
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
model | string | Yes | Model identifier (e.g., gpt-4.1, claude-3-sonnet) |
messages | array | Yes | Conversation messages |
temperature | number | No | Randomness (0-2, default: 1) |
max_tokens | integer | No | Maximum response tokens |
stream | boolean | No | Enable streaming (default: false) |
top_p | number | No | Nucleus sampling (0-1) |
frequency_penalty | number | No | Frequency penalty (-2 to 2) |
presence_penalty | number | No | Presence penalty (-2 to 2) |
stop | string/array | No | Stop sequences |
Message Format
Response
Streaming Response
Vision / Image Analysis
Analyze images by passing image URLs or base64 data:Function Calling
Enable tools/functions for structured outputs:Google Gemini Format
For Gemini models, you can also use the native format:Python Client
Installation
No installation required - uses standard library only.CLI Usage
Python SDK Usage
Use Cases
1. Cost-Optimized Routing
Use cheaper models for simple tasks:2. Fallback Strategy
Automatic fallback on failure:3. Model A/B Testing
Compare model outputs:4. Specialized Model Selection
Choose the best model for each task:Error Handling
Errors return JSON witherror field:
401- Invalid or missing API key402- Insufficient credits404- Model not found429- Rate limit exceeded500- Server error
Best Practices
- Use streaming for long responses to improve UX
- Set max_tokens to control costs
- Implement fallback for production reliability
- Cache responses for repeated queries
- Monitor usage via response metadata
- Use appropriate models - don’t use GPT-4 for simple tasks
OpenAI SDK Compatibility
Just change the base URL and key:Pricing
Token-based pricing varies by model. Check console.aisa.one/pricing for current rates.| Model Family | Approximate Cost |
|---|---|
| GPT-4.1 / GPT-4o | ~$0.01 / 1K tokens |
| Claude-3-Sonnet | ~$0.01 / 1K tokens |
| Gemini-2.0-Flash | ~$0.001 / 1K tokens |
| Qwen-Max | ~$0.005 / 1K tokens |
| DeepSeek-V3 | ~$0.002 / 1K tokens |
usage.cost and usage.credits_remaining.
Get Started
- Sign up at aisa.one
- Get your API key from the dashboard
- Add credits (pay-as-you-go)
- Set environment variable:
export AISA_API_KEY="your-key"
Full API Reference
See API Reference for complete endpoint documentation.Get started
- Sign up at aisa.one (new accounts start with $2 free credit).
- Generate an API key from the console.
- Set your key and install the skill:
- Start a new agent session so the runtime loads the updated skill instructions.
Related
Model Catalog
Browse supported model IDs and families.
Compare models
Compare models before routing production traffic.
AIsa CN-LLM Route
Chinese-language model routing.