Budget & Quotas
Budget and quota controls prevent runaway usage and align AI spend with organizational objectives.
AI cost is not a model problem — it is a governance problem. Set limits before you need them.
Budget vs Quota
| Concept | Unit | Applied to |
|---|---|---|
| Budget | USD ($) | Team or Virtual Key |
| Quota | Tokens per minute (TPM) or requests per minute (RPM) | Team or Virtual Key |
Use Budget for financial control. Use Quota for operational control (e.g., rate limiting a specific app).
Edit a Team Budget
- Go to Teams.
- Open the Team by clicking on its ID or the Edit team action.
- Go to the Settings tab.
- Click the Edit Settings button.
- Set:
  - Max Budget (USD): hard limit. Requests are blocked once it is reached.
  - Soft Budget (USD): soft limit. Alert emails are triggered once it is reached.
  - Soft Budget Alerting emails: a comma-separated list of alert recipients for the team budget.
  - Team Member Budget (USD): a budget per team member.
  - Reset budget: daily, weekly, or monthly reset period.
- Click Save Changes.
Team soft budget alerts are sent by email. An active email integration (SendGrid, Resend, or SMTP) must be configured on the Dedicated Gateway instance for alerts to be delivered.
The IG1 AI Professional Service Team can set this up on demand.
Contact the IG1 Team through Support for more details.
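The relationship between the soft and hard team limits can be sketched as a small classification function. This is an illustration only, not the gateway's implementation: the names `TeamBudget` and `budget_status` are hypothetical, and whether the gateway compares with "greater than" or "greater than or equal" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TeamBudget:
    # Hypothetical container mirroring the two settings described above.
    max_budget_usd: Optional[float]   # hard limit; None means no hard limit
    soft_budget_usd: Optional[float]  # alert threshold; None means no alerts


def budget_status(spend_usd: float, budget: TeamBudget) -> str:
    """Classify the current period's spend as 'blocked', 'alert', or 'ok'."""
    # Assumption: a limit counts as reached when spend is >= the limit.
    if budget.max_budget_usd is not None and spend_usd >= budget.max_budget_usd:
        return "blocked"  # hard limit reached: requests are rejected
    if budget.soft_budget_usd is not None and spend_usd >= budget.soft_budget_usd:
        return "alert"    # soft limit reached: alert emails are triggered
    return "ok"
```

Setting the soft budget somewhat below the max budget gives recipients time to react before requests start being blocked.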
Edit a Virtual Key Budget
- Go to Virtual Keys.
- Open the target key.
- Go to the Settings tab.
- Click the Edit Settings button.
- Set:
  - Max Budget (USD): hard limit. Requests are blocked once it is reached.
  - Reset budget: daily, weekly, or monthly reset period.
- Click Save Changes.
Key-level budgets apply in addition to team budgets: a key cannot exceed its own limit even if the team budget has remaining capacity.
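One way to picture how key and team budgets interact is a check that must pass at both levels. The sketch below is a hypothetical model, not the gateway's code: the function name and parameters are invented for illustration, and it assumes that an exhausted team budget also blocks every key in that team.

```python
from typing import Optional


def request_allowed(
    key_spend_usd: float,
    key_budget_usd: Optional[float],
    team_spend_usd: float,
    team_budget_usd: Optional[float],
) -> bool:
    """Return True only if neither the key nor the team has reached its budget."""
    # None means "no budget configured" at that level (an assumption).
    key_ok = key_budget_usd is None or key_spend_usd < key_budget_usd
    team_ok = team_budget_usd is None or team_spend_usd < team_budget_usd
    return key_ok and team_ok
```

For example, a key with a 10 USD budget that has already spent 10 USD is rejected even when its team has used only 200 USD of a 1000 USD budget.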
Set Token Quotas
- Open a Team or Virtual Key.
- Go to the Settings tab.
- Click the Edit Settings button.
- Set:
  - Requests per Minute (RPM)
  - Tokens per Minute (TPM)
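An application can also track its own usage so that it rarely hits the configured quota. The following is a minimal client-side sketch using a sliding 60-second window. The class name `MinuteWindowLimiter` is hypothetical, and the windowing algorithm the gateway itself uses is not specified here, so treat this as an approximation.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Client-side guard that tracks requests and tokens over the last 60 seconds."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock  # injectable so the limiter can be tested
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)

    def _evict(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both quotas."""
        now = self.clock()
        self._evict(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) + 1 > self.rpm or used_tokens + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        return True
```

Call `try_acquire(estimated_tokens)` before each request and delay or queue the request when it returns `False`. The token count is an estimate made by the caller, so the gateway may still return a 429 occasionally.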
What happens when a limit is reached?
| Limit type | Behavior |
|---|---|
| Budget exceeded | 429 - Budget exceeded |
| RPM exceeded | 429 - Rate limit exceeded |
| TPM exceeded | 429 - Rate limit exceeded |
Always handle 429 errors in your application with exponential backoff.
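A minimal sketch of exponential backoff with jitter is shown below. It is independent of any HTTP client: `RateLimited` is a hypothetical exception that your own request function would raise when the gateway responds with 429, and the default retry count, base delay, and cap are illustrative values rather than recommendations from the gateway.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raised by the caller's request function on HTTP 429."""


def call_with_backoff(
    send: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            # Upper bound doubles each attempt (base, 2*base, 4*base, ...) up to cap.
            delay = min(cap, base * 2 ** attempt)
            # "Full jitter": sleep a random time in [0, delay] to avoid
            # many clients retrying in lockstep.
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

Keep `max_retries` bounded. Backoff helps with RPM and TPM limits, which clear within about a minute, but a budget-exceeded 429 will presumably persist until the budget is raised or reset, so retrying it indefinitely only adds load.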