
Budget & Quotas

Budget and quota controls prevent runaway usage and align AI spend with organizational objectives.

tip

AI cost is not a model problem — it is a governance problem. Set limits before you need them.


Budget vs Quota

| Concept | Unit | Applied to |
|---|---|---|
| Budget | USD ($) | Team or Virtual Key |
| Quota | Tokens per minute (TPM) or requests per minute (RPM) | Team or Virtual Key |

Use Budget for financial control. Use Quota for operational control (e.g., rate limiting a specific app).


Edit a Team Budget

  1. Go to Teams.
  2. Open the team by clicking its ID or the Edit team action.
  3. Go to the Settings tab.
  4. Click the Edit Settings button.
  5. Set:
    • Max Budget (USD): hard limit. Requests are blocked once it is reached.
    • Soft Budget (USD): soft limit. Alert emails are sent once it is reached.
    • Soft Budget Alerting emails: comma-separated list of recipients for team budget alerts.
    • Team Member Budget (USD): budget per team member.
    • Reset budget: reset period (daily, weekly, or monthly).
  6. Click Save Changes.
Email Integration Required

Team soft budget alerts are sent by email. An active email integration (SendGrid, Resend, or SMTP) must be configured on the Dedicated Gateway instance for alerts to be delivered.

The IG1 AI Professional Service Team can set this up on request.

Contact the IG1 team through Support for details.
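
To make the difference between the hard and soft limits concrete, here is a minimal sketch of how the two thresholds described in the steps above relate to current spend. It is an illustration only, not the gateway's implementation: the `BudgetState` class, the `evaluate` function, and the returned labels are hypothetical names, and the "blocked once reached" rule is modeled as spend greater than or equal to the limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetState:
    """Hypothetical snapshot of a team's spend and its configured limits."""
    spend_usd: float
    max_budget_usd: Optional[float] = None   # hard limit (Max Budget)
    soft_budget_usd: Optional[float] = None  # soft limit (Soft Budget)


def evaluate(state: BudgetState) -> str:
    """Return 'blocked', 'alert', or 'ok' for the current spend."""
    # Hard limit: requests are blocked once the limit is reached.
    if state.max_budget_usd is not None and state.spend_usd >= state.max_budget_usd:
        return "blocked"
    # Soft limit: requests still pass, but alert emails would be triggered.
    if state.soft_budget_usd is not None and state.spend_usd >= state.soft_budget_usd:
        return "alert"
    return "ok"


print(evaluate(BudgetState(spend_usd=85.0, max_budget_usd=100.0, soft_budget_usd=80.0)))
```

Setting the soft budget below the max budget gives recipients time to react before requests start being blocked.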


Edit a Virtual Key Budget

  1. Go to Virtual Keys.
  2. Open the target key.
  3. Go to the Settings tab.
  4. Click the Edit Settings button.
  5. Set:
    • Max Budget (USD): hard limit. Requests are blocked once it is reached.
    • Reset budget: reset period (daily, weekly, or monthly).
  6. Click Save Changes.
note

Key-level budgets are enforced in addition to team budgets: a key cannot exceed its own limit even if the team budget has remaining capacity.
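
The note above can be read as two independent checks. The sketch below is a simplified model under that reading, not the gateway's actual logic: `key_request_allowed` and its parameters are hypothetical, and the team-side check (a key being blocked once the team budget is reached) is an assumption that this page does not state explicitly.

```python
from typing import Optional


def key_request_allowed(
    key_spend: float,
    key_max: Optional[float],
    team_spend: float,
    team_max: Optional[float],
) -> bool:
    """Return True only if the request fits under every configured limit."""
    # From the note: a key cannot exceed its own limit,
    # regardless of how much team budget remains.
    key_ok = key_max is None or key_spend < key_max
    # Assumption: the team limit also applies to every key in the team.
    team_ok = team_max is None or team_spend < team_max
    return key_ok and team_ok


# Key has reached its 10 USD limit; the team still has 80 USD left.
print(key_request_allowed(key_spend=10.0, key_max=10.0, team_spend=20.0, team_max=100.0))
```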


Set Token Quotas

  1. Open a Team or Virtual Key.
  2. Go to the Settings tab.
  3. Click the Edit Settings button.
  4. Set:
    • Requests per Minute (RPM)
    • Tokens per Minute (TPM)
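
An application can also throttle itself so that it rarely reaches the RPM and TPM values set above. The following is a minimal client-side sketch, assuming a 60-second sliding window; the gateway's own windowing algorithm is not described here and may differ. `MinuteWindowLimiter` and `try_acquire` are hypothetical names, and the token count passed in is the caller's own estimate.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class MinuteWindowLimiter:
    """Client-side tracker of requests and tokens sent in the last 60 seconds."""

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)

    def _evict(self, now: float) -> None:
        # Drop events that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits under both limits."""
        now = self.clock()
        self._evict(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) + 1 > self.rpm or used_tokens + tokens > self.tpm:
            return False  # caller should wait before sending
        self.events.append((now, tokens))
        return True


limiter = MinuteWindowLimiter(rpm=60, tpm=10_000)
if limiter.try_acquire(tokens=1_200):
    pass  # send the request here
```

A local limiter only sees traffic from one process, so requests can still be rejected when several applications share the same team or key.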

What happens when a limit is reached?

| Limit type | Behavior |
|---|---|
| Budget exceeded | 429 - Budget exceeded |
| RPM exceeded | 429 - Rate limit exceeded |
| TPM exceeded | 429 - Rate limit exceeded |

Always handle 429 errors in your application with exponential backoff.
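
A minimal sketch of exponential backoff with jitter is shown below. It is deliberately independent of any HTTP library: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status_code, body)` tuple, and the default retry count and delays are illustrative values rather than recommendations from the gateway.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry on HTTP 429 with exponential backoff and jitter."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        status, body = send()
        # Return on any non-429 response, or when retries are exhausted.
        if status != 429 or attempt == max_retries:
            return status, body
        # The delay ceiling doubles each attempt, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Full jitter: sleep a random time in [0, delay] to spread out retries.
        sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")


# Usage with a stand-in for a real request function:
responses = iter([(429, "Rate limit exceeded"), (200, "ok")])
print(call_with_backoff(lambda: next(responses), base_delay=0.01))
```

Backoff helps most with RPM and TPM limits, which clear within about a minute. A response caused by an exhausted budget will likely keep returning 429 until the budget resets or is raised, so capping the number of retries, as `max_retries` does here, avoids retrying indefinitely.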