v1.77.5-stable - MCP OAuth 2.0 Support
Deploy this version
- Docker
- Pip
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.77.5-stable
pip install litellm==1.77.5
Key Highlights
- MCP OAuth 2.0 Support - Enhanced authentication for Model Context Protocol integrations
- Scheduled Key Rotations - Automated key rotation capabilities for enhanced security
- New Gemini 2.5 Flash & Flash-lite Models - Latest September 2025 preview models with improved pricing and features
- Performance Improvements - 54% RPS improvement
Scheduled Key Rotations
This release adds support for scheduling virtual key rotations on the LiteLLM AI Gateway.
This is useful for Proxy Admins looking to enforce enterprise-grade security for use cases going through the LiteLLM AI Gateway.
From this release, you can enforce virtual keys to rotate on a schedule of your choice, e.g. every 15, 30, or 60 days.
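The release notes don't show the exact API surface for configuring a schedule. As a minimal sketch, assuming a hypothetical interval string like "30d" (the helper and format here are illustrative, not LiteLLM's actual parameters), the next rotation time for a key could be computed as:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: parse an interval like "15d", "30d", or "60d"
# and compute when a virtual key should next be rotated.
def next_rotation(created_at: datetime, interval: str) -> datetime:
    if not interval.endswith("d"):
        raise ValueError(f"unsupported interval: {interval}")
    days = int(interval[:-1])
    return created_at + timedelta(days=days)

created = datetime(2025, 9, 1, tzinfo=timezone.utc)
print(next_rotation(created, "30d"))  # 2025-10-01 00:00:00+00:00
```

A scheduler would then compare `next_rotation(...)` against the current time and mint a replacement key when it has passed.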
Performance Improvements - 54% RPS Improvement
This release brings a 54% RPS improvement (1,040 → 1,602 RPS, aggregated) per instance.
The improvement comes from fixing O(n²) inefficiencies in the LiteLLM Router, primarily caused by repeated use of in statements inside loops over large arrays.
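This class of fix is easy to illustrate (the sketch below is illustrative, not the actual Router code): an in membership test against a list is O(n), so running it inside a loop over n items costs O(n²), while hashing the list into a set once makes each lookup O(1):

```python
# Illustrative only -- not the actual LiteLLM Router code.
def filter_unhealthy_slow(deployments, unhealthy_ids):
    # O(n^2): `in` scans the unhealthy_ids list for every deployment
    return [d for d in deployments if d not in unhealthy_ids]

def filter_unhealthy_fast(deployments, unhealthy_ids):
    # O(n): build the set once, then each membership test is O(1)
    unhealthy = set(unhealthy_ids)
    return [d for d in deployments if d not in unhealthy]

deployments = list(range(10_000))
unhealthy_ids = list(range(0, 10_000, 2))
# Same result, asymptotically cheaper
assert filter_unhealthy_slow(deployments, unhealthy_ids) == filter_unhealthy_fast(deployments, unhealthy_ids)
```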
Tests were run with a database-only setup (no cache hits).
Test Setupโ
All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up of 500. The environment was configured to stress the routing layer and eliminate caching as a variable.
System Specs
- CPU: 8 vCPUs
- Memory: 32 GB RAM
Configuration (config.yaml)
View the complete configuration: gist.github.com/AlexsanderHamir/config.yaml
Load Script (no_cache_hits.py)
View the complete load testing script: gist.github.com/AlexsanderHamir/no_cache_hits.py
New Models / Updated Models
New Model Support
Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
---|---|---|---|---|---|
Gemini | gemini-2.5-flash-preview-09-2025 | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
Gemini | gemini-2.5-flash-lite-preview-09-2025 | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
Gemini | gemini-flash-latest | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
Gemini | gemini-flash-lite-latest | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
DeepSeek | deepseek-chat | 131K | $0.60 | $1.70 | Chat, function calling, caching |
DeepSeek | deepseek-reasoner | 131K | $0.60 | $1.70 | Chat, reasoning |
Bedrock | deepseek.v3-v1:0 | 164K | $0.58 | $1.68 | Chat, reasoning, function calling |
Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
SambaNova | sambanova/DeepSeek-V3.1 | 33K | $3.00 | $4.50 | Chat, reasoning, function calling |
SambaNova | sambanova/gpt-oss-120b | 131K | $3.00 | $4.50 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-coder-480b-a35b-v1:0 | 262K | $0.22 | $1.80 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-235b-a22b-2507-v1:0 | 262K | $0.22 | $0.88 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-coder-30b-a3b-v1:0 | 262K | $0.15 | $0.60 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-32b-v1:0 | 131K | $0.15 | $0.60 | Chat, reasoning, function calling |
Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-instruct-maas | 262K | $0.15 | $1.20 | Chat, function calling |
Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-thinking-maas | 262K | $0.15 | $1.20 | Chat, function calling |
Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.1-maas | 164K | $1.35 | $5.40 | Chat, reasoning, function calling |
OpenRouter | openrouter/x-ai/grok-4-fast:free | 2M | $0.00 | $0.00 | Chat, reasoning, function calling |
XAI | xai/grok-4-fast-reasoning | 2M | $0.20 | $0.50 | Chat, reasoning, function calling |
XAI | xai/grok-4-fast-non-reasoning | 2M | $0.20 | $0.50 | Chat, function calling |
Features
- Gemini
- XAI
- Add xai/grok-4-fast models - PR #14833
- Anthropic
- Bedrock
- Vertex AI
- SambaNova
- Add sambanova deepseek v3.1 and gpt-oss-120b - PR #14866
- OpenAI
- OpenRouter
- Add gpt-5 and gpt-5-codex to OpenRouter cost map - PR #14879
- VLLM
- Fix vllm passthrough - PR #14778
- Flux
- Support flux image edit - PR #14790
Bug Fixes
- Anthropic
- OpenAI
- Fix a bug where openai image edit silently ignores multiple images - PR #14893
- VLLM
- Fix: vLLM provider's rerank endpoint from /v1/rerank to /rerank - PR #14938
New Provider Support
- W&B Inference
- Add W&B Inference to LiteLLM - PR #14416
LLM API Endpoints
Features
- General
Bugs
- General
Management Endpoints / UI
Features
- Proxy CLI Auth
- Virtual Keys
- Initial support for scheduled key rotations - PR #14877
- Allow scheduling key rotations when creating virtual keys - PR #14960
- Models + Endpoints
- Fix: added Oracle to provider's list - PR #14835
Bugs
- SSO - Fix: SSO "Clear" button writes empty values instead of removing SSO config - PR #14826
- Admin Settings - Remove useful links from admin settings - PR #14918
- Management Routes - Add /user/list to management routes - PR #14868
Logging / Guardrail / Prompt Management Integrations
Features
- DataDog
- Logging - datadog callback: log message content without sending to DataDog - PR #14909
- Langfuse
- Adding langfuse usage details for cached tokens - PR #10955
- Opik
- Improve opik integration code - PR #14888
- SQS
- Error logging support for SQS Logger - PR #14974
Guardrails
- LakeraAI v2 Guardrail - Ensure exception is raised correctly - PR #14867
- Presidio Guardrail - Support custom entity types in Presidio guardrail with Union[PiiEntityType, str] - PR #14899
- Noma Guardrail - Add noma guardrail provider to ui - PR #14415
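Accepting both a fixed enum and free-form strings, as the Union[PiiEntityType, str] change for Presidio describes, can be sketched like this (class and function names here are illustrative, not the actual guardrail code):

```python
from enum import Enum
from typing import Union

class PiiEntityType(str, Enum):
    # Illustrative subset of standard Presidio entity types
    EMAIL_ADDRESS = "EMAIL_ADDRESS"
    PHONE_NUMBER = "PHONE_NUMBER"

def normalize_entity(entity: Union[PiiEntityType, str]) -> str:
    # Built-in enum members and custom entity names (e.g. from a
    # recognizer you registered yourself) both collapse to plain strings.
    return entity.value if isinstance(entity, PiiEntityType) else entity

assert normalize_entity(PiiEntityType.EMAIL_ADDRESS) == "EMAIL_ADDRESS"
assert normalize_entity("TITLE") == "TITLE"  # custom entity passes through
```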
Prompt Management
- BitBucket Integration - Add BitBucket Integration for Prompt Management - PR #14882
Spend Tracking, Budgets and Rate Limiting
- Service Tier Pricing - Add service_tier based pricing support for openai (BOTH Service & Priority Support) - PR #14796
- Cost Tracking - Show input, output, tool call cost breakdown in StandardLoggingPayload - PR #14921
- Parallel Request Limiter v3
- Priority Reservation - Fix: keys without priority metadata no longer receive higher priority than keys with explicit priority configurations - PR #14832
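The priority-reservation fix amounts to a default-ordering problem: keys with no priority metadata must sort below explicitly prioritized keys, not above them. A minimal sketch (illustrative, not LiteLLM's limiter code), assuming the convention that a lower number means higher priority:

```python
# Illustrative sketch, not LiteLLM's actual limiter code.
# Convention assumed here: lower number = higher priority.
LOWEST_PRIORITY = float("inf")

def effective_priority(key_metadata: dict) -> float:
    # A missing priority defaulting to 0 would jump ahead of every
    # explicitly configured key; default to the lowest priority instead.
    return key_metadata.get("priority", LOWEST_PRIORITY)

keys = [
    {"key": "no-priority"},
    {"key": "high", "priority": 1},
    {"key": "low", "priority": 5},
]
ordered = sorted(keys, key=effective_priority)
print([k["key"] for k in ordered])  # ['high', 'low', 'no-priority']
```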
MCP Gateway
- MCP Configuration - Enable custom fields in mcp_info configuration - PR #14794
- MCP Tools - Remove server_name prefix from list_tools - PR #14720
- OAuth Flow - Initial commit for v2 oauth flow - PR #14964
Performance / Loadbalancing / Reliability improvements
- Memory Leak Fix - Fix InMemoryCache unbounded growth when TTLs are set - PR #14869
- Cache Performance - Fix: cache root cause - PR #14827
- Concurrency Fix - Fix concurrency/scaling when many Python threads do streaming using sync completions - PR #14816
- Performance Optimization - Fix: reduce get_deployment cost to O(1) - PR #14967
- Performance Optimization - Fix: remove slow string operation - PR #14955
- DB Connection Management - Fix: DB connection state retries - PR #14925
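Several of the fixes above share one pattern: replacing a linear scan with a precomputed index. For example, a get_deployment-style lookup goes from O(n) to O(1) by keeping a dict keyed by model id (a sketch only, assuming a simple list-of-dicts deployment store, not the actual Router internals):

```python
# Sketch only -- assumes deployments are dicts with a "model_id" key.
class DeploymentIndex:
    def __init__(self, deployments):
        self._deployments = deployments
        # Build the index once; rebuild it whenever deployments change.
        self._by_id = {d["model_id"]: d for d in deployments}

    def get_deployment(self, model_id):
        # O(1) dict lookup instead of scanning the whole list.
        return self._by_id.get(model_id)

index = DeploymentIndex([
    {"model_id": "gpt-5-codex", "provider": "openai"},
    {"model_id": "deepseek-chat", "provider": "deepseek"},
])
assert index.get_deployment("deepseek-chat")["provider"] == "deepseek"
assert index.get_deployment("missing") is None
```

The trade-off is keeping the index in sync with the underlying list, which is why such caches are typically rebuilt on any deployment add/remove.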
Documentation Updates
- Provider Documentation - Fix docs for provider_specific_params.md - PR #14787
- Model References - Update model references from gemini-pro to gemini-2.5-pro - PR #14775
- Letta Guide - Add Letta Guide documentation - PR #14798
- README - Make the README document clearer - PR #14860
- Session Management - Update docs for session management availability - PR #14914
- Cost Documentation - Add documentation for additional cost-related keys in custom pricing - PR #14949
- Azure Passthrough - Add azure passthrough documentation - PR #14958
- General Documentation - Doc updates sept 2025 - PR #14769
- Clarified bridging between endpoints and mode in docs.
- Added Vertex AI Gemini API configuration as an alternative in relevant guides. Linked AWS authentication info in the Bedrock guardrails documentation.
- Added Cancel Response API usage with code snippets
- Clarified that SSO (Single Sign-On) is free for up to 5 users.
- Alphabetized sidebar, leaving quick start / intros at top of categories
- Documented max_connections under cache_params.
- Clarified IAM AssumeRole Policy requirements.
- Added transform utilities example to Getting Started (showing request transformation).
- Added references to models.litellm.ai as the full models list in various docs.
- Added a code snippet for async_post_call_success_hook.
- Removed broken links to callbacks management guide.
- Reformatted and linked cookbooks + other relevant docs.
- Documentation Corrections - Corrected docs updates sept 2025 - PR #14916
New Contributors
- @uzaxirr made their first contribution in PR #14761
- @xprilion made their first contribution in PR #14416
- @CH-GAGANRAJ made their first contribution in PR #14779
- @otaviofbrito made their first contribution in PR #14778
- @danielmklein made their first contribution in PR #14639
- @Jetemple made their first contribution in PR #14826
- @akshoop made their first contribution in PR #14818
- @hazyone made their first contribution in PR #14821
- @leventov made their first contribution in PR #14816
- @fabriciojoc made their first contribution in PR #10955
- @onlylonly made their first contribution in PR #14845
- @Copilot made their first contribution in PR #14869
- @arsh72 made their first contribution in PR #14899
- @berri-teddy made their first contribution in PR #14914
- @vpbill made their first contribution in PR #14415
- @kgritesh made their first contribution in PR #14893
- @oytunkutrup1 made their first contribution in PR #14858
- @nherment made their first contribution in PR #14933
- @deepanshululla made their first contribution in PR #14974
- @TeddyAmkie made their first contribution in PR #14758
- @SmartManoj made their first contribution in PR #14775
- @uc4w6c made their first contribution in PR #14720
- @luizrennocosta made their first contribution in PR #14783
- @AlexsanderHamir made their first contribution in PR #14827
- @dharamendrak made their first contribution in PR #14721
- @TomeHirata made their first contribution in PR #14164
- @mrFranklin made their first contribution in PR #14860
- @luisfucros made their first contribution in PR #14866
- @huangyafei made their first contribution in PR #14879
- @thiswillbeyourgithub made their first contribution in PR #14949
- @Maximgitman made their first contribution in PR #14965
- @subnet-dev made their first contribution in PR #14938
- @22mSqRi made their first contribution in PR #14972