v1.77.5-stable - MCP OAuth 2.0 Support
Deploy this version
- Docker
- Pip
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.77.5-stable
pip install litellm==1.77.5
Key Highlights
- MCP OAuth 2.0 Support - Enhanced authentication for Model Context Protocol integrations
- Scheduled Key Rotations - Automated key rotation capabilities for enhanced security
- New Gemini 2.5 Flash & Flash-lite Models - Latest September 2025 preview models with improved pricing and features
- Performance Improvements - 54% RPS improvement
Scheduled Key Rotations
This release adds support for scheduling virtual key rotations on the LiteLLM AI Gateway.
This is useful for Proxy Admins looking to enforce enterprise-grade security for use cases going through the LiteLLM AI Gateway.
From this release, you can enforce virtual keys to rotate on a schedule of your choice, e.g. every 15, 30, or 60 days.
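The release notes don't show the exact API surface for configuring a schedule. As a minimal sketch, assuming a hypothetical interval string like "30d" (the helper and format here are illustrative, not LiteLLM's actual parameters), the next rotation time for a key could be computed as:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper: parse an interval like "15d", "30d", or "60d"
# and compute when a virtual key should next be rotated.
def next_rotation(created_at: datetime, interval: str) -> datetime:
    if not interval.endswith("d"):
        raise ValueError(f"unsupported interval: {interval}")
    days = int(interval[:-1])
    return created_at + timedelta(days=days)

created = datetime(2025, 9, 1, tzinfo=timezone.utc)
print(next_rotation(created, "30d"))  # 2025-10-01 00:00:00+00:00
```

A scheduler would then compare `next_rotation(...)` against the current time and mint a replacement key when it has passed.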
Performance Improvements - 54% RPS Improvement
This release brings a 54% RPS improvement (1,040 → 1,602 RPS, aggregated) per instance.
The improvement comes from fixing O(n²) inefficiencies in the LiteLLM Router, primarily caused by repeated use of in statements inside loops over large arrays.
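This class of fix is easy to illustrate (the sketch below is illustrative, not the actual Router code): an in membership test against a list is O(n), so running it inside a loop over n items costs O(n²), while hashing the list into a set once makes each lookup O(1):

```python
# Illustrative only -- not the actual LiteLLM Router code.
def filter_unhealthy_slow(deployments, unhealthy_ids):
    # O(n^2): `in` scans the unhealthy_ids list for every deployment
    return [d for d in deployments if d not in unhealthy_ids]

def filter_unhealthy_fast(deployments, unhealthy_ids):
    # O(n): build the set once, then each membership test is O(1)
    unhealthy = set(unhealthy_ids)
    return [d for d in deployments if d not in unhealthy]

deployments = list(range(10_000))
unhealthy_ids = list(range(0, 10_000, 2))
# Same result, asymptotically cheaper
assert filter_unhealthy_slow(deployments, unhealthy_ids) == filter_unhealthy_fast(deployments, unhealthy_ids)
```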
Tests were run with a database-only setup (no cache hits).
Test Setupโ
All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up of 500. The environment was configured to stress the routing layer and eliminate caching as a variable.
System Specs
- CPU: 8 vCPUs
- Memory: 32 GB RAM
Configuration (config.yaml)
View the complete configuration: gist.github.com/AlexsanderHamir/config.yaml
Load Script (no_cache_hits.py)
View the complete load testing script: gist.github.com/AlexsanderHamir/no_cache_hits.py
New Models / Updated Models
New Model Support
Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
---|---|---|---|---|---|
Gemini | gemini-2.5-flash-preview-09-2025 | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
Gemini | gemini-2.5-flash-lite-preview-09-2025 | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
Gemini | gemini-flash-latest | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
Gemini | gemini-flash-lite-latest | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
DeepSeek | deepseek-chat | 131K | $0.60 | $1.70 | Chat, function calling, caching |
DeepSeek | deepseek-reasoner | 131K | $0.60 | $1.70 | Chat, reasoning |
Bedrock | deepseek.v3-v1:0 | 164K | $0.58 | $1.68 | Chat, reasoning, function calling |
Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
SambaNova | sambanova/DeepSeek-V3.1 | 33K | $3.00 | $4.50 | Chat, reasoning, function calling |
SambaNova | sambanova/gpt-oss-120b | 131K | $3.00 | $4.50 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-coder-480b-a35b-v1:0 | 262K | $0.22 | $1.80 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-235b-a22b-2507-v1:0 | 262K | $0.22 | $0.88 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-coder-30b-a3b-v1:0 | 262K | $0.15 | $0.60 | Chat, reasoning, function calling |
Bedrock | qwen.qwen3-32b-v1:0 | 131K | $0.15 | $0.60 | Chat, reasoning, function calling |
Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-instruct-maas | 262K | $0.15 | $1.20 | Chat, function calling |
Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-thinking-maas | 262K | $0.15 | $1.20 | Chat, function calling |
Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.1-maas | 164K | $1.35 | $5.40 | Chat, reasoning, function calling |
OpenRouter | openrouter/x-ai/grok-4-fast:free | 2M | $0.00 | $0.00 | Chat, reasoning, function calling |
XAI | xai/grok-4-fast-reasoning | 2M | $0.20 | $0.50 | Chat, reasoning, function calling |
XAI | xai/grok-4-fast-non-reasoning | 2M | $0.20 | $0.50 | Chat, function calling |
Features
- Gemini
- XAI
- Add xai/grok-4-fast models - PR #14833
- Anthropic
- Bedrock
- Vertex AI
- SambaNova
- Add sambanova deepseek v3.1 and gpt-oss-120b - PR #14866
- OpenAI
- OpenRouter
- Add gpt-5 and gpt-5-codex to OpenRouter cost map - PR #14879
- VLLM
- Fix vllm passthrough - PR #14778
- Flux
- Support flux image edit - PR #14790
Bug Fixes
- Anthropic
- OpenAI
- Fix a bug where openai image edit silently ignores multiple images - PR #14893
- VLLM
- Fix: vLLM provider's rerank endpoint from /v1/rerank to /rerank - PR #14938
New Provider Support
- W&B Inference
- Add W&B Inference to LiteLLM - PR #14416
LLM API Endpoints
Features
- General
Bugs
- General
Management Endpoints / UI
Features
- Proxy CLI Auth
- Virtual Keys
- Initial support for scheduled key rotations - PR #14877
- Allow scheduling key rotations when creating virtual keys - PR #14960
- Models + Endpoints
- Fix: added Oracle to provider's list - PR #14835
Bugs
- SSO - Fix: SSO "Clear" button writes empty values instead of removing SSO config - PR #14826
- Admin Settings - Remove useful links from admin settings - PR #14918
- Management Routes - Add /user/list to management routes - PR #14868
Logging / Guardrail / Prompt Management Integrations
Features
- DataDog
- Logging - datadog callback: log message content without sending to DataDog - PR #14909
- Langfuse
- Adding langfuse usage details for cached tokens - PR #10955
- Opik
- Improve opik integration code - PR #14888
- SQS
- Error logging support for SQS Logger - PR #14974
Guardrails
- LakeraAI v2 Guardrail - Ensure exception is raised correctly - PR #14867
- Presidio Guardrail - Support custom entity types in Presidio guardrail with Union[PiiEntityType, str] - PR #14899
- Noma Guardrail - Add noma guardrail provider to ui - PR #14415
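Accepting both a fixed enum and free-form strings, as the Union[PiiEntityType, str] change for Presidio describes, can be sketched like this (class and function names here are illustrative, not the actual guardrail code):

```python
from enum import Enum
from typing import Union

class PiiEntityType(str, Enum):
    # Illustrative subset of standard Presidio entity types
    EMAIL_ADDRESS = "EMAIL_ADDRESS"
    PHONE_NUMBER = "PHONE_NUMBER"

def normalize_entity(entity: Union[PiiEntityType, str]) -> str:
    # Built-in enum members and custom entity names (e.g. from a
    # recognizer you registered yourself) both collapse to plain strings.
    return entity.value if isinstance(entity, PiiEntityType) else entity

assert normalize_entity(PiiEntityType.EMAIL_ADDRESS) == "EMAIL_ADDRESS"
assert normalize_entity("TITLE") == "TITLE"  # custom entity passes through
```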
Prompt Management
- BitBucket Integration - Add BitBucket Integration for Prompt Management - PR #14882
Spend Tracking, Budgets and Rate Limiting
- Service Tier Pricing - Add service_tier based pricing support for openai (BOTH Service & Priority Support) - PR #14796
- Cost Tracking - Show input, output, tool call cost breakdown in StandardLoggingPayload - PR #14921
- Parallel Request Limiter v3
- Priority Reservation - Fix: keys without priority metadata no longer receive higher priority than keys with explicit priority configurations - PR #14832
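The priority-reservation fix amounts to a default-ordering problem: keys with no priority metadata must sort below explicitly prioritized keys, not above them. A minimal sketch (illustrative, not LiteLLM's limiter code), assuming the convention that a lower number means higher priority:

```python
# Illustrative sketch, not LiteLLM's actual limiter code.
# Convention assumed here: lower number = higher priority.
LOWEST_PRIORITY = float("inf")

def effective_priority(key_metadata: dict) -> float:
    # A missing priority defaulting to 0 would jump ahead of every
    # explicitly configured key; default to the lowest priority instead.
    return key_metadata.get("priority", LOWEST_PRIORITY)

keys = [
    {"key": "no-priority"},
    {"key": "high", "priority": 1},
    {"key": "low", "priority": 5},
]
ordered = sorted(keys, key=effective_priority)
print([k["key"] for k in ordered])  # ['high', 'low', 'no-priority']
```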
MCP Gateway
- MCP Configuration - Enable custom fields in mcp_info configuration - PR #14794
- MCP Tools - Remove server_name prefix from list_tools - PR #14720
- OAuth Flow - Initial commit for v2 oauth flow - PR #14964
Performance / Loadbalancing / Reliability improvements
- Memory Leak Fix - Fix InMemoryCache unbounded growth when TTLs are set - PR #14869
- Cache Performance - Fix: cache root cause - PR #14827
- Concurrency Fix - Fix concurrency/scaling when many Python threads do streaming using sync completions - PR #14816
- Performance Optimization - Fix: reduce get_deployment cost to O(1) - PR #14967
- Performance Optimization - Fix: remove slow string operation - PR #14955
- DB Connection Management - Fix: DB connection state retries - PR #14925
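Several of the fixes above share one pattern: replacing a linear scan with a precomputed index. For example, a get_deployment-style lookup goes from O(n) to O(1) by keeping a dict keyed by model id (a sketch only, assuming a simple list-of-dicts deployment store, not the actual Router internals):

```python
# Sketch only -- assumes deployments are dicts with a "model_id" key.
class DeploymentIndex:
    def __init__(self, deployments):
        self._deployments = deployments
        # Build the index once; rebuild it whenever deployments change.
        self._by_id = {d["model_id"]: d for d in deployments}

    def get_deployment(self, model_id):
        # O(1) dict lookup instead of scanning the whole list.
        return self._by_id.get(model_id)

index = DeploymentIndex([
    {"model_id": "gpt-5-codex", "provider": "openai"},
    {"model_id": "deepseek-chat", "provider": "deepseek"},
])
assert index.get_deployment("deepseek-chat")["provider"] == "deepseek"
assert index.get_deployment("missing") is None
```

The trade-off is keeping the index in sync with the underlying list, which is why such caches are typically rebuilt on any deployment add/remove.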
Documentation Updates
- Provider Documentation - Fix docs for provider_specific_params.md - PR #14787
- Model References - Update model references from gemini-pro to gemini-2.5-pro - PR #14775
- Letta Guide - Add Letta Guide documentation - PR #14798
- README - Make the README document clearer - PR #14860
- Session Management - Update docs for session management availability - PR #14914
- Cost Documentation - Add documentation for additional cost-related keys in custom pricing - PR #14949
- Azure Passthrough - Add azure passthrough documentation - PR #14958
- General Documentation - Doc updates sept 2025 - PR #14769
- Clarified bridging between endpoints and mode in docs.
- Added Vertex AI Gemini API configuration as an alternative in relevant guides. Linked AWS authentication info in the Bedrock guardrails documentation.
- Added Cancel Response API usage with code snippets
- Clarified that SSO (Single Sign-On) is free for up to 5 users.
- Alphabetized sidebar, leaving quick start / intros at top of categories
- Documented max_connections under cache_params.
- Clarified IAM AssumeRole Policy requirements.
- Added transform utilities example to Getting Started (showing request transformation).
- Added references to models.litellm.ai as the full models list in various docs.
- Added a code snippet for async_post_call_success_hook.
- Removed broken links to callbacks management guide.
- Reformatted and linked cookbooks + other relevant docs.
- Documentation Corrections - Corrected docs updates sept 2025 - PR #14916
New Contributors
- @uzaxirr made their first contribution in PR #14761
- @xprilion made their first contribution in PR #14416
- @CH-GAGANRAJ made their first contribution in PR #14779
- @otaviofbrito made their first contribution in PR #14778
- @danielmklein made their first contribution in PR #14639
- @Jetemple made their first contribution in PR #14826
- @akshoop made their first contribution in PR #14818
- @hazyone made their first contribution in PR #14821
- @leventov made their first contribution in PR #14816
- @fabriciojoc made their first contribution in PR #10955
- @onlylonly made their first contribution in PR #14845
- @Copilot made their first contribution in PR #14869
- @arsh72 made their first contribution in PR #14899
- @berri-teddy made their first contribution in PR #14914
- @vpbill made their first contribution in PR #14415
- @kgritesh made their first contribution in PR #14893
- @oytunkutrup1 made their first contribution in PR #14858
- @nherment made their first contribution in PR #14933
- @deepanshululla made their first contribution in PR #14974
- @TeddyAmkie made their first contribution in PR #14758
- @SmartManoj made their first contribution in PR #14775
- @uc4w6c made their first contribution in PR #14720
- @luizrennocosta made their first contribution in PR #14783
- @AlexsanderHamir made their first contribution in PR #14827
- @dharamendrak made their first contribution in PR #14721
- @TomeHirata made their first contribution in PR #14164
- @mrFranklin made their first contribution in PR #14860
- @luisfucros made their first contribution in PR #14866
- @huangyafei made their first contribution in PR #14879
- @thiswillbeyourgithub made their first contribution in PR #14949
- @Maximgitman made their first contribution in PR #14965
- @subnet-dev made their first contribution in PR #14938
- @22mSqRi made their first contribution in PR #14972