Deploy this version

Docker

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.0-stable
```

Pip

```shell
pip install litellm==1.66.0.post1
```
v1.66.0-stable is live now. Here are the key highlights of this release.
Key Highlights
- Microsoft SSO Auto-sync: Auto-sync groups and group members from Azure Entra ID to LiteLLM
- Unified File IDs: Use the same file id across LLM API providers.
- Realtime API Cost Tracking: Track cost of realtime API calls
- xAI grok-3: Added support for `xai/grok-3` models
- Security Fixes: Fixed CVE-2025-0330 and CVE-2024-6825 vulnerabilities
Let's dive in.
Microsoft SSO Auto-sync
Auto-sync groups and members from Azure Entra ID to LiteLLM
This release adds support for auto-syncing groups and members from Microsoft Entra ID to LiteLLM. LiteLLM proxy administrators can spend less time managing teams and members, as LiteLLM handles the following:
- Auto-create teams that exist on Microsoft Entra ID
- Sync team members on Microsoft Entra ID with LiteLLM teams
Get started with this here
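Conceptually, the sync reconciles Entra ID groups against existing LiteLLM teams: create any team that exists only in Entra ID, and add any members missing from an existing team. The sketch below is a hypothetical illustration of that reconciliation, not LiteLLM's actual implementation; the group and user names are made up.

```python
# Hypothetical sketch of group-to-team reconciliation (not LiteLLM's code).
def plan_sync(entra_groups: dict, teams: dict):
    """Return (teams_to_create, membership_updates) needed to mirror Entra ID."""
    # Any Entra ID group with no matching LiteLLM team must be created.
    to_create = [g for g in entra_groups if g not in teams]
    # For existing teams, compute members present in Entra ID but not in LiteLLM.
    updates = {}
    for group, members in entra_groups.items():
        missing = members - teams.get(group, set())
        if missing:
            updates[group] = missing
    return to_create, updates

entra = {"ml-eng": {"alice", "bob"}, "data": {"carol"}}
teams = {"ml-eng": {"alice"}}
to_create, updates = plan_sync(entra, teams)
print(to_create, updates)
```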
Unified File ID
New Models / Updated Models
xAI
- Added reasoning_effort support for `xai/grok-3-mini-beta` Get Started
- Added cost tracking for `xai/grok-3` models PR
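A completion request using the new parameter might be shaped as below. This is a hedged sketch of the request shape only (the model name is from this release; the message and effort value are illustrative); actually sending it would require `pip install litellm` and an xAI API key in the environment.

```python
# Illustrative request shape for a litellm.completion call (sketch only;
# running it requires the litellm package and an XAI_API_KEY).
request = {
    "model": "xai/grok-3-mini-beta",  # model added in this release
    "messages": [{"role": "user", "content": "What is 17 * 24?"}],
    "reasoning_effort": "low",  # newly supported for grok-3-mini-beta
}
# response = litellm.completion(**request)
print(sorted(request))
```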
Hugging Face
- Added inference providers support Get Started
Azure
- Added azure/gpt-4o-realtime-audio cost tracking PR
VertexAI
- Added enterpriseWebSearch tool support Get Started
- Moved to only passing keys accepted by the Vertex AI response schema PR
Google AI Studio
Databricks
General
- Added litellm.supports_reasoning() util to track if an LLM supports reasoning Get Started
- Function Calling - Handle pydantic base model in message tool calls, handle tools = [], and support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 PR
- LiteLLM Proxy - Allow passing `thinking` param to litellm proxy via client sdk PR
- Fixed incorrect translation of the 'thinking' param for litellm PR
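Via an OpenAI-compatible client, the `thinking` param would ride along in the request body sent to the proxy. The sketch below shows one plausible payload shape: the `thinking` object follows Anthropic's documented extended-thinking parameter, while the model name and values are illustrative, so check the LiteLLM docs for the exact schema your provider expects.

```python
# Illustrative /chat/completions request body carrying a `thinking` param
# (values are illustrative; consult the LiteLLM docs for the exact schema).
payload = {
    "model": "anthropic/claude-3-7-sonnet-20250219",
    "messages": [{"role": "user", "content": "Prove that 2 is prime."}],
    "thinking": {"type": "enabled", "budget_tokens": 1024},
}
print("thinking" in payload)
```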
Spend Tracking Improvements
- OpenAI, Azure
  - Realtime API Cost tracking with token usage metrics in spend logs Get Started
- Anthropic
- General
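Cost for a realtime call is ultimately derived from the token usage metrics times per-token rates. The sketch below shows that arithmetic with placeholder rates, not actual OpenAI or Azure pricing.

```python
# Sketch of usage-based cost computation (placeholder rates, not real pricing).
def realtime_cost(usage: dict, input_cost_per_token: float,
                  output_cost_per_token: float) -> float:
    """Multiply each token count by its per-token rate and sum."""
    return (usage["input_tokens"] * input_cost_per_token
            + usage["output_tokens"] * output_cost_per_token)

usage = {"input_tokens": 1200, "output_tokens": 300}
cost = realtime_cost(usage, input_cost_per_token=5e-6, output_cost_per_token=2e-5)
print(round(cost, 6))
```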
Management Endpoints / UI
Test Key Tab:
- Added rendering of Reasoning content, ttft, usage metrics on test key page PR
View input, output, reasoning tokens, ttft metrics.
Tag / Policy Management:
- Added Tag/Policy Management. Create routing rules based on request metadata. This allows you to enforce that requests with `tags="private"` only go to specific models. Get Started
Create and manage tags.
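Tag routing can be thought of as filtering the deployment list down to entries whose tags cover the tags on the request. Below is a hypothetical sketch of that filtering step, not LiteLLM's actual router; the deployment entries are made up.

```python
# Hypothetical tag-based deployment filtering (illustrative, not LiteLLM's router).
def deployments_for_tags(deployments: list, request_tags: set) -> list:
    """Keep only deployments whose tags cover every tag on the request."""
    return [d for d in deployments if request_tags <= set(d.get("tags", []))]

deployments = [
    {"model": "gpt-4o", "tags": ["private"]},
    {"model": "gpt-4o-mini", "tags": ["public"]},
]
matches = deployments_for_tags(deployments, {"private"})
print([d["model"] for d in matches])
```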
Redesigned Login Screen:
- Polished login screen PR
Microsoft SSO Auto-Sync:
- Added debug route to allow admins to debug SSO JWT fields PR
- Added ability to use MSFT Graph API to assign users to teams PR
- Connected litellm to Azure Entra ID Enterprise Application PR
- Added ability for admins to set `default_team_params` for when litellm SSO creates default teams PR
- Fixed MSFT SSO to use correct field for user email PR
- Added UI support for setting Default Team setting when litellm SSO auto creates teams PR
UI Bug Fixes:
Logging / Guardrail Improvements
- Prometheus:
- Emit Key and Team Budget metrics on a cron job schedule Get Started
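Emitting budget metrics on a schedule amounts to periodically computing remaining budget per key or team and exporting each value as a gauge. The stand-in below computes the gauge values with plain Python; the field names are hypothetical and no actual Prometheus client is involved.

```python
# Minimal stand-in for budget gauge computation (hypothetical field names;
# a real exporter would publish these via a Prometheus client on a cron job).
def budget_gauges(entities: list) -> dict:
    """Map each entity name to its remaining budget (max_budget - spend)."""
    return {e["name"]: e["max_budget"] - e["spend"] for e in entities}

teams = [
    {"name": "team-a", "max_budget": 100.0, "spend": 42.5},
    {"name": "team-b", "max_budget": 50.0, "spend": 50.0},
]
print(budget_gauges(teams))
```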
Security Fixes
- Fixed CVE-2025-0330 - Leakage of Langfuse API keys in team exception handling PR
- Fixed CVE-2024-6825 - Remote code execution in post call rules PR
Helm
Demo
Try this on the demo instance today
Complete Git Diff
See the complete git diff since v1.65.4-stable here.