"Inference Engineering" is now available. Get your copy here

Changelog

See our latest feature releases, product improvements and bug fixes

Mar 31, 2026

Health check improvements

Startup probes now handle initialization more reliably by waiting until the model has loaded before executing any liveness checks. The startup phase still defaults to 30 minutes and can be configured...

Mar 30, 2026

Rolling deployments

You can now gradually shift traffic to new deployments instead of swapping all at once. Candidate replicas scale up incrementally while previous replicas scale down in controlled steps, giving you...
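To make the stepwise hand-off concrete, here is a minimal sketch of a replica schedule in which candidate replicas grow by a fixed step while previous replicas shrink by the same amount. The function name, the fixed step size, and the constant total are assumptions made for illustration; the actual rollout logic and its configuration options are not described in this entry.

```python
def rolling_steps(total: int, step: int) -> list[tuple[int, int]]:
    """Return (candidate, previous) replica counts after each rollout step.

    Assumes the combined replica count stays at `total` throughout,
    which is a simplification chosen for this sketch.
    """
    if total < 0 or step <= 0:
        raise ValueError("total must be >= 0 and step must be > 0")
    plan = []
    candidate = 0
    while candidate < total:
        # Scale candidates up by one step, capped at the target count.
        candidate = min(total, candidate + step)
        # Previous replicas scale down by the matching amount.
        plan.append((candidate, total - candidate))
    return plan


if __name__ == "__main__":
    for candidate, previous in rolling_steps(total=5, step=2):
        print(f"candidate={candidate} previous={previous}")
```

With five replicas and a step of two, the schedule passes through (2, 3) and (4, 1) before finishing at (5, 0), so the previous deployment keeps serving traffic until the final step.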

Mar 27, 2026

Hot reload for development deployments

truss watch and truss push --watch now support hot-reloading model code changes with the --hot-reload and --watch-hot-reload flags. Instead of restarting the inference server, hot reload swaps your...

Mar 27, 2026

Terminate deployment replica via API

You can now terminate a specific replica within a deployment using the new management API endpoint. This lets you remove individual replicas without affecting the rest of the deployment, making it...

Mar 26, 2026

Observability improvements

We've redesigned the logs and metrics views for better visibility and faster debugging.

Mar 23, 2026

Model API deprecation (Kimi K2 0905, Kimi K2 Thinking, DeepSeek v3.2)

The Kimi K2 0905, Kimi K2 Thinking, and DeepSeek v3.2 Model APIs were deprecated at 5pm PT on March 6th. Their model IDs are now inactive and return an error for all requests.

Mar 19, 2026

Introducing the Baseten Delivery Network (BDN)

We just launched the Baseten Delivery Network (BDN), designed to make cold starts 2-3x faster for large models.

Mar 16, 2026

Regional environments

Route inference traffic exclusively within a designated geographic region to meet data residency and compliance requirements like GDPR.

Mar 13, 2026

CI/CD for model deployments

Automate Truss deployments with the Truss Push Action. Deploy on merge, validate on pull request, or deploy multiple models in parallel.

Mar 7, 2026

Truss 0.15.2

Added a --no-cache flag to truss push that forces a full rebuild without using cached Docker layers. This is useful when debugging build issues or ensuring a clean image. The flag is CLI-only and cannot...