Baseten is now fully OpenAI compatible
The OpenAI SDK has become a standard for interacting with AI models, making it extremely important in the inference space. We’re happy to announce official OpenAI-compatible APIs for both chat...
See our latest feature releases, product improvements and bug fixes
Mar 21, 2025
The OpenAI SDK has become a standard for interacting with AI models, making it extremely important in the inference space. We’re happy to announce official OpenAI-compatible APIs for both chat...
Feb 10, 2025
Now with improved performance, robustness, and an even more delightful DevEx since our beta launch, we’re thrilled to announce the general availability of Baseten Chains for production compound AI!...
Jan 30, 2025
We run health checks on your deployments to ensure they’re able to run inference. Now, you can customize these checks to monitor anything , from tracking 500 errors to detecting CUDA issues and more....
Jan 21, 2025
We've expanded our metrics support to include GPU memory usage and utilization for MIG (Multi-Instance GPU) instance types. These metrics were previously unavailable for MIG configurations. This...
Dec 20, 2024
We’ve revamped our metrics dashboard to make monitoring and debugging easier! Here’s what’s new: Unified view : All metrics are now displayed on a single page—no more clicking between tabs. This...
Dec 19, 2024
Our new Speculative Decoding integration lets you leverage speculative decoding as part of our streamlined TensorRT-LLM Engine Builder flow. Just modify the new speculator configuration in the Engine...
Dec 13, 2024
We’ve added several new endpoints to our REST API, giving you even more control over your deployments, environments, and resources. Here’s what’s new: Deletion Endpoints Delete a model:...
Dec 13, 2024
Our async inference service now supports delivering async predict results to your webhook endpoints over HTTP/2. This means faster, more efficient connections for your webhook integrations. Don’t...
Dec 6, 2024
Get more visibility into activity across your workspace, models, and Chains with the new Activity Feed ! Click the Activity tab to view a detailed list of changes, including who made them and when....
Dec 6, 2024
[No action needed] As of truss version 0.9.55, the flag --trusted in truss push is no longer needed to use secrets in your deployed models. Secrets specified in your config.yaml will automatically be...