Product
New in February 2024
3x throughput with H100 GPUs, 40% lower SDXL latency with TensorRT, and multimodal open source models.
New in January 2024
A library for open source models, general availability for L4 GPUs, and performance benchmarking for ML inference.
New in December 2023
Faster Mixtral inference, Playground v2 image generation, and ComfyUI pipelines as API endpoints.
New in November 2023
Switching to open source ML, a guide to model inference math, and Stability AI's new generative AI image-to-video model.
New in October 2023
All-new model management, a text embedding model that matches OpenAI, and misgif, the most fun you’ll have with AI all week.
New in September 2023
Mistral 7B LLM, GPU comparisons, model observability features, and an open source AI event series.
New in August 2023
Truss' latest update addresses key ML model serving issues. Discover how to speed up SDXL inference to 3 seconds and build ChatGPT-like apps with Llama 2 and Chainlit.
New in July 2023
Llama 2 and SDXL shake up foundation model leaderboards (plus: LangChain, autoscaling, and more).
Model autoscaling features on Baseten
Scale replica count up and down in response to traffic, with scale to zero and fast cold starts.