Product
Baseten Chains is now GA for production compound AI systems
Baseten Chains delivers ultra-low-latency compound AI at scale, with custom hardware per model and simplified model orchestration.
Product
New observability features: activity logging, LLM metrics, and metrics dashboard customization
We added three new observability features for improved monitoring and debugging: an activity log, LLM metrics, and customizable metrics dashboards.
Product
Baseten Chains explained: building multi-component AI workflows at scale
A delightful developer experience for building and deploying compound ML inference workflows.
News
Introducing Baseten Chains
Learn about Baseten's new Chains framework for deploying complex ML inference workflows across compound AI systems that use multiple models and components.
Glossary
Why GPU utilization matters for model inference
Save money on high-traffic model inference workloads by increasing GPU utilization to maximize performance per dollar for LLMs, SDXL, Whisper, and more.