Marius Killinger

Baseten Chains is now GA for production compound AI systems

Baseten Chains delivers ultra-low-latency compound AI at scale, with custom hardware per model and simplified model orchestration.

Marius Killinger

2 others

Product

New observability features: activity logging, LLM metrics, and metrics dashboard customization

We added three new observability features for improved monitoring and debugging: an activity log, LLM metrics, and customizable metrics dashboards.

Suren Atoyan

4 others

Product

Baseten Chains explained: building multi-component AI workflows at scale

A Delightful Developer Experience for Building and Deploying Compound ML Inference Workflows

Marius Killinger

1 other

News

Introducing Baseten Chains

Learn about Chains, Baseten's new framework for deploying complex ML inference workflows across compound AI systems that combine multiple models and components.

Bola Malek

4 others

Glossary

Why GPU utilization matters for model inference

Save money on high-traffic model inference workloads by increasing GPU utilization to maximize performance per dollar for LLMs, SDXL, Whisper, and more.

Marius Killinger

1 other

Prompt: A retrofuturistic pickup truck loaded with green plants on a sunny highway

Machine learning infrastructure that just works

Baseten provides all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently.