Baseten Blog | Page 3

News

Introducing Baseten Hybrid: control and flexibility in your cloud and ours

Baseten Hybrid is a multi-cloud solution that enables you to run inference in your cloud—with optional spillover into ours.

Glossary

Building high-performance compound AI applications with MongoDB Atlas and Baseten

Using MongoDB Atlas with Baseten’s Chains framework, you can build high-performance compound AI systems.
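
To make the pattern concrete, here is a minimal sketch of a retrieve-then-generate compound pipeline. The helper functions are stand-ins: in practice, retrieval would query MongoDB Atlas Vector Search and generation would call a model deployed on Baseten (for example, via Chains). The names and stub logic below are illustrative only, so the control flow runs on its own.

```python
# A minimal sketch of a retrieve-then-generate compound pipeline.
# Every function here is a stand-in for a real service call.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    score: float

def embed(query: str) -> list[float]:
    # Stand-in for an embedding model call.
    return [float(len(query)), 0.0, 1.0]

def vector_search(query_vector: list[float], k: int = 3) -> list[Document]:
    # Stand-in for a vector search over a documents collection.
    corpus = [
        "Chains composes multi-model workflows.",
        "Atlas stores documents and their embeddings.",
        "Compound AI systems chain models and business logic.",
    ]
    return [Document(text=t, score=1.0 - 0.1 * i) for i, t in enumerate(corpus[:k])]

def generate(prompt: str) -> str:
    # Stand-in for an LLM call served behind an inference endpoint.
    return f"[answer grounded in {prompt.count('- ')} retrieved documents]"

def answer(question: str) -> str:
    """One compound request: embed -> retrieve -> build prompt -> generate."""
    docs = vector_search(embed(question))
    context = "\n".join(f"- {d.text}" for d in docs)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("How do compound AI systems work?"))
```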

Model performance

How to build function calling and JSON mode for open-source and fine-tuned LLMs

Use a state machine to generate token masks for logit biasing, enabling function calling and structured output at the model server level.
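
As a rough illustration of the idea (with a toy vocabulary and a hand-written state machine, not the TensorRT-LLM server-level implementation the post describes), setting the logits of invalid tokens to negative infinity is enough to force schema-valid output:

```python
# Toy sketch of state-machine-driven logit biasing for structured output.
import math

# Toy vocabulary the "model" can emit.
VOCAB = ["{", "}", '"key"', ":", '"value"', ",", "<eos>"]

# A tiny state machine for one flat JSON object: {"key":"value"}
TRANSITIONS = {
    "start":      {"{": "need_key"},
    "need_key":   {'"key"': "need_colon"},
    "need_colon": {":": "need_value"},
    "need_value": {'"value"': "need_close"},
    "need_close": {"}": "done"},
    "done":       {"<eos>": "done"},
}

def mask_logits(logits: list[float], state: str) -> list[float]:
    """Set the logit of every token that is invalid in `state` to -inf,
    so sampling can only pick tokens the state machine allows."""
    allowed = TRANSITIONS[state]
    return [l if tok in allowed else -math.inf for tok, l in zip(VOCAB, logits)]

def greedy_decode(logits_per_step: list[list[float]]) -> str:
    state, output = "start", []
    for logits in logits_per_step:
        masked = mask_logits(logits, state)
        token = VOCAB[max(range(len(VOCAB)), key=lambda i: masked[i])]
        output.append(token)
        state = TRANSITIONS[state][token]
        if token == "<eos>":
            break
    return "".join(t for t in output if t != "<eos>")

# Even with arbitrary logits, the mask forces valid structure.
steps = [[0.1, 0.9, 0.2, 0.3, 0.8, 0.4, 0.5]] * 6
print(greedy_decode(steps))  # -> {"key":"value"}
```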

News

Introducing function calling and structured output for open-source and fine-tuned LLMs

Automatically add function calling and structured output to any open-source or fine-tuned large language model supported by TensorRT-LLM.

ML models

The best open-source image generation model

Explore the strengths and weaknesses of state-of-the-art image generation models like FLUX.1, Stable Diffusion 3, SDXL Lightning, and Playground 2.5.

Model performance

How to double tokens per second for Llama 3 with Medusa

We observe up to a 122% increase in tokens per second for Llama 3 after training custom Medusa heads and running the updated model with TensorRT-LLM.
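
For scale, a 122% increase is a 2.22x throughput multiplier. The baseline figure in this quick calculation is hypothetical, not a number from the post:

```python
# What a 122% increase in tokens per second means in practice.
baseline_tps = 45.0        # hypothetical Llama 3 baseline, not a measured figure
increase = 1.22            # the 122% relative increase reported in the post
medusa_tps = baseline_tps * (1 + increase)
print(f"{baseline_tps:.0f} tok/s -> {medusa_tps:.0f} tok/s ({1 + increase:.2f}x)")
# 45 tok/s -> 100 tok/s (2.22x)
```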

Community

SPC hackathon winners build with Llama 3.1 on Baseten

SPC hackathon winner TestNinja and finalist VibeCheck used Baseten to power apps for test generation and mood board creation.

News

Introducing Baseten Self-hosted

Gain granular control over data locality, align with strict compliance standards, meet specific performance requirements, and more with Baseten Self-hosted.

Glossary

Compound AI systems explained

Compound AI systems combine multiple models and processing steps, forming the next generation of AI products.