Baseten Blog | Page 7
New in December 2023
Faster Mixtral inference, Playground v2 image generation, and ComfyUI pipelines as API endpoints.
Faster Mixtral inference with TensorRT-LLM and quantization
Mixtral 8x7B structurally has faster inference than the similarly powerful Llama 2 70B, but we can make it even faster using TensorRT-LLM and int8 quantization.
Playground v2 vs Stable Diffusion XL 1.0 for text-to-image generation
Playground v2, a new text-to-image model, matches SDXL's speed and quality with a distinctive AAA game-style aesthetic. The ideal choice depends on use case and artistic taste.
How to serve your ComfyUI model behind an API endpoint
This guide covers deploying ComfyUI image generation pipelines behind an API for app integration, using Truss for packaging and production deployment.
New in November 2023
Switching to open source ML, a guide to model inference math, and Stability AI's new generative AI image-to-video model.
NVIDIA A10 vs A10G for ML model inference
The A10, an Ampere-series GPU, excels at tasks like running 7B-parameter LLMs. AWS's A10G variant has similar GPU memory and bandwidth and is mostly interchangeable.
Stable Video Diffusion now available
Stability AI announced the release of Stable Video Diffusion, marking a huge leap forward for open source novel video synthesis.
GPT vs Llama: Migrate to open source LLMs seamlessly
Use the ChatCompletions API to test open-source LLMs like Llama in your AI app with just three minor code modifications.
Open source alternatives for machine learning models
Building on top of open source models gives you access to a wide range of capabilities that you would lack with a black box endpoint provider.