Baseten Blog | Page 7

GPU guides

NVIDIA A10 vs A100 GPUs for LLM and Stable Diffusion inference

This article compares two popular GPUs—the NVIDIA A10 and A100—for model inference and discusses the option of using multi-GPU instances for larger models.

Product

New in August 2023

Truss' latest update addresses key ML model serving issues. Discover how to speed up SDXL inference to 3s and build ChatGPT-like apps with Llama 2 & Chainlit.

Model performance

SDXL inference in under 2 seconds: the ultimate guide to Stable Diffusion optimization

SDXL 1.0 initially takes 8-10 seconds to generate a 1024x1024px image on an A100 GPU. Learn how to reduce this to just 1.92 seconds on the same hardware.

Hacks & projects

Build your own open-source ChatGPT with Llama 2 and Chainlit

Llama 2 rivals the quality of GPT-3.5, the model that powers ChatGPT. Chainlit helps build ChatGPT-like interfaces. This guide shows how to create such an interface with Llama 2.

ML models

AudioGen: deploy and build today!

AudioGen, part of the AudioCraft family of models from Meta AI, is now available in the Baseten model library.

Product

New in July 2023

Llama 2 and SDXL shake up foundation model leaderboards (plus: LangChain, autoscaling, and more)

Glossary

AI infrastructure: build vs. buy

AI infrastructure, ML infrastructure, build vs. buy, model deployment

Hacks & projects

Build a chatbot with Llama 2 and LangChain

Build a ChatGPT-style chatbot with open-source Llama 2 and LangChain in a Python notebook.

ML models

Deploying and using Stable Diffusion XL 1.0

Deploy Stable Diffusion XL 1.0 for free to generate images from text prompts and invoke Stable Diffusion with the Baseten Python client.
