Baseten Blog | Page 8
Jina AI’s jina-embeddings-v2: an open source text embedding model that matches OpenAI’s ada-002
Jina AI released jina-embeddings-v2-base-en, a text embedding model that matches OpenAI’s ada-002 model in both benchmark performance and context window length.
New in September 2023
Mistral 7B LLM, GPU comparisons, model observability features, and an open source AI event series
NVIDIA A10 vs A100 GPUs for LLM and Stable Diffusion inference
This article compares two popular GPUs—the NVIDIA A10 and A100—for model inference and discusses the option of using multi-GPU instances for larger models.
New in August 2023
Truss' latest update addresses key ML model serving issues. Discover how to speed up SDXL inference to 3s and build ChatGPT-like apps with Llama 2 & Chainlit.
SDXL inference in under 2 seconds: the ultimate guide to Stable Diffusion optimization
Out of the box, SDXL 1.0 takes 8-10 seconds to generate a 1024x1024px image on an A100 GPU. Learn how to reduce this to just 1.92 seconds on the same hardware.
Build your own open-source ChatGPT with Llama 2 and Chainlit
Llama 2 rivals GPT-3.5, the model behind ChatGPT, in quality. Chainlit helps you build ChatGPT-like interfaces. This guide shows how to create such an interface with Llama 2.
AudioGen: deploy and build today!
AudioGen, part of the AudioCraft family of models from Meta AI, is now available in the Baseten model library.
New in July 2023
Llama 2 and SDXL shake up foundation model leaderboards (plus: LangChain, autoscaling, and more)
AI infrastructure: build vs. buy