Llama 3.3 Nemotron 49B Super - NVIDIA NIM

A high-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

Deploy Llama 3.3 Nemotron 49B Super - NVIDIA NIM behind an API endpoint in seconds.

Llama 3.3 Nemotron 49B Super is an NVIDIA NIM large language model (LLM) derived from Llama 3.3 70B Instruct. It can be deployed with early access on Baseten.

Llama 3.3 Nemotron is a reasoning model post-trained for enterprise AI agent use cases, including reasoning, tool calling, chat, and instruction following tasks with a 128k token context length.

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

Start deploying

truss init -- example stable-diffusion-2-1-base ./my-sd-truss

cd ./my-sd-truss

export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe

truss push

INFO

Serializing Stable Diffusion 2.1 truss.

INFO

Making contact with Baseten 👋 👽

INFO

🚀 Uploading model to Baseten 🚀

Upload progress: 0% | | 0.00G/2.39G