Qwen 2.5 72B Math Instruct

The largest model in the Qwen family of LLMs, tuned for math

Deploy Qwen 2.5 72B Math Instruct behind an API endpoint in seconds.

Example usage

Qwen uses the standard Llama-style multi-turn chat format: a list of messages, each with a role (such as system or user) and its content.
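As a rough illustration of the multi-turn format, the sketch below assembles a messages list from a system prompt, earlier exchanges, and a new question. The helper name `build_messages` is hypothetical, and the use of an "assistant" role for earlier model replies is an assumption based on common chat-style APIs; this page only shows system and user roles.

```python
from typing import Dict, List, Tuple


def build_messages(
    system_prompt: str,
    history: List[Tuple[str, str]],
    next_user_prompt: str,
) -> List[Dict[str, str]]:
    """Assemble a chat-style messages list (hypothetical helper).

    `history` holds earlier (user_text, assistant_text) pairs.
    The "assistant" role is an assumption, not confirmed by this page.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The newest user turn always goes last.
    messages.append({"role": "user", "content": next_user_prompt})
    return messages


messages = build_messages(
    "Please reason step by step, and put your final answer within \\boxed{}.",
    [("What is $2+3$?", "Adding gives \\boxed{5}.")],
    "Now double that result.",
)
```

The resulting list could be passed as the "messages" field of the request body shown in the Input example.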

Input
import os

import requests

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "messages": [
        {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
        {"role": "user", "content": "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."},
    ],
    "stream": True,
    "max_new_tokens": 512,
    "temperature": 0.9
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
    stream=True
)

# Print the generated tokens as they get streamed.
# Decoding incrementally avoids errors when a multi-byte
# UTF-8 character is split across two chunks.
res.encoding = "utf-8"
for content in res.iter_content(chunk_size=None, decode_unicode=True):
    print(content, end="", flush=True)
JSON output
[
    "streaming",
    "output",
    "text"
]
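Because the system prompt asks the model to put its final answer within \boxed{}, the streamed text can be post-processed to pull that answer out. The following is a minimal sketch, assuming the streamed chunks have been joined into one string; `extract_boxed` is a hypothetical helper and the sample response text is made up for illustration. It matches braces by depth so that nested LaTeX such as \frac{1}{2} survives intact.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced, e.g. output cut off mid-answer


# Illustrative response text only, not real model output.
sample = "Subtract $4x$: $5 = 2x + 7$. Subtract 7: $-2 = 2x$. So $x = \\boxed{-1}$."
answer = extract_boxed(sample)
```

For the equation in the example prompt, $4x+5 = 6x+7$ gives $-2 = 2x$, so a correct response should yield "-1". A `None` result can signal that generation stopped before the final answer, for instance when `max_new_tokens` is too low.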

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init -- example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G