
Kimi K2 Instruct

The world's first 1-trillion-parameter open-source model

Model details


Baseten offers Dedicated Deployments and Model APIs for Kimi K2 powered by the Baseten Inference Stack.

Kimi K2 has shown strong performance on agentic tasks thanks to its tool calling, reasoning abilities, and long-context handling. But at 1 trillion parameters, it's also resource-intensive: running it in production requires a highly optimized inference stack to avoid excessive latency.
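Tool calling uses the standard OpenAI function-calling schema. A minimal sketch of defining a tool and attaching it to a chat completion request; the `get_weather` function here is a hypothetical example, not an API the model ships with:

```python
# Hypothetical tool definition in the OpenAI function-calling format.
# get_weather is illustrative only; substitute your own functions.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
}

# Keyword arguments for client.chat.completions.create(**request_kwargs).
request_kwargs = {
    "model": "moonshotai/Kimi-K2-Instruct-0905",
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

When the model elects to call the tool, the response's `message.tool_calls` carries the function name and JSON-encoded arguments; your code executes the function and sends the result back in a follow-up `tool` message.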


Deployments of Kimi K2 are OpenAI-compatible.

Input
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API key to get started

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
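A non-streaming response can be consumed as a plain dict. A small sketch, using the field names from the example payload above, of pulling out the assistant text and token accounting:

```python
# A chat.completion payload shaped like the example output above,
# trimmed to the fields this sketch reads.
response = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "[Model output here]", "role": "assistant"},
        }
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183},
}

# Extract the assistant text and the token usage breakdown.
text = response["choices"][0]["message"]["content"]
usage = response["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(f"{text!r} ({usage['completion_tokens']} completion tokens)")
```

With the official `openai` Python client, the same fields are available as attributes, e.g. `response.choices[0].message.content` and `response.usage.total_tokens`.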
