
Kimi K2

The world's first 1-trillion-parameter open-source model

Model details

Baseten offers Dedicated Deployments and Model APIs for Kimi K2, powered by the Baseten Inference Stack.

Kimi K2 performs strongly on agentic tasks thanks to its tool calling, reasoning abilities, and long-context handling. But at 1 trillion parameters, it is also resource-intensive: running it in production requires a highly optimized inference stack to avoid excessive latency.

Example usage

Deployments of Kimi K2 are OpenAI-compatible.

Input
# You can use this model with any of the OpenAI clients in any language!
# Simply change the API Key to get started

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://inference.baseten.co/v1"
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[
        {
            "role": "user",
            "content": "Implement Hello World in Python"
        }
    ],
    stop=[],
    stream=True,
    stream_options={
        "include_usage": True,
        "continuous_usage_stats": True
    },
    top_p=1,
    max_tokens=1000,
    temperature=1,
    presence_penalty=0,
    frequency_penalty=0
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
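
The request above sets stream=True, so the client receives a sequence of chunks rather than one JSON object like the one shown. The following is a minimal sketch, not an official Baseten helper, of how those chunks might be assembled into the final text, finish reason, and usage. The names StreamResult, collect_stream, and demo_chunks are hypothetical, and the sketch assumes the endpoint follows the usual OpenAI streaming shape: text arrives in choices[0].delta.content, and with "include_usage" enabled a usage report arrives on a final chunk whose choices list is empty. The demo chunks are hand-built stand-ins so the block runs without a network call; their token counts are illustrative values borrowed from the JSON output above.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Iterable, Optional


@dataclass
class StreamResult:
    """Hypothetical container for an assembled streaming response."""
    text: str
    finish_reason: Optional[str]
    usage: Optional[Any]


def collect_stream(chunks: Iterable[Any]) -> StreamResult:
    """Join streamed deltas and keep the latest usage report, if any."""
    parts = []
    finish_reason = None
    usage = None
    for chunk in chunks:
        # Assumption: usage may appear on the last chunk only, or on every
        # chunk; either way the most recent report is the one to keep.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        # A usage-only chunk carries no choices, so there is no text to read.
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        if choice.delta.content is not None:
            parts.append(choice.delta.content)
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return StreamResult("".join(parts), finish_reason, usage)


def _chunk(content=None, finish_reason=None):
    """Build a stand-in chunk with one choice (offline demo only)."""
    delta = SimpleNamespace(content=content)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice], usage=None)


# Hand-built chunks that mimic the assumed stream shape.
demo_chunks = [
    _chunk(content="Hello"),
    _chunk(content=", world"),
    _chunk(finish_reason="stop"),
    SimpleNamespace(
        choices=[],
        usage=SimpleNamespace(
            prompt_tokens=38, completion_tokens=145, total_tokens=183
        ),
    ),
]

result = collect_stream(demo_chunks)
print(result.text)
print(result.finish_reason, result.usage.total_tokens)
```

With a real client, passing the response object from the streaming request to collect_stream in place of the printing loop should yield the same three fields, provided the stream matches the assumptions stated above.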
