
Deploying GPT OSS 20B on Jetson

Deploy GPT OSS 20B to an NVIDIA Jetson over SSH with one click, using a prebuilt Docker image that automatically starts the inference service.

Tags: Getting Started · 10 min · LLM · Jetson · Docker · edge-ai

What This Capability Does

Deploys the GPT OSS 20B model to an NVIDIA Jetson device over SSH with one click. Once deployment completes, the container automatically starts llama-server, which serves an OpenAI-compatible HTTP inference API on port 8080.

Output Interface

| Interface Type | Description | Port/Path | Data Format |
| --- | --- | --- | --- |
| HTTP API | OpenAI-compatible chat completions endpoint | :8080/v1/chat/completions | JSON |

After deployment completes, open in a browser:

http://<jetson-ip>:8080

Integration Scenarios

  • Use as a local AI chat backend for chatbots or voice assistants
  • Pair with an OpenClaw gateway to bridge multiple chat platforms such as WeChat and Telegram
  • Provide offline LLM inference on edge devices, with no cloud API required

Usage Notes

Hardware Requirements

  • Jetson Orin NX 16GB or higher (the 20B model needs roughly 12-15 GB of GPU memory)
  • reComputer J4012 is verified to work; other Jetson Orin models must be checked for sufficient GPU memory

How to Call the API

  • API endpoint: http://<jetson-ip>:8080/v1/chat/completions
  • OpenAI-compatible format; existing OpenAI SDKs work directly
  • Python example: import openai; openai.api_base = "http://<jetson-ip>:8080/v1" (openai SDK < 1.0; with SDK ≥ 1.0 use openai.OpenAI(base_url=...))
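The SDK is optional; a minimal client sketch using only the Python standard library, assuming a placeholder address of 192.168.1.100 and an illustrative model name (llama-server serves whichever model it loaded, regardless of this field):

```python
import json
import urllib.request

# Assumed address: replace with your Jetson's actual IP.
BASE_URL = "http://192.168.1.100:8080"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": "gpt-oss-20b",  # illustrative; llama-server largely ignores it
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt: str) -> str:
    """POST to /v1/chat/completions and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# reply = chat("Hello from a Jetson client!")  # requires the service to be running
```

The long timeout matters for the very first request, which may block while the model warms up.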

First-Request Latency

  • The first call after deployment may take 2-5 minutes (model loading and warm-up)
  • Check http://<jetson-ip>:8080/v1/models to see whether the service is ready
  • After warm-up, subsequent requests respond quickly (typically 1-3 seconds)
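That readiness check can be automated; a sketch that polls /v1/models until the server answers with a model list (the timing values are illustrative, not part of the deployment preset):

```python
import json
import time
import urllib.error
import urllib.request

def models_loaded(body: bytes) -> bool:
    """True if a /v1/models response body lists at least one model."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return bool(data.get("data"))

def wait_until_ready(base_url: str, timeout_s: int = 300, interval_s: int = 5) -> bool:
    """Poll /v1/models until the server returns 200 with a model list."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
                if resp.status == 200 and models_loaded(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet, or still returning 503 while loading
        time.sleep(interval_s)
    return False

# wait_until_ready("http://192.168.1.100:8080")  # placeholder IP
```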

Tokens and Context

  • The default context window is about 2048 tokens and can be adjusted at deployment time
  • For a larger context, raise the Llama Context parameter in the configuration (this uses more GPU memory)
  • Keep individual requests under about 1000 tokens to avoid running out of GPU memory
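To sanity-check a request against that budget before sending it, a rough heuristic can be sketched (about 4 characters per token for English text is an assumption; the model's real tokenizer will differ):

```python
def rough_token_count(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text.
    Use only as a pre-flight sanity check, not an exact count."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, max_new_tokens: int, context_window: int = 2048) -> bool:
    """Check that prompt tokens plus requested completion tokens fit the window."""
    return rough_token_count(prompt) + max_new_tokens <= context_window
```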

Technical Specifications

| Metric | Value |
| --- | --- |
| Model | GPT OSS 20B |
| Inference framework | llama.cpp (llama-server) |
| Supported hardware | reComputer J4012 (Jetson Orin NX 16GB) |
| Service port | 8080 |
| API format | OpenAI-compatible |

Integration Interface

http

OpenAI-compatible chat completions API

/v1/chat/completions · Port: 8080 · Method: POST
{"choices":[{"message":{"content":"response text"}}]}
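Pulling the reply text out of that response takes one traversal; a minimal helper, assuming the JSON shape shown above:

```python
import json

def extract_reply(response_json: str) -> str:
    """Return the assistant's text from an OpenAI-style chat completion response."""
    body = json.loads(response_json)
    return body["choices"][0]["message"]["content"]
```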

Deployment Options

edge_device

Download & Installation

Preset: Jetson GPT OSS 20B Service {#jetson_got_oss}

Deploy GPT OSS 20B to your Jetson device with one click from this platform.

| Device | Purpose |
| --- | --- |
| NVIDIA Jetson (reComputer) | Runs GPT OSS 20B in Docker |

Step 1: Deploy GPT OSS 20B Service {#deploy_got_oss type=docker_deploy required=true config=devices/jetson.yaml}

Deploy the containerized GPT OSS 20B runtime to your Jetson over SSH.

Deployment Target: Remote (Jetson) {#jetson_remote type=remote config=devices/jetson.yaml default=true}

Deploy to your Jetson over SSH with one click.

Wiring

  1. Connect Jetson and your computer to the same network.
  2. Fill in Jetson IP, SSH username, and password.
  3. Click Deploy.

Deployment Complete

  1. The GPT OSS 20B container is running on your Jetson.
  2. llama-server is started inside the container.
  3. The service endpoint is available at http://<jetson-ip>:8080.
  4. Readiness endpoint is available at http://<jetson-ip>:8080/v1/models.

Troubleshooting

| Issue | Solution |
| --- | --- |
| SSH connection failed | Verify Jetson IP, username, password, and SSH service status |
| Docker runtime check failed | Ensure Docker is installed and the NVIDIA runtime is available |
| Docker Compose unavailable | Ensure docker compose or docker-compose is installed |
| Service start failed | Inspect logs on the Jetson: docker compose logs --tail=200 |
| 503 {"message":"Loading model"} on /v1/models | Model is still warming up; the first run can take several minutes |
| Out-of-memory at startup | Reduce settings, for example set Llama NGL=16 and Llama Context=512 |

Deployment Target: Local Machine {#jetson_local type=local config=devices/jetson_local.yaml}

Deploy directly on the current machine (requires an NVIDIA GPU with sufficient memory).

Wiring

  1. Make sure Docker and the NVIDIA Container Toolkit are installed.
  2. Click Deploy to start the installation.

Tip: the first start may take 15-30 minutes to download the Docker image and load the model. At least 20 GB of free disk space is required.

Deployment Complete

  1. Open http://localhost:8080 in a browser.
  2. You will see the GPT OSS chat interface and can start a conversation.

Troubleshooting

| Issue | Solution |
| --- | --- |
| NVIDIA runtime not found | Install the NVIDIA Container Toolkit: sudo apt install nvidia-container-toolkit && sudo systemctl restart docker |
| Port 8080 already in use | Stop the other service using that port |
| Container keeps restarting | Check logs: docker compose logs --tail=200 |
| Insufficient GPU memory | The 20B model needs substantial GPU memory; try a smaller model variant |

Step 2: Open Service Link {#preview_service type=preview required=false config=devices/preview.yaml}

Use this step to open the Jetson service URL directly in a new browser tab.

Wiring

  1. Enter Jetson IP in this step.
  2. Click Connect.
  3. The platform opens http://<jetson-ip>:8080 in a new tab.

Deployment Complete

  1. The service page opens in your browser.
  2. You can return here and click Connect again to reopen it.

Troubleshooting

| Issue | Solution |
| --- | --- |
| Invalid host input | Enter a valid IP or hostname, for example 192.168.1.100 |
| New tab not opened | Allow pop-ups for this site and retry |
| Service page not reachable | Confirm the Jetson service is listening on 8080 and the network is reachable |

Deployment Complete

GPT OSS 20B runtime has been deployed successfully on your Jetson.

Validation Checklist

  1. Step 1 deployment status shows success.
  2. The GPT OSS 20B container stays in running state.
  3. Clicking Connect in Step 2 opens http://<jetson-ip>:8080.