Split frontend and backend; rewrite the backend in Python

lengbone 2026-03-28 16:32:51 +08:00
parent a5ee31a8d0
commit 05c934deba
114 changed files with 548 additions and 21 deletions


@ -1,7 +1,7 @@
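# Interactive AIGC Scenario: AIGC Demo
This demo is a simplified version. If you need the 1.5.x UI, switch to the 1.5.1 branch.
To get the demo running, you don't need to care about the code implementation; just fill in the scene information in `Server/scenes/*.json` as needed.
To get the demo running, you don't need to care about the code implementation; just fill in the scene information in `backend/scenes/*.json` as needed.
## Introduction
- In AIGC conversation scenarios, the Volcengine AIGC-RTC Server cloud service integrates RTC audio/video stream processing, ASR speech recognition, LLM API invocation, and TTS speech synthesis to provide an end-to-end AIGC pipeline based on streaming speech.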
@ -9,16 +9,20 @@
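- In addition, Volcengine RTC offers mature audio 3A processing, video processing, and large-scale audio/video chat capabilities, making it easier for AIGC products to support multimodal interaction, multi-party interaction, and similar scenarios while keeping interactions natural and efficient.
## [Must read] Environment setup
**Node version: 16.0+**
> This project has been restructured as a monorepo: the frontend lives in `frontend/` and the Python backend in `backend/`.
**Frontend environment: Node 16.0+**
**Backend environment: Python 3.9+**
### 1. Runtime environment
You need two terminals, one to start the server and one to start the frontend page.
You need two terminals, one to start the backend service and one to start the frontend page.
### 2. Enable services
Enable the ASR, TTS, LLM, and RTC services. See [Enable services](https://www.volcengine.com/docs/6348/1315561?s=g) for how to authorize and enable them.
### 3. Scene configuration
`Server/scenes/*.json`
`backend/scenes/*.json`
You can define your own scenes and fill in the parameters required in `SceneConfig`, `AccountConfig`, `RTCConfig`, and `VoiceChat` according to the template.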
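@ -35,27 +39,18 @@ The demo uses the `Custom` scene as an example; you can add your own scenes.
- You can use [Quickly run the demo](https://console.volcengine.com/rtc/aigc/run?s=g) to obtain the parameters quickly. Once it runs, click the `接入 API` ("Integrate API") button in the upper-right corner, copy the generated code, and paste it into the JSON configuration file.
## Quick start
Note that both the server and the web client must be started. The steps are as follows:
### Server
Go to the project root directory
#### Install dependencies
### Backend service (Python FastAPI)
```shell
cd Server
yarn
```
#### Run the project
```shell
yarn dev
cd backend
pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 3001 --reload
```
### Frontend page
Go to the project root directory
#### Install dependencies
```shell
yarn
```
#### Run the project
```shell
yarn dev
cd frontend
npm install
npm run dev
```
### FAQ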
@ -67,7 +62,7 @@ yarn dev
| **[StartVoiceChat]Failed(Reason: The task has been started. Please do not call the startup task interface repeatedly.)** error | If RoomId and UserId are set to fixed values, calling startAgent repeatedly causes this error. Call stopAgent first, then call startAgent again. |
| Why do the microphone and camera fail to start? The browser reports `TypeError: Cannot read properties of undefined (reading 'getUserMedia')` | Check whether the current page is a [secure context](https://developer.mozilla.org/zh-CN/docs/Web/Security/Secure_Contexts) (in short, whether it is served from `localhost` or over HTTPS). Browsers [restrict](https://developer.mozilla.org/zh-CN/docs/Web/Security/Secure_Contexts/features_restricted_to_secure_contexts) `getUserMedia` to secure contexts. |
| Why are my devices not working even though the microphone and camera are fine? | Device permissions may not have been granted. See [Troubleshooting device permission failures on Web](https://www.volcengine.com/docs/6348/1356355?s=g) for details. |
| The API call returns the error "Invalid 'Authorization' header, Pls check your authorization header" | The AK/SK in `Server/app.js` is incorrect |
| The API call returns the error "Invalid 'Authorization' header, Pls check your authorization header" | The AK/SK in `backend/scenes/*.json` is incorrect |
| What is RTC? | **R**eal **T**ime **C**ommunication. For the concept of RTC, see the [official documentation](https://www.volcengine.com/docs/6348/66812?s=g). |
| I'm not sure what a main account or a sub-account is | See the [official concepts](https://www.volcengine.com/docs/6257/64963?hyperlink_open_type=lark.open_in_browser&s=g). |
| I already have my own server. How do I make the frontend call it? | Change the request domain and endpoints via `AIGC_PROXY_HOST` in `src/config/index.ts`, and update the API parameter configuration `APIS_CONFIG` in `src/app/api.ts` |

49
backend/README.md Normal file

@ -0,0 +1,49 @@
# AIGC Backend (Python FastAPI)
A Python rewrite of the original Node.js + Koa service, built on the FastAPI framework.
## Requirements
- Python 3.9+
## Install dependencies
```shell
pip install -r requirements.txt
```
## Scene configuration
Edit `scenes/*.json` and fill in the following fields:
| Field | Description |
|------|------|
| `AccountConfig.accessKeyId` | Volcengine AK, available from https://console.volcengine.com/iam/keymanage/ |
| `AccountConfig.secretKey` | Volcengine SK |
| `RTCConfig.AppId` | RTC application ID |
| `RTCConfig.AppKey` | RTC application key, used to generate the Token automatically |
| `VoiceChat.*` | AIGC-related configuration; see https://www.volcengine.com/docs/6348/1558163 |
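
As a rough illustration of the fields in the table above, the sketch below checks a scene dictionary for empty required fields before the service is started. The helper name `missing_scene_fields` and the choice of which fields count as required are assumptions made for illustration (for example, `RTCConfig.AppKey` only matters when the Token is generated automatically); the backend performs its own checks at request time.

```python
from typing import Any, Dict, List

# Assumed set of fields to check, taken from the table above.
REQUIRED_FIELDS = [
    ("AccountConfig", "accessKeyId"),
    ("AccountConfig", "secretKey"),
    ("RTCConfig", "AppId"),
    ("RTCConfig", "AppKey"),
]


def missing_scene_fields(scene: Dict[str, Any]) -> List[str]:
    """Return dotted names of required fields that are absent or empty."""
    missing = []
    for section, key in REQUIRED_FIELDS:
        value = scene.get(section, {}).get(key)
        if not value:
            missing.append(f"{section}.{key}")
    return missing


# Dummy scene with one empty field and one absent field.
example_scene = {
    "AccountConfig": {"accessKeyId": "AK-example", "secretKey": ""},
    "RTCConfig": {"AppId": "app-example"},
}
print(missing_scene_fields(example_scene))
```

Running a check like this over each file in `scenes/` before launching `uvicorn` is one way to catch an unfilled template early.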
## Start the service
```shell
uvicorn main:app --host 0.0.0.0 --port 3001 --reload
```
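The service listens on `http://localhost:3001` once started.
## API reference
### POST /getScenes
Returns the list of all scenes, generating RoomId/UserId/Token automatically when they are not configured in the JSON.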
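
The fallback for missing IDs can be sketched as follows: values present in the JSON are kept, and empty RoomId/UserId values are replaced with random UUIDs. `fill_rtc_ids` is a hypothetical helper rather than a function from the backend, and it simplifies the real condition; Token generation is left out because it depends on the AppKey-based signing in `token.py`.

```python
import uuid
from typing import Dict


def fill_rtc_ids(rtc_config: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of rtc_config with empty RoomId/UserId replaced by UUIDs."""
    filled = dict(rtc_config)
    for field in ("RoomId", "UserId"):
        if not filled.get(field):
            filled[field] = str(uuid.uuid4())
    return filled


# Dummy config: RoomId is empty, UserId is already set.
result = fill_rtc_ids({"AppId": "app-example", "RoomId": "", "UserId": "user-1"})
print(result["UserId"])  # the configured value is kept
print(result["RoomId"])  # a freshly generated UUID string
```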
### POST /proxy?Action={Action}&Version={Version}
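Proxies requests to the Volcengine RTC OpenAPI.
Supported Actions:
- `StartVoiceChat`: start a voice chat
- `StopVoiceChat`: stop a voice chat
The request body must include a `SceneID` field, which matches the name (without extension) of a JSON file under `scenes/`.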
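
To make the request shape concrete, the sketch below builds the URL and JSON body for a `/proxy` call without sending it. `build_proxy_request` is a hypothetical helper; the `Custom` scene name and the `2024-12-01` default for `Version` come from this commit's demo scene and `main.py`, and the base URL assumes the default port.

```python
import json
from typing import Tuple
from urllib.parse import urlencode


def build_proxy_request(base_url: str, action: str, scene_id: str,
                        version: str = "2024-12-01") -> Tuple[str, bytes]:
    """Return the (url, body) pair for a POST to the /proxy endpoint."""
    query = urlencode({"Action": action, "Version": version})
    url = f"{base_url.rstrip('/')}/proxy?{query}"
    body = json.dumps({"SceneID": scene_id}).encode("utf-8")
    return url, body


url, body = build_proxy_request("http://localhost:3001", "StartVoiceChat", "Custom")
print(url)
print(body.decode("utf-8"))
```

The resulting pair can be handed to any HTTP client, for example `httpx.post(url, content=body, headers={"Content-Type": "application/json"})`.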

190
backend/main.py Normal file

@ -0,0 +1,190 @@
"""
Copyright 2025 Beijing Volcano Engine Technology Co., Ltd. All Rights Reserved.
SPDX-license-identifier: BSD-3-Clause
FastAPI backend migrated from Server/app.js (Node.js + Koa)
"""
import json
import os
import time
import uuid
from pathlib import Path
import httpx
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from signer import Signer
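# NOTE: the local module name `token` collides with the standard library's
# `token` module; if this import fails, renaming token.py is one possible fix.
from token import AccessToken, privileges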
app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
SCENES_DIR = Path(__file__).parent / "scenes"
def load_scenes() -> dict:
    scenes = {}
    for p in SCENES_DIR.glob("*.json"):
        with open(p, encoding="utf-8") as f:
            scenes[p.stem] = json.load(f)
    return scenes
Scenes = load_scenes()
def assert_value(value, msg: str):
    if not value or (isinstance(value, str) and " " in value):
        raise ValueError(msg)
def error_response(action: str, message: str):
    return JSONResponse({
        "ResponseMetadata": {
            "Action": action,
            "Error": {"Code": -1, "Message": message},
        }
    })
@app.post("/proxy")
async def proxy(request: Request):
    action = request.query_params.get("Action", "")
    version = request.query_params.get("Version", "2024-12-01")
    try:
        assert_value(action, "Action must not be empty")
        assert_value(version, "Version must not be empty")
        body = await request.json()
        scene_id = body.get("SceneID", "")
        assert_value(scene_id, "SceneID must not be empty; it selects the scene JSON to use")
        json_data = Scenes.get(scene_id)
        if not json_data:
            raise ValueError(f"{scene_id} does not exist; define a JSON file for this scene under backend/scenes first.")
        voice_chat = json_data.get("VoiceChat", {})
        account_config = json_data.get("AccountConfig", {})
        assert_value(account_config.get("accessKeyId"), "AccountConfig.accessKeyId must not be empty")
        assert_value(account_config.get("secretKey"), "AccountConfig.secretKey must not be empty")
        if action == "StartVoiceChat":
            req_body = voice_chat
        elif action == "StopVoiceChat":
            app_id = voice_chat.get("AppId", "")
            room_id = voice_chat.get("RoomId", "")
            task_id = voice_chat.get("TaskId", "")
            assert_value(app_id, "VoiceChat.AppId must not be empty")
            assert_value(room_id, "VoiceChat.RoomId must not be empty")
            assert_value(task_id, "VoiceChat.TaskId must not be empty")
            req_body = {"AppId": app_id, "RoomId": room_id, "TaskId": task_id}
        else:
            req_body = {}
        request_data = {
            "region": "cn-north-1",
            "method": "POST",
            "params": {"Action": action, "Version": version},
            "headers": {
                "Host": "rtc.volcengineapi.com",
                "Content-type": "application/json",
            },
            "body": req_body,
        }
        signer = Signer(request_data, "rtc")
        signer.add_authorization(account_config)
        # Send exactly the bytes the signer hashed into X-Content-Sha256;
        # json=... would let httpx re-serialize the body, possibly differently.
        payload = json.dumps(req_body, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"https://rtc.volcengineapi.com?Action={action}&Version={version}",
                headers=request_data["headers"],
                content=payload,
            )
        return JSONResponse(resp.json())
    except Exception as e:
        # Validation errors and any other failure share the same error envelope.
        return error_response(action, str(e))
@app.post("/getScenes")
async def get_scenes():
    try:
        scenes_list = []
        for scene_name, data in Scenes.items():
            scene_config = data.get("SceneConfig", {})
            rtc_config = data.get("RTCConfig", {})
            voice_chat = data.get("VoiceChat", {})
            app_id = rtc_config.get("AppId", "")
            assert_value(app_id, f"RTCConfig.AppId of scene {scene_name} must not be empty")
            token = rtc_config.get("Token", "")
            user_id = rtc_config.get("UserId", "")
            room_id = rtc_config.get("RoomId", "")
            app_key = rtc_config.get("AppKey", "")
            if app_id and (not token or not user_id or not room_id):
                rtc_config["RoomId"] = voice_chat["RoomId"] = room_id or str(uuid.uuid4())
                rtc_config["UserId"] = user_id = user_id or str(uuid.uuid4())
                if voice_chat.get("AgentConfig") and voice_chat["AgentConfig"].get("TargetUserId"):
                    voice_chat["AgentConfig"]["TargetUserId"][0] = rtc_config["UserId"]
                assert_value(app_key, f"AppKey of scene {scene_name} must not be empty when the Token is generated automatically")
                key = AccessToken(app_id, app_key, rtc_config["RoomId"], rtc_config["UserId"])
                key.add_privilege(privileges["PrivSubscribeStream"], 0)
                key.add_privilege(privileges["PrivPublishStream"], 0)
                key.expire_time(int(time.time()) + 24 * 3600)
                rtc_config["Token"] = key.serialize()
            scene_config["id"] = scene_name
            scene_config["botName"] = voice_chat.get("AgentConfig", {}).get("UserId")
            scene_config["isInterruptMode"] = voice_chat.get("Config", {}).get("InterruptMode") == 0
            scene_config["isVision"] = (
                voice_chat.get("Config", {}).get("LLMConfig", {}).get("VisionConfig", {}).get("Enable")
            )
            scene_config["isScreenMode"] = (
                voice_chat.get("Config", {}).get("LLMConfig", {})
                .get("VisionConfig", {}).get("SnapshotConfig", {}).get("StreamType") == 1
            )
            scene_config["isAvatarScene"] = (
                voice_chat.get("Config", {}).get("AvatarConfig", {}).get("Enabled")
            )
            scene_config["avatarBgUrl"] = (
                voice_chat.get("Config", {}).get("AvatarConfig", {}).get("BackgroundUrl")
            )
            # Never expose the AppKey to the frontend.
            rtc_out = {k: v for k, v in rtc_config.items() if k != "AppKey"}
            scenes_list.append({
                "scene": scene_config,
                "rtc": rtc_out,
            })
        return JSONResponse({
            "ResponseMetadata": {"Action": "getScenes"},
            "Result": {"scenes": scenes_list},
        })
    except ValueError as e:
        return JSONResponse({
            "ResponseMetadata": {
                "Action": "getScenes",
                "Error": {"Code": -1, "Message": str(e)},
            }
        })
if __name__ == "__main__":
    import uvicorn
    uvicorn.run("main:app", host="0.0.0.0", port=3001, reload=True)

4
backend/requirements.txt Normal file

@ -0,0 +1,4 @@
fastapi>=0.110.0
uvicorn[standard]>=0.29.0
httpx>=0.27.0
python-multipart>=0.0.9


@ -0,0 +1,75 @@
{
"SceneConfig": {
"icon": "https://lf3-rtc-demo.volccdn.com/obj/rtc-aigc-assets/DoubaoAvatar.png",
"name": "自定义助手"
},
"AccountConfig": {
"accessKeyId": "",
"secretKey": ""
},
"RTCConfig": {
"AppId": "",
"AppKey": "",
"RoomId": "",
"UserId": "",
"Token": ""
},
"VoiceChat": {
"AppId": "",
"RoomId": "",
"TaskId": "",
"AgentConfig": {
"TargetUserId": [
""
],
"WelcomeMessage": "你好,我是小宁,有什么需要帮忙的吗?",
"UserId": "",
"EnableConversationStateCallback": true
},
"Config": {
"ASRConfig": {
"Provider": "volcano",
"ProviderParams": {
"Mode": "smallmodel",
"AppId": "",
"Cluster": "volcengine_streaming_common"
}
},
"TTSConfig": {
"Provider": "volcano",
"ProviderParams": {
"app": {
"appid": "",
"cluster": "volcano_tts"
},
"audio": {
"voice_type": "BV001_streaming",
"speed_ratio": 1,
"pitch_ratio": 1,
"volume_ratio": 1
}
}
},
"LLMConfig": {
"Mode": "ArkV3",
"EndPointId": "",
"SystemMessages": [
"你是小宁,性格幽默又善解人意。你在表达时需简明扼要,有自己的观点。"
],
"VisionConfig": {
"Enable": false
}
},
"AvatarConfig": {
"Enabled": false,
"AvatarType": "3min",
"AvatarRole": "250623-zhibo-linyunzhi",
"BackgroundUrl": "",
"VideoBitrate": 2000,
"AvatarAppID": "",
"AvatarToken": ""
},
"InterruptMode": 0
}
}
}

112
backend/signer.py Normal file

@ -0,0 +1,112 @@
"""
Copyright 2025 Beijing Volcano Engine Technology Co., Ltd. All Rights Reserved.
SPDX-license-identifier: BSD-3-Clause
Migrated from @volcengine/openapi Signer (AWS SigV4 compatible)
Reference: https://www.volcengine.com/docs/6348/69828
"""
import hashlib
import hmac
import json
from datetime import datetime, timezone
from urllib.parse import quote
def _sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()
def _hmac_sha256(key: bytes, data: str) -> bytes:
    return hmac.new(key, data.encode("utf-8"), hashlib.sha256).digest()
def _get_signing_key(secret_key: str, date_str: str, region: str, service: str) -> bytes:
    # Volcengine's documented derivation starts from the raw secret key:
    # kDate = HMAC(kSecret, Date), with no algorithm prefix on the key.
    k_date = _hmac_sha256(secret_key.encode("utf-8"), date_str)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    k_signing = _hmac_sha256(k_service, "request")
    return k_signing
class Signer:
    """
    Signs requests to Volcengine OpenAPI using AWS SigV4-compatible signing.
    """
    def __init__(self, request_data: dict, service: str):
        """
        request_data: {
            region: str,
            method: str,
            params: dict,   # query params (Action, Version, ...)
            headers: dict,
            body: dict,
        }
        service: e.g. "rtc"
        """
        self.region = request_data.get("region", "cn-north-1")
        self.method = request_data.get("method", "POST").upper()
        self.params = request_data.get("params", {})
        self.headers = request_data.get("headers", {})
        self.body = request_data.get("body", {})
        self.service = service
    def add_authorization(self, account_config: dict):
        """
        Computes and injects Authorization + X-Date headers into self.headers.
        account_config: { accessKeyId: str, secretKey: str }
        """
        access_key = account_config["accessKeyId"]
        secret_key = account_config["secretKey"]
        now = datetime.now(timezone.utc)
        date_str = now.strftime("%Y%m%d")
        datetime_str = now.strftime("%Y%m%dT%H%M%SZ")
        self.headers["X-Date"] = datetime_str
        self.headers["X-Content-Sha256"] = _sha256_hex(
            json.dumps(self.body, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
        )
        # Canonical headers: sorted lowercase header names
        signed_header_names = sorted(k.lower() for k in self.headers)
        canonical_headers = "".join(
            f"{k}:{self.headers[next(h for h in self.headers if h.lower() == k)]}\n"
            for k in signed_header_names
        )
        signed_headers_str = ";".join(signed_header_names)
        # Canonical query string
        sorted_params = sorted(self.params.items())
        canonical_qs = "&".join(
            f"{quote(str(k), safe='')}={quote(str(v), safe='')}"
            for k, v in sorted_params
        )
        # Canonical request
        body_hash = self.headers["X-Content-Sha256"]
        canonical_request = "\n".join([
            self.method,
            "/",
            canonical_qs,
            canonical_headers,
            signed_headers_str,
            body_hash,
        ])
        credential_scope = f"{date_str}/{self.region}/{self.service}/request"
        string_to_sign = "\n".join([
            "HMAC-SHA256",
            datetime_str,
            credential_scope,
            _sha256_hex(canonical_request.encode("utf-8")),
        ])
        signing_key = _get_signing_key(secret_key, date_str, self.region, self.service)
        signature = hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
        self.headers["Authorization"] = (
            f"HMAC-SHA256 Credential={access_key}/{credential_scope}, "
            f"SignedHeaders={signed_headers_str}, "
            f"Signature={signature}"
        )
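
As a standalone illustration of how a signing key of this kind is derived, the sketch below chains HMAC-SHA256 over the date, region, service, and the literal `"request"`, then signs a placeholder string. All inputs are dummy values, `derive_signing_key` is a hypothetical name, and the sketch assumes the chain starts from the raw secret key as in Volcengine's published signing description; it does not use the `Signer` class.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, data: str) -> bytes:
    """One HMAC-SHA256 step: the previous key signs the next component."""
    return hmac.new(key, data.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Chain HMAC-SHA256 over date, region, service, and the literal "request"."""
    key = secret_key.encode("utf-8")
    for part in (date, region, service, "request"):
        key = hmac_sha256(key, part)
    return key


# Dummy inputs only; a real call would use the account's secret key.
signing_key = derive_signing_key("dummy-secret", "20260328", "cn-north-1", "rtc")
signature = hmac.new(signing_key, b"placeholder string to sign", hashlib.sha256).hexdigest()
print(len(signing_key), len(signature))
```

Because the date is part of the chain, a key derived for one day cannot be reused to sign requests dated on another day.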

102
backend/token.py Normal file

@ -0,0 +1,102 @@
"""
Copyright 2025 Beijing Volcano Engine Technology Co., Ltd. All Rights Reserved.
SPDX-license-identifier: BSD-3-Clause
Migrated from Server/token.js
"""
import hashlib
import hmac
import random
import struct
import time
import base64
VERSION = "001"
VERSION_LENGTH = 3
APP_ID_LENGTH = 24
_random_nonce = random.randint(0, 0xFFFFFFFF)
privileges = {
    "PrivPublishStream": 0,
    "privPublishAudioStream": 1,
    "privPublishVideoStream": 2,
    "privPublishDataStream": 3,
    "PrivSubscribeStream": 4,
}
class ByteBuf:
    def __init__(self):
        self._buf = bytearray()
    def put_uint16(self, v: int) -> "ByteBuf":
        self._buf += struct.pack("<H", v)
        return self
    def put_uint32(self, v: int) -> "ByteBuf":
        self._buf += struct.pack("<I", v)
        return self
    def put_bytes(self, b: bytes) -> "ByteBuf":
        self.put_uint16(len(b))
        self._buf += b
        return self
    def put_string(self, s: str) -> "ByteBuf":
        return self.put_bytes(s.encode("utf-8"))
    def put_tree_map_uint32(self, m: dict) -> "ByteBuf":
        if not m:
            self.put_uint16(0)
            return self
        self.put_uint16(len(m))
        # Emit keys in ascending order, as a tree map would; a plain dict
        # would otherwise yield them in insertion order.
        for key, value in sorted(m.items()):
            self.put_uint16(int(key))
            self.put_uint32(int(value))
        return self
    def pack(self) -> bytes:
        return bytes(self._buf)
def _encode_hmac(key: str, message: bytes) -> bytes:
    return hmac.new(key.encode("utf-8"), message, hashlib.sha256).digest()
class AccessToken:
    def __init__(self, app_id: str, app_key: str, room_id: str, user_id: str):
        self.app_id = app_id
        self.app_key = app_key
        self.room_id = room_id
        self.user_id = user_id
        self.issued_at = int(time.time())
        self.nonce = _random_nonce
        self.expire_at = 0
        self._privileges: dict = {}
    def add_privilege(self, privilege: int, expire_timestamp: int):
        self._privileges[privilege] = expire_timestamp
        if privilege == privileges["PrivPublishStream"]:
            self._privileges[privileges["privPublishVideoStream"]] = expire_timestamp
            self._privileges[privileges["privPublishAudioStream"]] = expire_timestamp
            self._privileges[privileges["privPublishDataStream"]] = expire_timestamp
    def expire_time(self, expire_timestamp: int):
        self.expire_at = expire_timestamp
    def _pack_msg(self) -> bytes:
        buf = ByteBuf()
        buf.put_uint32(self.nonce)
        buf.put_uint32(self.issued_at)
        buf.put_uint32(self.expire_at)
        buf.put_string(self.room_id)
        buf.put_string(self.user_id)
        buf.put_tree_map_uint32(self._privileges)
        return buf.pack()
    def serialize(self) -> str:
        msg = self._pack_msg()
        signature = _encode_hmac(self.app_key, msg)
        content = ByteBuf().put_bytes(msg).put_bytes(signature).pack()
        return VERSION + self.app_id + base64.b64encode(content).decode("utf-8")
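
To check the layout that `serialize` produces, the sketch below unpacks a token back into its header fields and verifies the signature. It assumes the format shown above: a 3-character version, a 24-character app ID, then base64 content holding a length-prefixed message and a length-prefixed HMAC-SHA256 signature, all little-endian. `parse_token` and the sample values are hypothetical and are not part of `token.py`.

```python
import base64
import hashlib
import hmac
import struct


def parse_token(token: str, app_key: str) -> dict:
    """Split a token into its fields and verify its HMAC-SHA256 signature."""
    version, app_id = token[:3], token[3:27]
    content = base64.b64decode(token[27:])
    # Content is: uint16 length + message, then uint16 length + signature.
    (msg_len,) = struct.unpack_from("<H", content, 0)
    msg = content[2:2 + msg_len]
    (sig_len,) = struct.unpack_from("<H", content, 2 + msg_len)
    signature = content[4 + msg_len:4 + msg_len + sig_len]
    # The message starts with three uint32 values.
    nonce, issued_at, expire_at = struct.unpack_from("<III", msg, 0)
    expected = hmac.new(app_key.encode("utf-8"), msg, hashlib.sha256).digest()
    return {
        "version": version,
        "app_id": app_id,
        "nonce": nonce,
        "issued_at": issued_at,
        "expire_at": expire_at,
        "signature_ok": hmac.compare_digest(signature, expected),
    }


# Build a tiny sample token by hand (dummy values) to exercise the parser:
# three uint32 fields, then empty room ID, empty user ID, and no privileges.
app_id, app_key = "a" * 24, "dummy-app-key"
msg = struct.pack("<III", 7, 1000, 2000) + struct.pack("<H", 0) * 3
sig = hmac.new(app_key.encode("utf-8"), msg, hashlib.sha256).digest()
content = struct.pack("<H", len(msg)) + msg + struct.pack("<H", len(sig)) + sig
sample = "001" + app_id + base64.b64encode(content).decode("utf-8")
print(parse_token(sample, app_key))
```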

Some files were not shown because too many files have changed in this diff.