
AI-Sphere-Butler: How to Connect the Doubao Desktop App to the AI All-Round Butler, a New Way to Play (Part 1)

Environment:

AI-Sphere-Butler

VBCABLE 2.1.58

Windows 10 Pro

Doubao desktop app 1.47.4

Ubuntu 22.04

NVIDIA RTX 4070 Ti 12 GB

Python 3.10

Problem description:

How can the Doubao desktop app be connected to AI-Sphere-Butler, so that Doubao's voice drives the AI all-round butler?


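Chat video:

The real-person AI version of Doubao is here: a new way to use the AI all-round butler.

Solution:

1. First install VBCABLE 2.1.58, which is used to capture Doubao's audio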

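"VBCABLE_Driver_Pack45win10" is the VB-CABLE driver package for Windows 10; the "45" in the file name appears to be the pack version. VB-CABLE is a virtual audio cable driver: it adds a playback device named "CABLE Input" and a recording device named "CABLE Output", and whatever an application plays to CABLE Input can be recorded from CABLE Output. Here it is used to turn Doubao's speech output into an audio input that a Python script can capture.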

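After downloading, run the x64 installer.

Continue through the installer until it finishes.

2. Open the Windows sound settings and find "App volume and device preferences".

3. Set the output device of the Doubao app to "CABLE Input".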
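Before writing the client, it can help to confirm that the recording side of the virtual cable is visible from Python. The following is a minimal sketch, not part of AI-Sphere-Butler: `find_cable_input` is a hypothetical helper name, and it assumes the device list has the shape returned by `sounddevice.query_devices()` (entries with `name` and `max_input_channels`). The sample list is made up for illustration; on a real machine, pass `sounddevice.query_devices()` instead.

```python
from typing import Any, Mapping, Optional, Sequence


def find_cable_input(devices: Sequence[Mapping[str, Any]],
                     channels: int = 1) -> Optional[int]:
    """Return the index of the first capture-capable device whose name
    contains "CABLE", or None if no such device is present."""
    for index, dev in enumerate(devices):
        if "CABLE" in dev["name"] and dev["max_input_channels"] >= channels:
            return index
    return None


# Made-up device list for illustration only; real entries come from
# sounddevice.query_devices() and carry more fields.
sample_devices = [
    {"name": "Microphone (Realtek Audio)", "max_input_channels": 2},
    {"name": "CABLE Input (VB-Audio Virtual Cable)", "max_input_channels": 0},
    {"name": "CABLE Output (VB-Audio Virtual Cable)", "max_input_channels": 2},
]

cable_index = find_cable_input(sample_devices)
print(cable_index)  # 2: the playback-only "CABLE Input" entry is skipped
```

If the function returns None on the real device list, the driver is probably not installed correctly or the machine needs a restart.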


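4. Install Python yourself, then install the dependencies:

pip install flask flask-sockets gevent gevent-websocket

The scripts in steps 5 and 6 also import sounddevice, websockets and numpy, so install those as well if they are not already present:

pip install sounddevice websockets numpy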
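A quick way to confirm the environment is ready before running the scripts is to ask Python which modules it can locate. This is only a sketch: `missing_modules` and `REQUIRED` are illustrative names, and the list below covers the third-party imports that appear in the two scripts of this article.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Third-party imports used by Collection.py and websocket_service.py.
REQUIRED = ["sounddevice", "websockets", "numpy"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Run it with the same interpreter that will run the client, since each Python installation has its own set of packages.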

5. Write the client that captures Doubao's audio

Contents of Collection.py:

import asyncio
import sounddevice as sd
import websockets
import numpy as np
import signal
import threading
import time
from collections import deque

INPUT_RATE = 16000
CHANNELS = 1
FRAME_SIZE = 640
WS_URL = "ws://192.168.1.4:8020"  # WebSocket server address
SILENCE_THRESHOLD = 1000

# Set on Ctrl+C so that every loop can exit cleanly.
stop_event = threading.Event()
signal.signal(signal.SIGINT, lambda s, f: stop_event.set())


class AudioBuffer:
    """Thread-safe bounded buffer shared by the audio callback and the sender."""

    def __init__(self, max_frames=20):
        self.buffer = deque(maxlen=max_frames)
        self.lock = threading.Lock()

    def put(self, frame_bytes):
        with self.lock:
            if len(self.buffer) == self.buffer.maxlen:
                self.buffer.popleft()
                print("[BUF] Buffer full, dropping oldest frame")
            self.buffer.append(frame_bytes)

    def get_all(self):
        with self.lock:
            frames = list(self.buffer)
            self.buffer.clear()
            return frames

    def size(self):
        with self.lock:
            return len(self.buffer)


def is_voice(data_np):
    # Mean-square energy of the frame, compared against a fixed threshold.
    energy = np.mean(data_np.astype(np.float32) ** 2)
    return energy > SILENCE_THRESHOLD


def audio_callback(indata, frames, time_info, status, audio_buffer):
    if status:
        print(f"[CAP] Warning: {status}")
    audio_np = indata[:, 0]
    ts = time.time()
    if is_voice(audio_np):
        frame = audio_np.tobytes()
        # print(f"[CAP] Voice frame captured at {ts:.3f}s, energy sufficient")
    else:
        # Quiet frames are replaced with zeros so the stream stays continuous.
        frame = (np.zeros_like(audio_np)).tobytes()
        # print(f"[CAP] Silence frame at {ts:.3f}s")
    audio_buffer.put(frame)


async def sender(ws, audio_buffer):
    while not stop_event.is_set():
        frames = audio_buffer.get_all()
        if not frames:
            await asyncio.sleep(0.005)
            continue
        for frame in frames:
            try:
                await ws.send(frame)
                # print(f"[SND] Sent frame size={len(frame)} at {time.time():.3f}s, buffer size={audio_buffer.size()}")
            except Exception as e:
                print(f"[SND] Send error: {e}")
                stop_event.set()
                return


async def capture_and_send(ws):
    audio_buffer = AudioBuffer(20)

    # Prefer the VB-CABLE recording device; fall back to the default input.
    device_index = None
    devices = sd.query_devices()
    for i, d in enumerate(devices):
        if "CABLE" in d['name'] and d['max_input_channels'] >= CHANNELS:
            device_index = i
            break
    if device_index is None:
        device_index = sd.default.device[0]
    print(f"[SYS] Using device #{device_index}: {devices[device_index]['name']}")

    send_task = asyncio.create_task(sender(ws, audio_buffer))
    with sd.InputStream(
        samplerate=INPUT_RATE,
        device=device_index,
        channels=CHANNELS,
        dtype='int16',
        blocksize=FRAME_SIZE,
        callback=lambda indata, frames, time_info, status: audio_callback(
            indata, frames, time_info, status, audio_buffer
        ),
    ):
        print("[SYS] Recording started.")
        while not stop_event.is_set():
            await asyncio.sleep(0.1)

    send_task.cancel()
    try:
        await send_task
    except asyncio.CancelledError:
        pass
    print("[SYS] Recording stopped.")


async def main():
    print(f"[SYS] Connecting to {WS_URL}")
    try:
        async with websockets.connect(WS_URL) as ws:
            print("[SYS] Connected.")
            await capture_and_send(ws)
    except Exception as e:
        print(f"[ERR] Connection error: {e}")


if __name__ == '__main__':
    asyncio.run(main())
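The constants in Collection.py determine how much audio each WebSocket message carries and how quiet a frame must be before it is zeroed. The sketch below works through that arithmetic using only the standard library. `is_voice_bytes` is a hypothetical stand-in that applies the same mean-square test as `is_voice()` to raw int16 bytes, and the two test frames are made-up constant signals.

```python
import array
import math

INPUT_RATE = 16000        # samples per second, from Collection.py
FRAME_SIZE = 640          # samples per callback block, from Collection.py
SILENCE_THRESHOLD = 1000  # mean-square threshold, from Collection.py
MAX_FRAMES = 20           # AudioBuffer capacity, from Collection.py
BYTES_PER_SAMPLE = 2      # dtype='int16'

frame_ms = FRAME_SIZE * 1000 / INPUT_RATE      # duration of one frame
frame_bytes = FRAME_SIZE * BYTES_PER_SAMPLE    # size of one WebSocket message
buffer_ms = frame_ms * MAX_FRAMES              # audio held before frames drop
rms_threshold = math.sqrt(SILENCE_THRESHOLD)   # threshold as an RMS amplitude


def is_voice_bytes(frame: bytes) -> bool:
    """Apply the mean-square energy test to raw int16 PCM bytes."""
    samples = array.array("h", frame)
    energy = sum(s * s for s in samples) / len(samples)
    return energy > SILENCE_THRESHOLD


# Made-up constant-amplitude frames for illustration.
quiet = array.array("h", [10] * FRAME_SIZE).tobytes()   # mean square = 100
loud = array.array("h", [200] * FRAME_SIZE).tobytes()   # mean square = 40000

print(frame_ms, frame_bytes, buffer_ms, round(rms_threshold, 1))
print(is_voice_bytes(quiet), is_voice_bytes(loud))
```

Each message is therefore 40 ms of audio in 1280 bytes, the buffer holds at most 800 ms, and a frame counts as voice once its RMS amplitude exceeds roughly 31.6 on the int16 scale. If Doubao's quieter syllables get cut off, lowering SILENCE_THRESHOLD is the first thing to try.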

6. Add the module file websocket_service.py to the main program:

AI-Sphere-Butler\core\server\virtual_human\websocket_service.py

import asyncio
import uuid
import websockets
import multiprocessing
import queue

MAX_QUEUE_SIZE = 10


def enqueue_audio_data(audio_queue, data):
    """Queue a packet; when the queue is full, drop the oldest one first."""
    try:
        audio_queue.put_nowait(data)
    except queue.Full:
        try:
            discarded = audio_queue.get_nowait()
            print("[WSrv] Dropped the oldest audio packet to prevent backlog")
        except queue.Empty:
            pass
        try:
            audio_queue.put_nowait(data)
        except queue.Full:
            # print("[WSrv] Queue full, dropping the current audio packet")
            pass


async def audio_handler(websocket, audio_queue: multiprocessing.Queue):
    # Each connection gets its own session id so packets can be told apart.
    session_id = str(uuid.uuid4())
    # print(f"[WSrv] Session {session_id} connected")
    try:
        async for raw in websocket:
            if isinstance(raw, (bytes, bytearray)):
                enqueue_audio_data(audio_queue, (session_id, raw))
                # print(f"[WSrv] Queued {len(raw)} bytes from {session_id}")
            else:
                # print(f"[WSrv] Ignored non-binary message from {session_id}")
                pass
    except websockets.exceptions.ConnectionClosed:
        pass
    finally:
        # print(f"[WSrv] Session {session_id} disconnected")
        pass


async def run_server(audio_queue: multiprocessing.Queue, host='0.0.0.0', port=8020):
    async def handler(websocket):
        await audio_handler(websocket, audio_queue)

    server = await websockets.serve(handler, host, port)
    # print(f"[WSrv] Listening on ws://{host}:{port}")
    await asyncio.Future()  # run until the process is stopped


if __name__ == "__main__":
    q = multiprocessing.Queue(maxsize=MAX_QUEUE_SIZE)
    asyncio.run(run_server(q))
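The server keeps latency low by discarding the oldest packet whenever the queue is full, rather than blocking the WebSocket handler. The sketch below illustrates that drop-oldest policy in isolation. It is an assumption-laden demo: it uses the thread-based `queue.Queue` as a stand-in for `multiprocessing.Queue` (both raise `queue.Full` and `queue.Empty`), and `enqueue_drop_oldest` is an illustrative name mirroring `enqueue_audio_data`.

```python
import queue


def enqueue_drop_oldest(q: "queue.Queue", item) -> None:
    """Insert item; if the queue is full, discard the oldest entry first."""
    try:
        q.put_nowait(item)
    except queue.Full:
        try:
            q.get_nowait()      # make room by dropping the oldest item
        except queue.Empty:
            pass                # a consumer emptied the queue in the meantime
        try:
            q.put_nowait(item)
        except queue.Full:
            pass                # another producer took the slot; drop this item


q = queue.Queue(maxsize=3)
for packet in range(5):         # packets 0..4 arrive, but only 3 fit
    enqueue_drop_oldest(q, packet)

remaining = []
while not q.empty():
    remaining.append(q.get_nowait())
print(remaining)  # [2, 3, 4]: the two oldest packets were discarded
```

With a real `multiprocessing.Queue`, items pass through a background feeder thread, so `get_nowait()` can briefly raise `queue.Empty` right after a put. That is one reason the original function guards both the get and the second put.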

7. Run the capture client and the AI-Sphere-Butler service
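If the client prints a connection error, a first check is whether the service port is reachable from the Windows machine at all. The following is a minimal sketch using only the standard library. `port_open` is a hypothetical helper, and the demo runs against a throwaway local listener so that it works without the real service; it tests TCP reachability only, not the WebSocket handshake.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Self-contained demonstration against a temporary local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))     # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

reachable_while_listening = port_open("127.0.0.1", demo_port)
listener.close()
reachable_after_close = port_open("127.0.0.1", demo_port)

print(reachable_while_listening, reachable_after_close)
```

In the setup described here, the equivalent check would be `port_open("192.168.1.4", 8020)`, run on the Windows machine after the AI-Sphere-Butler service has started and before launching Collection.py.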

8. You can now chat with Doubao, and its voice drives the AI all-round butler's digital human to speak.
