
Voice Chat

Real-time voice interactions with agents using LiveKit for audio streaming and text-to-speech narration.

Overview

Voice chat adds a spoken interface to agent conversations. The agent:

  1. Listens to your voice input via microphone
  2. Transcribes speech to text
  3. Processes the request (calling MCP tools as needed)
  4. Responds with synthesized speech
  5. Narrates tool calls and results for transparency
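The five steps above can be read as one conversational turn. The following is a minimal sketch of that flow, not the library's implementation: `TurnStages`, `runTurn`, and the three stage functions are hypothetical names introduced here for illustration, and the real pipeline streams audio over LiveKit rather than passing a single buffer.

```typescript
// Hypothetical stage signatures; none of these come from @rickydata/react.
interface TurnStages {
  // Step 2: turn captured audio into text.
  transcribe: (audio: Uint8Array) => Promise<string>;
  // Step 3: produce a reply, reporting each tool call through onToolCall.
  process: (text: string, onToolCall: (toolName: string) => void) => Promise<string>;
  // Step 4: synthesize the reply as speech.
  speak: (text: string) => Promise<void>;
}

// Runs one turn and returns a log of what happened, in order.
async function runTurn(audio: Uint8Array, stages: TurnStages): Promise<string[]> {
  const log: string[] = [];

  const text = await stages.transcribe(audio);
  log.push(`user: ${text}`);

  const reply = await stages.process(text, (toolName) => {
    // Step 5: surface each tool call as it happens.
    log.push(`tool: ${toolName}`);
  });

  await stages.speak(reply);
  log.push(`agent: ${reply}`);
  return log;
}
```

Injecting the stages keeps the sketch testable with stubs; in the real hook these concerns are handled internally and exposed through `phase`, `transcripts`, and `toolCalls`.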

React integration

import { useAgentVoiceChat } from '@rickydata/react';

function VoiceChat() {
  const {
    isConnected,
    phase,       // 'idle' | 'connecting' | 'listening' | 'thinking' | 'speaking'
    transcripts, // conversation transcript
    toolCalls,   // tools the agent called during this turn
    connect,
    disconnect,
  } = useAgentVoiceChat({
    agentId: 'erc8004-expert',
  });

  return (
    <div>
      <p>Status: {phase}</p>

      <button onClick={isConnected ? disconnect : connect}>
        {isConnected ? 'End call' : 'Start voice chat'}
      </button>

      <div>
        {transcripts.map((t, i) => (
          <p key={i}>
            <strong>{t.speaker}:</strong> {t.text}
          </p>
        ))}
      </div>

      {toolCalls.length > 0 && (
        <div>
          <h4>Tools used</h4>
          {toolCalls.map((tc, i) => (
            <p key={i}>{tc.name}: {tc.status}</p>
          ))}
        </div>
      )}
    </div>
  );
}

Voice phases

The phase property tracks the current state of the voice interaction:

Phase        Description
idle         Not connected
connecting   Establishing LiveKit connection
listening    Microphone active, waiting for speech
thinking     Agent is processing (may be calling tools)
speaking     Agent is speaking the response

Use computeVoicePhase() to derive the phase from raw connection state if building a custom UI.
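To illustrate what deriving a phase from raw state can look like, here is a minimal sketch. It does not reproduce `computeVoicePhase()`, whose signature is not shown on this page: `RawVoiceState` and `derivePhase` are hypothetical, and the real `VoiceConnectionState` may carry different fields. The phase union is a local copy of the values in the table above.

```typescript
// Local copy of the documented phase values.
type Phase = 'idle' | 'connecting' | 'listening' | 'thinking' | 'speaking';

// Hypothetical raw state; field names are assumptions for this sketch.
interface RawVoiceState {
  connecting: boolean;    // connection handshake in progress
  connected: boolean;     // room joined, microphone available
  agentThinking: boolean; // agent is processing or calling tools
  agentSpeaking: boolean; // agent audio is playing
}

// Checks are ordered from connection-level to turn-level state, so a
// stale agent flag can never mask a dropped connection.
function derivePhase(state: RawVoiceState): Phase {
  if (state.connecting) return 'connecting';
  if (!state.connected) return 'idle';
  if (state.agentSpeaking) return 'speaking';
  if (state.agentThinking) return 'thinking';
  return 'listening';
}
```

Keeping the derivation in one pure function means a custom UI can render from a single value instead of combining several booleans in each component.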

Narration

Tool calls are narrated so you know what the agent is doing:

import { speakNarration, createNarration, humanizeToolName } from '@rickydata/react';

// Create a narration event for a tool call
const narration = createNarration({
  toolName: 'brave_web_search',
  status: 'calling',
});

// Speak it (uses browser TTS)
speakNarration(narration.text);

// humanizeToolName converts snake_case to readable form
humanizeToolName('brave_web_search'); // "Brave Web Search"
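For readers curious how a snake_case name can be turned into a readable label, the following is a minimal sketch of one approach. `humanizeSnakeCase` is a hypothetical stand-in written for this page; the library's `humanizeToolName` may handle more cases (acronyms, prefixes, other separators) than this does.

```typescript
// Hypothetical helper: splits on underscores and capitalizes each word.
function humanizeSnakeCase(name: string): string {
  return name
    .split('_')
    .filter((part) => part.length > 0) // ignore empty parts from doubled or leading underscores
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join(' ');
}

console.log(humanizeSnakeCase('brave_web_search')); // "Brave Web Search"
```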

Narration settings

Constant                 Value   Description
NARRATION_VOICE_RATE     1.0     TTS speech rate
NARRATION_VOICE_VOLUME   0.7     TTS volume (0-1)
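As a rough illustration of how rate and volume values like these map onto browser TTS, here is a sketch that assumes the Web Speech API (`SpeechSynthesisUtterance`) is the underlying mechanism; this page only says "browser TTS", so that is an assumption. `buildUtteranceSettings` and `speakWithSettings` are hypothetical helpers, and the two constants are local copies of the documented values rather than imports from the package.

```typescript
// Local copies of the documented defaults (assumed, not imported).
const RATE = 1.0;
const VOLUME = 0.7;

interface UtteranceSettings {
  text: string;
  rate: number;
  volume: number;
}

// Clamps values to the ranges the Web Speech API accepts:
// rate 0.1 to 10, volume 0 to 1.
function buildUtteranceSettings(text: string, rate = RATE, volume = VOLUME): UtteranceSettings {
  const clamp = (v: number, lo: number, hi: number) => Math.min(hi, Math.max(lo, v));
  return { text, rate: clamp(rate, 0.1, 10), volume: clamp(volume, 0, 1) };
}

// Speaks the text if the Web Speech API is available; returns false otherwise
// (for example during server-side rendering).
function speakWithSettings(settings: UtteranceSettings): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(settings.text);
  utterance.rate = settings.rate;
  utterance.volume = settings.volume;
  g.speechSynthesis.speak(utterance);
  return true;
}
```

A volume below 1 plausibly keeps narration quieter than the agent's main reply, though the page does not state the reason for the 0.7 default.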

Types

import type {
  UseAgentVoiceChatOptions,
  UseAgentVoiceChatResult,
  NarrationEvent,
  VoiceConnectionState,
  VoicePhase,
  VoiceTranscript,
  VoiceToolCallInfo,
} from '@rickydata/react';

UseAgentVoiceChatOptions

Option      Type      Description
agentId     string    Agent to connect to
sessionId   string?   Resume an existing session
model       string?   Override default model

VoiceTranscript

Field       Type               Description
speaker     'user' | 'agent'   Who said this
text        string             The transcribed/synthesized text
timestamp   number             When it was recorded
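As an example of working with this shape, the sketch below orders transcript entries and formats them for display. `TranscriptEntry` is a local interface mirroring the three fields in the table, and `formatTranscript` is a hypothetical helper. The unit of `timestamp` is not stated here, so the code uses it only for relative ordering.

```typescript
// Local mirror of the documented VoiceTranscript fields.
interface TranscriptEntry {
  speaker: 'user' | 'agent';
  text: string;
  timestamp: number; // unit not documented; used only for ordering
}

// Returns display lines in chronological order without mutating the input.
function formatTranscript(entries: TranscriptEntry[]): string[] {
  return [...entries]
    .sort((a, b) => a.timestamp - b.timestamp)
    .map((e) => `${e.speaker === 'user' ? 'You' : 'Agent'}: ${e.text}`);
}

const lines = formatTranscript([
  { speaker: 'agent', text: 'ERC-8004 is a proposed standard.', timestamp: 2 },
  { speaker: 'user', text: 'What is ERC-8004?', timestamp: 1 },
]);
console.log(lines.join('\n'));
```

Copying the array before sorting matters when `entries` comes from React state, since `Array.prototype.sort` sorts in place.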

Next steps