Voice System
In the HoloMIT SDK, Voice is a fundamental element. It is present in virtually all online sessions, acting as the primary medium for users to communicate with each other. The system is designed to be highly optimized and seamlessly integrated across both user avatars and cloud content.
Overview
The Voice subsystem operates from its own dedicated library, Voice.DLL, and can be found at: Assets\UserRepresentation\Voice\.
Optimizations
Voice communication must maintain ultra-low latency and minimal resource overhead. HoloMIT achieves this through:
- Efficient Codecs: Support for lightweight, high-fidelity audio codecs like Speex and Opus.
- Spatial Audio: AudioSources are configured automatically with spatialize = true, spatialBlend = 1.0f, and strict distance constraints (min 4f, max 100f) to simulate realistic 3D soundscapes without manual configuration.
- Multithreading: Thread-safe queues (QueueThreadSafe) are used extensively between reading, encoding, and preparing audio to avoid blocking the main thread.
- Direct DSP Injection: The pipeline uses OnAudioFilterRead to inject decoded audio buffers directly into Unity's audio DSP loop, reducing latency.
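The spatial audio settings, the thread-safe hand-off, and the DSP injection described above can be sketched roughly as follows. This is an illustrative sketch, not the SDK's implementation. DecodedAudioFeed and SpatialVoiceOutput are hypothetical names, .NET's ConcurrentQueue stands in for the SDK's QueueThreadSafe, and mapping "min 4f, max 100f" onto AudioSource.minDistance and AudioSource.maxDistance is an assumption. The Unity-specific part is guarded by UNITY_5_3_OR_NEWER so the buffer logic also compiles outside Unity.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of the HoloMIT SDK): hands decoded sample
// chunks from a decoder thread to the audio thread without locking.
public sealed class DecodedAudioFeed
{
    // ConcurrentQueue is used here as a stand-in for the SDK's QueueThreadSafe.
    private readonly ConcurrentQueue<float[]> chunks = new ConcurrentQueue<float[]>();
    private float[] current; // chunk currently being drained (consumer side only)
    private int offset;      // read position inside 'current'

    // Producer side: safe to call from any thread.
    public void Enqueue(float[] samples)
    {
        if (samples != null && samples.Length > 0) chunks.Enqueue(samples);
    }

    // Consumer side: call from a single thread only. Copies queued samples
    // into 'data', zero-fills whatever cannot be served (underrun), and
    // returns the number of real samples written.
    public int Fill(float[] data)
    {
        int written = 0;
        while (written < data.Length)
        {
            if (current == null || offset >= current.Length)
            {
                if (!chunks.TryDequeue(out current)) { current = null; break; }
                offset = 0;
            }
            int n = Math.Min(data.Length - written, current.Length - offset);
            Array.Copy(current, offset, data, written, n);
            written += n;
            offset += n;
        }
        if (written < data.Length) Array.Clear(data, written, data.Length - written);
        return written;
    }
}

#if UNITY_5_3_OR_NEWER
// Hypothetical component: applies the spatial settings listed above and feeds
// decoded audio into Unity's DSP callback.
[UnityEngine.RequireComponent(typeof(UnityEngine.AudioSource))]
public sealed class SpatialVoiceOutput : UnityEngine.MonoBehaviour
{
    public readonly DecodedAudioFeed Feed = new DecodedAudioFeed();

    void Awake()
    {
        var source = GetComponent<UnityEngine.AudioSource>();
        source.spatialize = true;
        source.spatialBlend = 1.0f;  // fully 3D
        source.minDistance = 4f;     // assumed meaning of "min 4f"
        source.maxDistance = 100f;   // assumed meaning of "max 100f"
    }

    // Unity invokes this on the audio thread, so it must never block.
    // 'data' is interleaved; this sketch assumes the queued chunks already
    // match the output channel layout.
    void OnAudioFilterRead(float[] data, int channels)
    {
        Feed.Fill(data);
    }
}
#endif
```

On underrun the sketch writes silence rather than waiting for more data, which keeps the audio callback non-blocking at the cost of an audible gap.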
Initialization and Handlers
1. VoiceInitializer
The entry point for the module when the application loads.
- Registers UserRepresentationType.VOICE.
- Maps the VoiceRepresentationHandler for both users and cloud content.
- Registers the VoicePipeline component factory.
- Is configured with a priority of 120 in the initialization execution order.
2. VoiceRepresentationHandler
Implements both IUserRepresentationHandler and IContentRepresentationHandler.
- Responsible for activating the target GameObject's audio component.
- Attaches the VoicePipeline script and invokes its Init() function, passing contextual details such as whether the entity is a producer (local microphone) or consumer (remote speaker).
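The registration and activation flow described in this section might look roughly like the sketch below. Every type in it is a hypothetical stand-in written for illustration. The real IUserRepresentationHandler, IContentRepresentationHandler, and VoicePipeline.Init() signatures are not shown in this document, so the single boolean parameter is an assumption.

```csharp
using System;
using System.Collections.Generic;

// All types below are hypothetical stand-ins, not the HoloMIT SDK API.
public enum RepresentationKind { Voice }

// Stand-in for VoicePipeline.
public sealed class PipelineStub
{
    public bool IsProducer { get; private set; }
    public bool Initialized { get; private set; }

    // Assumed shape of Init(): only the producer/consumer role is passed.
    public void Init(bool isProducer)
    {
        IsProducer = isProducer;
        Initialized = true;
    }
}

// Stand-in for the handler interfaces.
public interface IRepresentationHandlerStub
{
    PipelineStub Attach(bool isLocal);
}

// Stand-in for VoiceRepresentationHandler: creates the pipeline and tells it
// whether it captures the local microphone or plays a remote speaker.
public sealed class VoiceHandlerStub : IRepresentationHandlerStub
{
    public PipelineStub Attach(bool isLocal)
    {
        var pipeline = new PipelineStub();
        pipeline.Init(isProducer: isLocal);
        return pipeline;
    }
}

// Stand-in for what an initializer would populate at application load.
public sealed class HandlerRegistryStub
{
    private readonly Dictionary<RepresentationKind, IRepresentationHandlerStub> handlers =
        new Dictionary<RepresentationKind, IRepresentationHandlerStub>();

    public void Register(RepresentationKind kind, IRepresentationHandlerStub handler)
    {
        handlers[kind] = handler;
    }

    public PipelineStub Activate(RepresentationKind kind, bool isLocal)
    {
        IRepresentationHandlerStub handler;
        if (!handlers.TryGetValue(kind, out handler))
            throw new InvalidOperationException("No handler registered for " + kind);
        return handler.Attach(isLocal);
    }
}
```

In this sketch, registering VoiceHandlerStub once and then calling Activate(RepresentationKind.Voice, isLocal: true) yields a producer pipeline, while isLocal: false yields a consumer.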
For detailed information on how audio data is processed, refer to the Voice Pipeline documentation.