Speex
The NSpeexEncoder handles legacy or bandwidth-constrained audio encoding for the HoloMIT SDK.
Implementation Details
The encoder inherits from VoiceEncoder (and ultimately BaseWorker) and operates asynchronously using thread-safe queues. It pulls raw audio frames (FloatMemoryChunk) from an input queue, compresses them, and pushes the resulting ByteMemoryChunk to an output queue.
- Implementation: Wraps the NSpeex library.
- Location: Assets\UserRepresentation\Voice\Scripts\Workers\Codecs\NSpeexEncoder.cs
- Configuration:
  - Band Mode: Wide
  - Quality: 5
- Buffer Size: Dynamically assigned based on the Speex encoder's FrameSize.
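The queue-driven flow described above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not the SDK's C# code: FRAME_SIZE, fake_encode, encoder_worker, and run_pipeline are hypothetical names, and fake_encode is a stand-in that packs floats as 16-bit PCM rather than calling Speex.

```python
import queue
import struct
import threading

FRAME_SIZE = 320  # assumed samples per frame; the real value comes from the encoder's FrameSize
_STOP = object()  # sentinel that tells the worker to shut down


def fake_encode(samples):
    """Stand-in for the codec: packs floats in [-1, 1] as 16-bit PCM (not Speex)."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))


def encoder_worker(in_queue, out_queue):
    """Pull float frames, encode them, and push the resulting bytes downstream."""
    while True:
        frame = in_queue.get()
        if frame is _STOP:
            out_queue.put(_STOP)  # propagate shutdown to the consumer
            break
        if len(frame) != FRAME_SIZE:
            continue  # skip frames that do not match the expected buffer size
        out_queue.put(fake_encode(frame))


def run_pipeline(frames):
    """Feed frames through the worker thread and collect the encoded chunks."""
    in_queue, out_queue = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=encoder_worker, args=(in_queue, out_queue))
    worker.start()
    for frame in frames:
        in_queue.put(frame)
    in_queue.put(_STOP)
    encoded = []
    while True:
        chunk = out_queue.get()
        if chunk is _STOP:
            break
        encoded.append(chunk)
    worker.join()
    return encoded
```

The producer and consumer only ever touch the two queues, so neither side needs its own locking. That is the property the thread-safe queues provide in the worker design.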
Metrics Tracking
The encoder uses an internal Stats class that measures, for each frame, the processing time (via System.Diagnostics.Stopwatch) and the incoming and outgoing byte counts. These metrics are flushed periodically to Prometheus or to local log files, depending on the global configuration (Config.Instance.Stats.DebugMetrics).
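As a rough illustration of this kind of per-frame bookkeeping, the Python sketch below accumulates timings and byte counts and produces a summary once a flush interval has elapsed. It is not the SDK's Stats class: EncoderStats, timed_encode, the interval_s parameter, and the summary keys are all hypothetical, and time.perf_counter stands in for Stopwatch. Where the summary is sent (Prometheus or a log file) is left to the caller.

```python
import time


class EncoderStats:
    """Accumulates per-frame timing and byte counts, then reports averages."""

    def __init__(self, interval_s=10.0, clock=time.monotonic):
        self.interval_s = interval_s  # hypothetical flush period in seconds
        self._clock = clock           # injectable clock, which makes testing easy
        self._last_flush = clock()
        self._reset()

    def _reset(self):
        self.frames = 0
        self.encode_seconds = 0.0
        self.bytes_in = 0
        self.bytes_out = 0

    def record(self, encode_seconds, bytes_in, bytes_out):
        """Add one frame's measurements to the running totals."""
        self.frames += 1
        self.encode_seconds += encode_seconds
        self.bytes_in += bytes_in
        self.bytes_out += bytes_out

    def flush_if_due(self):
        """Return a summary dict once the interval has elapsed, else None."""
        now = self._clock()
        if self.frames == 0 or now - self._last_flush < self.interval_s:
            return None
        summary = {
            "frames": self.frames,
            "avg_encode_ms": 1000.0 * self.encode_seconds / self.frames,
            "avg_bytes_in": self.bytes_in / self.frames,
            "avg_bytes_out": self.bytes_out / self.frames,
        }
        self._last_flush = now
        self._reset()
        return summary


def timed_encode(stats, encode, frame_bytes):
    """Run one encode call and record how long it took and the sizes involved."""
    start = time.perf_counter()
    encoded = encode(frame_bytes)
    stats.record(time.perf_counter() - start, len(frame_bytes), len(encoded))
    return encoded
```

Reporting averages per interval instead of logging every frame keeps the metrics overhead small compared with the encode call itself.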
Decoder (VoiceDecoder)
Instead of having a separate decoder class, the system uses a unified VoiceDecoder script that handles Speex dynamically.
- When instantiated, it prepares a SpeexDecoder in Wideband mode.
- It reads the format from the first packet's metadata (_chunk.Info.Dsi).
- Because the incoming stream is Mono but Unity's spatial audio or multi-channel setups might require more, the decoder duplicates the decoded mono signal across 6 channels into the final FloatMemoryChunk before queueing it for the AudioPreparer.
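The channel duplication in the last step can be sketched as follows. This Python snippet is an illustration only: duplicate_mono is a hypothetical helper, and it assumes an interleaved layout (all channels of sample 0, then all channels of sample 1, and so on), which the text above does not state explicitly.

```python
def duplicate_mono(samples, channels=6):
    """Copy each mono sample into every channel of an interleaved buffer."""
    out = [0.0] * (len(samples) * channels)
    for i, sample in enumerate(samples):
        base = i * channels  # start of this sample's group of channels
        for ch in range(channels):
            out[base + ch] = sample
    return out
```

For a decoded frame of N mono samples, the output holds N * 6 values, so the downstream buffer has to be sized accordingly.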