Opus

The OpusEncoder handles high-quality, low-latency VoIP audio encoding for the HoloMIT SDK.

Implementation Details

The encoder inherits from VoiceEncoder (and ultimately BaseWorker) and operates asynchronously using thread-safe queues. It pulls raw audio frames (FloatMemoryChunk) from an input queue, compresses them, and pushes the resulting ByteMemoryChunk to an output queue.
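The queue-driven flow described above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python rather than the SDK's C# code: `fake_encode`, `encoder_worker`, and `run_demo` are hypothetical names, the stand-in encoder only packs floats as 16-bit PCM (it does not perform Opus compression), and using `None` as a stop signal is an assumption made for the example.

```python
import queue
import struct
import threading

FRAME_SAMPLES = 320  # frame size stated in the encoder configuration


def fake_encode(frame):
    """Stand-in for the Opus encoder: packs floats as 16-bit PCM (no compression)."""
    clipped = [max(-1.0, min(1.0, s)) for s in frame]
    return struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))


def encoder_worker(in_queue, out_queue):
    """Pull float frames, encode them, push byte chunks. None means 'stop'."""
    while True:
        frame = in_queue.get()  # blocks until a frame is available
        if frame is None:
            out_queue.put(None)  # propagate the stop signal downstream
            break
        out_queue.put(fake_encode(frame))


def run_demo(num_frames=3):
    """Feed silent frames through the worker and collect the encoded chunks."""
    in_q, out_q = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=encoder_worker, args=(in_q, out_q))
    worker.start()
    for _ in range(num_frames):
        in_q.put([0.0] * FRAME_SAMPLES)
    in_q.put(None)
    worker.join()

    chunks = []
    while True:
        chunk = out_q.get()
        if chunk is None:
            break
        chunks.append(chunk)
    return chunks
```

Because both queues are thread-safe, the producer (audio capture) and the consumer (network sender) never touch the encoder's state directly; they only exchange chunks through the queues.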

  • Implementation: Wraps the UnityOpus library.
  • Location: Assets\UserRepresentation\Voice\Scripts\Workers\Codecs\OpusEncoder.cs
  • Configuration:
    • Sampling Frequency: 16000 Hz
    • Channels: Mono
    • Application: VoIP
    • Bitrate: 96000 bps
    • Complexity: 10
    • Signal Type: Voice
  • Buffer Size: Fixed at 320 samples per frame (20 ms of audio at 16000 Hz).
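The configuration values above imply a few derived numbers that are useful when sizing buffers. The short calculation below is only a sketch: it assumes the raw input holds 32-bit floats (4 bytes per sample), and it treats the configured bitrate as an average target, so the actual size of each encoded packet will vary.

```python
SAMPLE_RATE_HZ = 16000   # from the configuration
FRAME_SAMPLES = 320      # from the configuration
BITRATE_BPS = 96000      # from the configuration
BYTES_PER_FLOAT = 4      # assumption: input samples are 32-bit floats

# Duration of one frame and number of frames produced per second.
frame_seconds = FRAME_SAMPLES / SAMPLE_RATE_HZ          # 0.02 s = 20 ms
frames_per_second = SAMPLE_RATE_HZ // FRAME_SAMPLES     # 50

# Size of one raw mono frame entering the encoder.
raw_bytes_per_frame = FRAME_SAMPLES * BYTES_PER_FLOAT   # 1280 bytes

# Average encoded payload per frame if the target bitrate is met exactly.
target_bytes_per_frame = BITRATE_BPS * FRAME_SAMPLES // SAMPLE_RATE_HZ // 8  # 240 bytes
```

Under these assumptions, each 20 ms frame shrinks from 1280 raw bytes to roughly 240 encoded bytes on average.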

Metrics Tracking

The encoder uses an internal Stats class that records, for each frame, the processing time (measured with System.Diagnostics.Stopwatch), the number of incoming bytes, and the number of outgoing bytes. These metrics are periodically flushed to Prometheus or to local log files, depending on the global configuration (Config.Instance.Stats.DebugMetrics).
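The per-frame accounting described above might look roughly like the following sketch. It is written in Python for brevity and is not the SDK's Stats class: `EncoderStats`, `timed_encode`, the `flush_every` threshold, and the in-memory `reports` list (standing in for Prometheus or a log file) are all hypothetical.

```python
import time


class EncoderStats:
    """Hypothetical accumulator; the SDK's internal Stats class may differ."""

    def __init__(self, flush_every=50):
        self.flush_every = flush_every  # frames between flushes (assumed policy)
        self.reports = []               # stand-in for Prometheus or a log file
        self.reset()

    def reset(self):
        self.frames = 0
        self.seconds = 0.0
        self.bytes_in = 0
        self.bytes_out = 0

    def record(self, seconds, bytes_in, bytes_out):
        """Add one frame's measurements and flush when the window is full."""
        self.frames += 1
        self.seconds += seconds
        self.bytes_in += bytes_in
        self.bytes_out += bytes_out
        if self.frames >= self.flush_every:
            self.flush()

    def flush(self):
        """Emit aggregated values for the current window, then start a new one."""
        if self.frames == 0:
            return
        self.reports.append({
            "frames": self.frames,
            "avg_encode_seconds": self.seconds / self.frames,
            "bytes_in": self.bytes_in,
            "bytes_out": self.bytes_out,
        })
        self.reset()


def timed_encode(stats, encode, frame_bytes):
    """Time a single encode call and record its input and output sizes."""
    start = time.perf_counter()
    encoded = encode(frame_bytes)
    stats.record(time.perf_counter() - start, len(frame_bytes), len(encoded))
    return encoded
```

Aggregating over a window of frames keeps the cost of reporting off the per-frame path, which matters when a frame arrives every 20 ms.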

Decoder (VoiceDecoder)

Opus has no dedicated decoder class. Instead, a unified VoiceDecoder script handles Opus streams dynamically:

  • When instantiated, it prepares a UnityOpus.Decoder (16000Hz, Mono).
  • It reads the format from the first packet's metadata (_chunk.Info.Dsi).
  • The incoming stream is mono, but Unity's spatial audio or a multi-channel setup might require more channels. The decoder therefore duplicates the decoded mono signal across 6 channels in the final FloatMemoryChunk before queueing it for the AudioPreparer.
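The channel duplication in the last step can be illustrated as follows. This sketch assumes an interleaved output layout (all channels of sample 0, then all channels of sample 1, and so on); whether FloatMemoryChunk uses that layout is not stated here, and `duplicate_mono` is a hypothetical helper name.

```python
def duplicate_mono(mono, channels=6):
    """Copy each mono sample into every channel of an interleaved buffer."""
    out = [0.0] * (len(mono) * channels)
    for i, sample in enumerate(mono):
        base = i * channels  # index of this sample's first channel slot
        for ch in range(channels):
            out[base + ch] = sample
    return out
```

For one decoded 320-sample frame, the result holds 320 × 6 = 1920 floats, so the downstream buffer is six times larger than the mono signal it was derived from.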