Architecture Overview

The HoloMIT SDK is composed of modular systems that work together to enable real-time holoportation, streaming, interaction, and networking. Each system is isolated, extensible, and can be selectively enabled based on the needs of your application.

This SDK follows a modular, stream-oriented architecture that separates media capture, processing, distribution, and rendering into clearly defined components. This architecture enables real-time performance, scalability, and platform flexibility (Local, LAN, or Cloud deployments).


Core System Modules

Module                  Purpose
Core                    Foundation for threading, diagnostics, logging, and plugin extensibility
Volumetric Video        Real-time 3D capture, encoding, distribution, and rendering
2D Video                Webcam capture and video streaming using traditional codecs
Audio                   Microphone capture, loopback preview, and compressed streaming
Cloud                   Session management, media orchestration, networking, and remote rendering
Interactions            XR-based interaction logic: raycasting, grabbing, teleportation, etc.
Advanced Environments   Support for Gaussian Splatting, 360° viewers, and special rendering modes
Metrics                 System performance tracking and telemetry, including Grafana integration

Each of these modules may internally consist of multiple components and submodules (e.g., Distributors inside Cloud, or Codecs inside Volumetric/Audio).


Extending Modules

HoloMIT SDK modules are designed to be extended or replaced. Developers can:

  • Implement or inherit from core interfaces (e.g., IUserRepresentationHandler) to override default behavior.
  • Inject custom logic via plugin entry points.
  • Add additional pipeline stages (e.g., for AI post-processing, filtering, etc.).
  • Create their own objects and register them through registrators.

This architecture encourages clean, testable, and scalable extensions — making the SDK adaptable for both rapid prototyping and production-grade applications.
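As an illustration, overriding default behavior through an interface might look like the sketch below. `IUserRepresentationHandler` is named on this page, but its actual members are not documented here, so the callback names (`OnUserJoined`, `OnUserLeft`) are assumptions, not the SDK's real signatures:

```csharp
using UnityEngine;

// Hypothetical sketch: the real IUserRepresentationHandler members and
// the registration mechanism will differ from what is shown here.
public class MinimalAvatarHandler : MonoBehaviour /*, IUserRepresentationHandler */
{
    // Assumed callback: invoked when a remote user joins the session.
    public void OnUserJoined(string userId)
    {
        Debug.Log($"Spawning custom representation for {userId}");
        // Instantiate a custom avatar prefab instead of the default one.
    }

    // Assumed callback: invoked when a remote user leaves.
    public void OnUserLeft(string userId)
    {
        Debug.Log($"Removing representation for {userId}");
    }
}
```

Once implemented, such a handler would be registered through the SDK's registrators so the default representation logic is bypassed.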


Threading by Default

Most system modules offload heavy operations to background threads:

  • Capture
  • Compression and decompression
  • Network I/O and stream buffering
  • Some rendering pre-processing

This ensures minimal impact on Unity’s main thread and improves runtime stability in XR environments.
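The offloading pattern above can be sketched with a generic producer/consumer worker. This is plain .NET code, not the SDK's own worker API; it only illustrates how capture or compression work stays off Unity's main thread:

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Generic background worker: frames are queued from the capture side
// and processed (e.g., compressed) without blocking the main thread.
public sealed class FrameWorker
{
    private readonly BlockingCollection<byte[]> _queue =
        new BlockingCollection<byte[]>(boundedCapacity: 8);

    public FrameWorker()
    {
        var thread = new Thread(ProcessLoop) { IsBackground = true };
        thread.Start();
    }

    // Called from the producer; drops the frame if the queue is full,
    // so the capture thread never stalls waiting on a slow consumer.
    public bool TryEnqueue(byte[] frame) => _queue.TryAdd(frame);

    private void ProcessLoop()
    {
        foreach (var frame in _queue.GetConsumingEnumerable())
        {
            // Heavy work (compression, network I/O) happens here.
        }
    }

    public void Stop() => _queue.CompleteAdding();
}
```

The bounded queue is the key design choice: real-time pipelines prefer dropping a frame over accumulating latency.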


High-Level Layers

The architecture can be viewed as a stack of logical layers:

1. Capture Layer

  • Depth cameras, webcams, and microphones act as capture sources.
  • Volumetric capture includes both raw data acquisition and 3D reconstruction before proceeding to compression.

2. Compression Layer

  • Specialized codecs encode and decode the captured data for efficient transport:
    • HoloCodec for volumetric video
    • H264/H265 for 2D video
    • Opus/Speex for audio
  • This step often includes adaptive bitrate management and LOD reduction for scalability.
  • Clients receive and decode media streams for real-time playback.
  • The encoding/decoding process is optimized for parallel execution and real-time frame delivery.
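One way to picture this layer is a symmetric encode/decode abstraction, behind which HoloCodec, H264/H265, and Opus/Speex implementations would sit. The interface below is illustrative only, not the SDK's actual codec type:

```csharp
// Illustrative codec abstraction; the SDK's real codec interfaces may differ.
public interface IMediaCodec
{
    // Compresses a captured frame for transport
    // (e.g., HoloCodec for volumetric data, Opus for audio).
    byte[] Encode(byte[] rawFrame);

    // Symmetric operation on the receiving side, producing a
    // renderable frame for real-time playback.
    byte[] Decode(byte[] payload);
}
```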

3. Transport Layer (Sending/Receiving)

  • Encoded data is transmitted over the network using MediaWriters.
  • Encoded data is received from the network through MediaReaders.
  • Depending on configuration, transport may occur:
    • On a single machine (Local)
    • Via a server on the local network (LAN)
    • Via a cloud backend (Cloud)

4. Rendering Layer

  • Media is rendered in Unity using dedicated renderers:
    • Volumetric renderer with spatial alignment
    • 2D video planes or texture streaming
    • Audio sources with spatialization
  • Anchors and session state determine media positioning in XR space.

Media Pipeline Flow

[Capture (with Reconstruction)] 
→ [Compression]
→ [Sending/Receiving]
→ [Decoding]
→ [Rendering]

This pipeline applies independently to each media type: volumetric video, 2D video, and audio each run their own instance of these stages.
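The sender side of the flow above can be sketched as a chain of stages. In the real SDK these stages are threaded components, not plain function calls, and none of the names below are actual SDK types:

```csharp
using System;

// Illustrative chain of the sender-side pipeline stages:
// capture (with reconstruction) -> compression -> sending.
public static class PipelineSketch
{
    public static void RunSenderSide(
        byte[] rawFrame,
        Func<byte[], byte[]> reconstruct, // Capture layer: 3D reconstruction
        Func<byte[], byte[]> compress,    // Compression layer: codec encode
        Action<byte[]> send)              // Transport layer: MediaWriter side
    {
        var frame = reconstruct(rawFrame);
        var packet = compress(frame);
        send(packet);
    }
}
```

The receiving side mirrors this chain: a MediaReader produces packets, a codec decodes them, and a renderer consumes the result.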


Session-Centric Runtime

At runtime, everything is coordinated by a Session entity, which acts as the central context for:

  • Media pipelines
  • Connected users and clients
  • Networked interactions
  • Lifecycle events

Multiple sessions can coexist and be dynamically created or destroyed.
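In code, coordinating against a session might look roughly like the following. The `Session` factory method and event names shown here are hypothetical, inferred from the description above rather than taken from the SDK:

```csharp
using UnityEngine;

// Hypothetical session usage; all member names are illustrative only.
public class SessionBootstrap
{
    public void Start()
    {
        var session = Session.Create("demo-room"); // assumed factory method

        // Lifecycle events scoped to this session (assumed event names).
        session.UserJoined += id => Debug.Log($"{id} joined");
        session.UserLeft   += id => Debug.Log($"{id} left");

        // Media pipelines and networked interactions would be
        // created within, and torn down with, this session.
    }
}
```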


Pipeline Extensibility

HoloMIT SDK is designed for extension at multiple levels:

  • Workers: Add custom threaded logic to any processing stage.
  • Codecs: Integrate new compression algorithms.
  • Writers/Readers: Swap out the default transport layer for WebRTC or proprietary systems.
  • Renderers: Customize your own rendering methodology.
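For example, a custom transport could be plugged in on the writer side. The `IMediaWriter` shape below is an assumption based on the MediaWriters/MediaReaders naming used earlier, not the SDK's actual interface:

```csharp
// Assumed writer interface, derived from the "MediaWriters" naming above;
// the real SDK interface will differ.
public interface IMediaWriter
{
    void Write(byte[] encodedPacket);
}

// Sketch of a WebRTC-backed writer replacing the default transport.
public sealed class WebRtcMediaWriter : IMediaWriter
{
    public void Write(byte[] encodedPacket)
    {
        // Forward the encoded packet over a WebRTC data channel here.
    }
}
```

A matching reader implementation on the receiving side would complete the swap.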

Deployment Modes

The architecture supports several operational configurations:

Mode    Description
Local   Everything runs on a single machine (ideal for testing or single-user experiences)
LAN     Multi-user experience using a locally deployed server over the local network
Cloud   Full multi-user streaming with cloud orchestration and scaling

Deployment mode is configured at runtime with no need to change project structure or scenes.
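Runtime selection of the mode might look like the snippet below; the enum and settings object are illustrative, not the SDK's actual configuration API:

```csharp
// Illustrative runtime configuration; type and field names are hypothetical.
public enum DeploymentMode { Local, Lan, Cloud }

public class ConnectionSettings
{
    // Switching this value changes where media is transported,
    // without touching scenes or project structure.
    public DeploymentMode Mode = DeploymentMode.Local;

    // Ignored in Local mode; the LAN server or cloud endpoint otherwise.
    public string ServerAddress = "127.0.0.1";
}
```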