Core Concepts & Capabilities
The HoloMIT SDK is built around a set of modular systems designed to simplify the creation of volumetric, networked, and interactive XR experiences. This section outlines the foundational concepts that make up the SDK and how they interconnect.
Modular System Architecture
Each functional area of HoloMIT is encapsulated in independent modules that can be used together or in isolation:
- Core – Multithreaded worker architecture, thread-safe queue system, logging, plugin points, and the extensibility backbone for the entire SDK.
- Volumetric Video – Real-time capture, compression, and rendering of 3D video streams using depth-sensing cameras.
- 2D Video – Webcam-based video capture, compression, and streaming pipelines using standard codecs (H.264, H.265).
- Audio – Spatial audio capture and streaming using the Opus and Speex codecs.
- Cloud – Cloud orchestration and backend infrastructure, organized into the following submodules:
  - Session Management – User authentication, multi-session orchestration, and the media/event/session managers.
  - Networking – Session-based player instantiation, object transform synchronization, and networked triggers (normal, float, int, byte).
  - Distributors – Systems that handle routing and delivery of encoded media to clients.
  - Streaming – Support for real-time session streaming and server-side remote rendering.
  - Scalability & Optimization – Systems designed for adaptive performance under real-world conditions such as network congestion.
- Interactions – XR-focused interaction system for teleportation, raycasting, grabbing, and hand-based input.
- Advanced Environments – Support for novel spatial representations such as Gaussian Splatting and immersive 360° video viewers.
- Metrics – Real-time performance monitoring, session diagnostics, and Grafana dashboard integration.
Each module can be toggled or stripped based on target platform and feature needs.
Stream-Centric Pipeline
At the heart of HoloMIT is the Session: a temporal context that connects users and media sources (volumetric, 2D, and audio).
Media flows through highly optimized pipelines:
- Captured data → compressed → routed to targets → decompressed → rendered.

This architecture enables live, low-latency streaming across local and cloud setups.
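The staged flow above can be sketched as a chain of processing functions. Everything below is illustrative, not the HoloMIT API: the `Frame` type and stage names are invented for the example, and `zlib` stands in for the real H.264/H.265 or volumetric codecs.

```python
import zlib
from dataclasses import dataclass

# Hypothetical frame container; the SDK's actual types will differ.
@dataclass
class Frame:
    payload: bytes
    compressed: bool = False

def capture() -> Frame:
    # Stand-in for a depth-camera or webcam capture call.
    return Frame(payload=b"raw-frame-data")

def compress(frame: Frame) -> Frame:
    # A real pipeline would use a hardware video or volumetric encoder.
    return Frame(payload=zlib.compress(frame.payload), compressed=True)

def route(frame: Frame) -> Frame:
    # A Distributor would forward the encoded frame to remote clients here.
    return frame

def decompress(frame: Frame) -> Frame:
    return Frame(payload=zlib.decompress(frame.payload), compressed=False)

def render(frame: Frame) -> str:
    # Stand-in for handing the decoded frame to the renderer.
    return f"rendered {len(frame.payload)} bytes"

# Captured data → compressed → routed → decompressed → rendered.
print(render(decompress(route(compress(capture())))))  # rendered 14 bytes
```

In the real SDK these stages run concurrently on worker threads rather than as a single synchronous call chain.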
Multithreaded & Extensible
Performance-critical components run in parallel using multithreaded workers with custom lock-free queues. Developers can:
- Extend internal workers to customize behavior.
- Integrate native plugins (e.g., CUDA, OpenCV) for acceleration.
- Use plugin hooks to inject custom behavior into the pipelines.
- Use the multithreaded system to build new features or optimized operations.
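As a rough illustration of the worker pattern described above (using Python's thread-safe `queue.Queue` in place of the SDK's custom lock-free queues, and a doubling operation in place of real encode/decode work):

```python
import queue
import threading

def worker(in_q: queue.Queue, out_q: queue.Queue) -> None:
    # Pull tasks off the input queue, process them off the main thread,
    # and push results to the output queue until a None sentinel arrives.
    while True:
        item = in_q.get()
        if item is None:
            break
        out_q.put(item * 2)  # stand-in for encoding/decoding work

in_q: queue.Queue = queue.Queue()
out_q: queue.Queue = queue.Queue()
t = threading.Thread(target=worker, args=(in_q, out_q))
t.start()

for i in range(3):
    in_q.put(i)
in_q.put(None)  # signal shutdown
t.join()

results = [out_q.get() for _ in range(3)]
print(results)  # [0, 2, 4]
```

Keeping heavy work on worker threads like this is what lets the main Unity thread stay responsive while frames are encoded and decoded in parallel.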
Developer-Centric Design
HoloMIT is built for developers first:
- Easy-to-use prefabs and templates.
- C# APIs for full control when needed.
- Plugin entry points for customization.
- Minimal setup time — results in minutes.
Whether you're building a production XR application or a research prototype, HoloMIT adapts to your use case.
Conceptual Diagram
(To be inserted later: a diagram showing Sessions, Media Pipelines, Networking, and Anchors as interconnected modules)
Key Terms
- Capture Node – Any physical volumetric capture space served by depth cameras. It includes the device driver, data acquisition pipeline, and calibration logic.
- Pipeline – A configurable processing flow that carries data from capture to rendering. Pipelines typically consist of capture → compression → sending/receiving → decoding → rendering stages.
- Orchestrator – The high-level system (often cloud-side) that manages session coordination, media routing, and synchronization across all clients connected to a session.
- Session – The central runtime context linking all components. It manages user identities, scene context, media streams, and networking configuration.
- LOD (Level of Detail) – A technique that adapts media fidelity dynamically based on bandwidth, device capabilities, or distance to the viewer, improving performance and scalability.
- Distributor – A component responsible for routing encoded data (volumetric or 2D video) to remote clients in real time.
- SyncObject – An object whose transform or parameters (e.g., position, rotation, state) are synchronized across networked clients inside a session.
- Worker – A multithreaded process that handles tasks such as encoding, decoding, or analytics outside the main Unity thread, enabling high-performance parallelism.
- Trigger – A synchronized event that propagates a discrete change (e.g., play/pause, parameter change) across networked clients, supporting multiple data types (float, int, byte, etc.).
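By way of illustration, an LOD policy like the one defined above might select a quality tier from measured bandwidth. The function and thresholds below are invented for the example; a real policy would also weigh device capability and distance to the viewer:

```python
def select_lod(bandwidth_mbps: float) -> str:
    # Hypothetical tier selection; thresholds are illustrative only.
    if bandwidth_mbps >= 50:
        return "high"
    if bandwidth_mbps >= 10:
        return "medium"
    return "low"

print(select_lod(80.0))  # high
print(select_lod(25.0))  # medium
print(select_lod(4.0))   # low
```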