RISE: Single Static Radar-based Indoor Scene Understanding

Abstract

Indoor scene understanding is fundamental to numerous applications, yet existing approaches rely on optical sensors that face challenges with occlusions and raise privacy concerns. Millimeter-wave (mmWave) radar offers a compelling privacy-preserving alternative—it can penetrate obstacles and operate regardless of lighting conditions. However, the inherently low spatial resolution of radar and complex multipath propagation in indoor environments make accurate scene understanding challenging.

We introduce RISE, a system that jointly performs indoor layout reconstruction and object detection from a single static mmWave radar. Our key insight is to leverage multipath reflections as geometric information rather than suppress them as noise. RISE employs Bi-Angular Multipath Enhancement (BAME), which explicitly models both the Angle-of-Arrival (AoA) and Angle-of-Departure (AoD) to recover secondary reflections and reveal hidden structures, paired with a Sim2Real Hierarchical Diffusion (SRHD) framework that transforms fragmented observations into complete scene representations.

We introduce the first large-scale radar dataset for this task (50,000 frames across 100 real indoor trajectories). RISE achieves a 60% reduction in Chamfer Distance over prior work, reaching 16 cm accuracy, and delivers the first mmWave-based object detection at 58% IoU.

60%: reduction in Chamfer Distance vs. prior state-of-the-art
16 cm: layout accuracy for room layout reconstruction
58%: object detection IoU (first mmWave-based detection)
50K: dataset frames across 100 real indoor trajectories

The Multipath Problem — Turned Advantage

Conventional radar systems suppress multipath reflections (signals bouncing off walls) as noise. RISE treats them as a rich geometric signal.

**Fig. 2. Multipath-induced Ghosts.** (a) Scenario where ghost points appear due to wall reflections. (b) The corresponding XY heatmap showing spurious detections. RISE inverts this effect to recover actual scene geometry.

Method

RISE consists of two tightly integrated components that transform raw mmWave radar signals into complete indoor scene representations.

**Fig. 3. RISE Pipeline.** Raw mmWave radar signals are processed through Bi-Angular Multipath Enhancement (BAME) for geometric recovery, followed by the Sim2Real Hierarchical Diffusion (SRHD) model for complete scene reconstruction.

Component 1

Bi-Angular Multipath Enhancement (BAME)

Standard radar only measures the Angle-of-Arrival (AoA). BAME additionally models the Angle-of-Departure (AoD)—the direction a signal left the radar before reflecting. By jointly reasoning over both angles, RISE pinpoints the exact wall or surface a signal bounced off, effectively using reflections as virtual mirrors to see occluded regions.

Models AoA and AoD jointly per radar return
Recovers structures invisible to direct-path sensing
Converts multipath from noise into geometric signal
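
The bi-angular idea can be illustrated with a small sketch. For a direct-path return, the signal leaves and arrives along the same direction, so AoD and AoA should agree; when one leg of the path bounces off a wall, they differ. The function names, the 5-degree tolerance, and the dictionary layout of a radar return below are assumptions made for illustration and are not taken from the RISE implementation.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = (a_deg - b_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def is_multipath(aoa_deg, aod_deg, tol_deg=5.0):
    """Flag a return as multipath when AoA and AoD disagree.

    tol_deg is a hypothetical tolerance, not a value from the paper.
    """
    return angular_difference(aoa_deg, aod_deg) > tol_deg


# Hypothetical radar returns: range in metres, angles in degrees.
returns = [
    {"range_m": 2.0, "aoa_deg": 10.0, "aod_deg": 10.5},   # angles agree
    {"range_m": 4.5, "aoa_deg": -30.0, "aod_deg": 25.0},  # angles disagree
]

labels = [
    "multipath" if is_multipath(r["aoa_deg"], r["aod_deg"]) else "direct"
    for r in returns
]
print(labels)
```

An AoA-only pipeline has no second angle to compare against, which is why such returns would otherwise be indistinguishable from real targets.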

BAME recovers ghost targets that cannot be found with conventional AoA-only processing.

Component 2

Sim2Real Hierarchical Diffusion (SRHD)

Even after multipath recovery, radar observations remain sparse and incomplete. SRHD is a diffusion-based generative model that progressively denoises radar heatmaps into dense layout reconstructions and object detections. A physics-based simulator generates abundant training data; domain adaptation bridges the gap to the real world.

Two-stage hierarchical diffusion: layout then objects
Sim-to-real transfer reduces annotation burden
Joint output: room floorplan + object bounding boxes
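
The two-stage flow can be sketched as two conditional sampling passes, where the second pass is conditioned on the output of the first. The sketch below assumes a standard DDPM-style ancestral sampler with a linear noise schedule; the placeholder denoiser, vector sizes, and step count are hypothetical stand-ins for the trained networks and are not details from the paper.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule with cumulative alpha products (num_steps >= 2)."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def ddpm_sample(predict_noise, condition, size, num_steps, rng):
    """Ancestral sampling: start from Gaussian noise and denoise step by step."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t, condition)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the last step
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


def placeholder_denoiser(x, t, condition):
    """Stand-in for a trained noise-prediction network (always predicts zero)."""
    return [0.0] * len(x)


rng = random.Random(0)
heatmap = [0.0] * 32  # hypothetical flattened radar heatmap

# Stage 1: sample a layout representation conditioned on the radar heatmap.
layout = ddpm_sample(placeholder_denoiser, heatmap, size=16, num_steps=50, rng=rng)

# Stage 2: sample object parameters conditioned on the heatmap and the layout.
objects = ddpm_sample(placeholder_denoiser, heatmap + layout, size=8, num_steps=50, rng=rng)
```

Because the denoiser here is untrained, the outputs are meaningless numbers; the point is only the ordering, in which layout is produced first and then passed as conditioning for the object stage.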

SRHD's two-stage architecture: first predicting layout structure, then refining object-level detections.

Multipath Inversion

Given an apparent ghost target location, RISE computes the geometric relationship between the ghost and the reflecting wall to recover the true object position. This inversion is closed-form and uses the bi-angular measurements from BAME.
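
As a rough illustration of this kind of geometry, consider the textbook case where the signal bounces off the same planar wall on both the outgoing and returning legs. The ghost then appears at the mirror image of the true object behind the wall, so reflecting the ghost across the wall recovers the true position. The sketch below assumes a 2D scene and a wall described by a point on it and its normal; the function name and inputs are hypothetical, and the paper's actual closed form may differ.

```python
import math


def reflect_across_wall(point, wall_point, wall_normal):
    """Mirror a 2D point across a wall given by a point on it and its normal."""
    nx, ny = wall_normal
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm
    # Signed distance from the wall along the unit normal.
    d = (point[0] - wall_point[0]) * nx + (point[1] - wall_point[1]) * ny
    return (point[0] - 2.0 * d * nx, point[1] - 2.0 * d * ny)


# Radar at the origin, wall along the line y = 4 (hypothetical scene).
ghost = (3.0, 5.0)  # apparent target, one metre behind the wall
true_position = reflect_across_wall(ghost, wall_point=(0.0, 4.0), wall_normal=(0.0, 1.0))
print(true_position)
```

Reflection is its own inverse, so the same function maps a true position to its ghost, which is convenient for checking the geometry.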

Results

Evaluated on 100 real indoor trajectories, RISE reduces Chamfer Distance by 60% relative to prior work, reaching 16 cm layout accuracy, and achieves 58% IoU on object detection.
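
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. Chamfer Distance has several variants (squared or unsquared, summed or averaged); this version averages unsquared nearest-neighbour distances over both directions, which may not match the exact definition used in the paper. The IoU function assumes axis-aligned 2D boxes given as (x_min, y_min, x_max, y_max), which is also an assumption, and the sample points and boxes are made up.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance: mean nearest-neighbour distance, both ways."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (one_way(points_a, points_b) + one_way(points_b, points_a))


def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical wall samples (metres): the prediction is offset by 10 cm.
predicted_wall = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
true_wall = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]
wall_error = chamfer_distance(predicted_wall, true_wall)

# Hypothetical object boxes overlapping in a 1 m x 1 m region.
overlap = box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

Lower Chamfer Distance means the predicted layout lies closer to the ground truth, while higher IoU means the predicted boxes overlap the annotated objects more fully.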

**Fig. 7. Wall Reconstruction Across 100 Trajectories.** Comparison of RISE and the baseline (EMT) over 100 real-world trajectories, showing consistent improvements in layout accuracy.

**Fig. 8. Qualitative Comparison.** The first column shows RGB reference images. RISE (right) produces significantly more accurate layout reconstructions and object detections than the EMT baseline (middle).

**Fig. 9. Varying Trajectory Lengths.** RISE maintains strong performance across different observation window lengths, demonstrating robustness to deployment constraints.

Dataset

The first large-scale benchmark for radar-based indoor scene understanding.

50,000 radar frames
100 indoor trajectories
1st dataset of its kind

**Fig. 11. Simulator & Data Augmentation.** Top row: ground truth from the physics-based simulator. Bottom row: augmented training data used to bridge the sim-to-real gap in SRHD.

The dataset covers diverse indoor environments (offices, living rooms, corridors) with paired ground truth for room layout and object locations. Each trajectory provides a sequence of radar snapshots from a single static mmWave sensor, annotated with floorplan geometry and object bounding boxes.

See the paper for dataset access.

Why mmWave Radar?

Radar offers unique advantages over optical sensors for deployments that are privacy-sensitive or that operate in challenging conditions such as darkness and occlusion.

Capability	RGB Camera	LiDAR	mmWave Radar (RISE)
Low-light operation	✗	✓	✓
Privacy preserving	✗	✗	✓
Penetrates obstacles	✗	✗	✓
Low hardware cost	✓	✗	✓
Layout reconstruction	Limited	Limited	✓ (RISE)
Object detection	✓	✓	✓ (RISE — first)

BibTeX

@article{zhou2025rise,
  title     = {RISE: Single Static Radar-based Indoor Scene Understanding},
  author    = {Zhou, Kaichen and Dodds, Laura and Afzal, Sayed Saad and Adib, Fadel},
  journal   = {arXiv preprint arXiv:2511.14019},
  year      = {2025}
}