Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation

Publication, Conference
Xiu, Y; Gorlatova, M
Published in: Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025
January 1, 2025

Obstruction attacks in Augmented Reality (AR) pose significant challenges by obscuring critical real-world objects. This work demonstrates the first implementation of obstruction detection on a video see-through head-mounted display (HMD), the Meta Quest 3. Leveraging a vision language model (VLM) and a multi-modal object detection model, our system detects obstructions by analyzing both raw and augmented images. Because access to the raw camera feed is limited, the system captures a sequence of images through Oculus casting and identifies the raw image among them. Our implementation shows that effective obstruction detection in AR environments is feasible and highlights future opportunities for improving real-time detection through enhanced camera access.

Published In

Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025

DOI

10.1109/VRW66409.2025.00464

Publication Date

January 1, 2025

Start / End Page

1638 / 1639

Citation

APA: Xiu, Y., & Gorlatova, M. (2025). Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation. In Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025 (pp. 1638–1639). https://doi.org/10.1109/VRW66409.2025.00464

Chicago: Xiu, Y., and M. Gorlatova. “Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation.” In Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025, 1638–39, 2025. https://doi.org/10.1109/VRW66409.2025.00464.

ICMJE: Xiu Y, Gorlatova M. Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation. In: Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025. 2025. p. 1638–9.

MLA: Xiu, Y., and M. Gorlatova. “Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation.” Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025, 2025, pp. 1638–39. Scopus, doi:10.1109/VRW66409.2025.00464.

NLM: Xiu Y, Gorlatova M. Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation. Proceedings 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, VRW 2025. 2025. p. 1638–1639.
