Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation
Obstruction attacks in Augmented Reality (AR) pose significant challenges by obscuring critical real-world objects. This work demonstrates the first implementation of obstruction detection on a video see-through head-mounted display (HMD), the Meta Quest 3. Leveraging a vision language model (VLM) and a multi-modal object detection model, our system detects obstructions by analyzing both raw and augmented images. Because access to the raw camera feed is restricted on the device, the system instead captures a sequence of frames through Oculus casting and identifies the raw (un-augmented) image among them. Our implementation showcases the feasibility of effective obstruction detection in AR environments and highlights future opportunities for improving real-time detection through enhanced camera access.
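The casting-based capture step described above can be pictured with a short sketch. The following Python snippet is a minimal illustration, assuming the Oculus casting feed is readable through OpenCV as a video stream; the stream URL and the median-difference heuristic used to pick out the raw frame are hypothetical stand-ins for illustration, not the paper's exact pipeline.

```python
import cv2
import numpy as np

# Hypothetical address of the Oculus casting feed; in practice this would
# be whatever endpoint the casting session exposes (assumption, not from
# the paper).
CAST_STREAM_URL = "http://192.168.0.10:8080/cast"


def capture_frames(url: str, n_frames: int = 30) -> list[np.ndarray]:
    """Grab a short burst of frames from the casting stream."""
    cap = cv2.VideoCapture(url)
    frames: list[np.ndarray] = []
    while len(frames) < n_frames:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames


def pick_raw_frame(frames: list[np.ndarray]) -> np.ndarray:
    """Illustrative heuristic: treat the frame that differs least from the
    burst's per-pixel median as the 'raw' (least-augmented) passthrough
    frame. This selection criterion is an assumption for the sketch, not
    the method reported in the paper."""
    stack = np.stack(frames).astype(np.float32)
    median = np.median(stack, axis=0)
    diffs = [float(np.abs(f.astype(np.float32) - median).mean()) for f in frames]
    return frames[int(np.argmin(diffs))]


if __name__ == "__main__":
    burst = capture_frames(CAST_STREAM_URL)
    if burst:
        raw = pick_raw_frame(burst)
        # The recovered raw frame would then be passed, together with an
        # augmented frame, to the VLM / object-detection stage.
        cv2.imwrite("raw_frame.png", raw)
```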