Scholars@Duke publication: LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality

Publication, Conference

Xiu, Y; Scargill, T; Gorlatova, M

Published in: Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024

January 1, 2024

In Augmented Reality (AR), improperly placed virtual content can obstruct real-world elements, causing confusion and degrading the user experience. To address this, we present LOBSTAR (Language model-based OBSTruction detection for Augmented Reality), the first system to leverage a vision-language model (VLM) to detect key objects and prevent obstructions in AR. We evaluated LOBSTAR on both real-world and virtual-scene images and developed a mobile app for AR content obstruction detection. Our results demonstrate that, given well-designed VLM prompts, LOBSTAR effectively understands scenes and detects obstructive content, achieving up to 96% accuracy and a detection latency of 580 ms on a mobile app.

Duke Scholars

Author: Maria Gorlatova, Electrical and Computer Engineering

Published In

Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024

DOI

10.1109/ISMAR-Adjunct64951.2024.00078

Publication Date

January 1, 2024

Start / End Page

335 / 336

Citation

APA: Xiu, Y., Scargill, T., & Gorlatova, M. (2024). LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality. In Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024 (pp. 335–336). https://doi.org/10.1109/ISMAR-Adjunct64951.2024.00078

Chicago: Xiu, Y., T. Scargill, and M. Gorlatova. “LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality.” In Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024, 335–36, 2024. https://doi.org/10.1109/ISMAR-Adjunct64951.2024.00078.

ICMJE: Xiu Y, Scargill T, Gorlatova M. LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality. In: Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024. 2024. p. 335–6.

MLA: Xiu, Y., et al. “LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality.” Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024, 2024, pp. 335–36. Scopus, doi:10.1109/ISMAR-Adjunct64951.2024.00078.

NLM: Xiu Y, Scargill T, Gorlatova M. LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality. Proceedings - 2024 IEEE International Symposium on Mixed and Augmented Reality Adjunct, ISMAR-Adjunct 2024. 2024. p. 335–336.
