Point2pix-Zero: Point-driven refined diffusion for multi-object image editing

Publication, Journal Article
Wang, S; Yan, Y; Yang, X; Zhang, R; Wang, Q; Cheng, G; Huang, K
Published in: Pattern Recognition
February 1, 2026

Semantic image editing methods that employ large-scale diffusion models have made significant strides in precise, controlled image editing guided by text prompts. However, these models struggle with complex images that contain multiple objects or objects that are hard to describe in text. In this work, we introduce a novel inference-time multi-object image editing strategy, Point2pix-Zero, which edits one or more objects using only clicked points and the text of the target objects as guidance. We employ an interactive methodology, point-discovery, as text-free guidance to identify the semantic information of the objects intended for editing and to generate text prompts automatically. Instead of using the internal cross-attention maps of the diffusion model as a guide, we inject external attention maps to rectify visual-semantic pairing mismatches in the cross-attention maps during the denoising process. Extensive empirical evaluations demonstrate that the proposed inference-time method achieves precise editing while maintaining image fidelity. Our method shows superior performance in both single- and multi-object image editing, establishing a new state of the art.

Published In

Pattern Recognition

DOI

10.1016/j.patcog.2025.112041

ISSN

0031-3203

Publication Date

February 1, 2026

Volume

170

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4611 Machine learning
  • 4605 Data management and data science
  • 4603 Computer vision and multimedia computation
  • 0906 Electrical and Electronic Engineering
  • 0806 Information Systems
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA

Wang, S., Yan, Y., Yang, X., Zhang, R., Wang, Q., Cheng, G., & Huang, K. (2026). Point2pix-Zero: Point-driven refined diffusion for multi-object image editing (Accepted). Pattern Recognition, 170. https://doi.org/10.1016/j.patcog.2025.112041

Chicago

Wang, S., Y. Yan, X. Yang, R. Zhang, Q. Wang, G. Cheng, and K. Huang. “Point2pix-Zero: Point-driven refined diffusion for multi-object image editing (Accepted).” Pattern Recognition 170 (February 1, 2026). https://doi.org/10.1016/j.patcog.2025.112041.

ICMJE

Wang S, Yan Y, Yang X, Zhang R, Wang Q, Cheng G, et al. Point2pix-Zero: Point-driven refined diffusion for multi-object image editing (Accepted). Pattern Recognition. 2026 Feb 1;170.

MLA

Wang, S., et al. “Point2pix-Zero: Point-driven refined diffusion for multi-object image editing (Accepted).” Pattern Recognition, vol. 170, Feb. 2026. Scopus, doi:10.1016/j.patcog.2025.112041.

NLM

Wang S, Yan Y, Yang X, Zhang R, Wang Q, Cheng G, Huang K. Point2pix-Zero: Point-driven refined diffusion for multi-object image editing (Accepted). Pattern Recognition. 2026 Feb 1;170.