
Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents

Publication, Conference
Zhang, W; Wang, Q; Huang, K; Huang, X; Guo, F; Gu, X
Published in: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
October 28, 2024

Photographed documents are prevalent but often suffer from deformations such as curves or folds, which hinder readability. Consequently, document dewarping has been widely studied; however, its performance remains unsatisfactory owing to the lack of real training samples with pixel-level annotation. To obtain pixel-level labels, we leverage a document registration pipeline that automatically aligns warped-flat document pairs. Unlike general image registration, document registration poses unique challenges because of severe deformations and fine-grained textures. In this paper, we introduce a coarse-to-fine framework consisting of a coarse registration network (CRN), which aims to eliminate severe deformations, followed by a fine registration network (FRN), which focuses on fine-grained features. In addition, we use self-supervised learning to initialize our document registration model, proposing a cross-reconstruction pre-training task on pairs of warped-flat documents. Extensive experiments show that we achieve satisfactory document registration performance and consequently obtain a high-quality registered document dataset with pixel-level annotation. Without bells and whistles, we re-train two popular document dewarping models on our registered document dataset, WarpDoc-R, and obtain performance superior to that of models trained on almost 100× as much synthetic data, verifying the label quality of our document registration method.
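To make the coarse-to-fine idea concrete, here is a minimal NumPy sketch of how two dense backward maps could be chained into a single pixel-level mapping. It is an illustration under stated assumptions, not the CRN/FRN implementation from the paper: the maps are assumed to be already available as (H, W, 2) arrays of (row, col) sampling coordinates, and the function names `identity_grid`, `sample_bilinear`, `compose_maps`, and `warp` are hypothetical.

```python
import numpy as np


def identity_grid(h, w):
    """Return an (h, w, 2) array holding each pixel's own (row, col) coordinate."""
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([rows, cols], axis=-1).astype(np.float64)


def sample_bilinear(field, coords):
    """Bilinearly sample `field` (H, W, C) at float (row, col) `coords` (h, w, 2).

    Coordinates outside the field are clamped to the border.
    """
    h, w = field.shape[:2]
    r = np.clip(coords[..., 0], 0, h - 1)
    c = np.clip(coords[..., 1], 0, w - 1)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (r - r0)[..., None]  # fractional row offset, broadcast over channels
    fc = (c - c0)[..., None]  # fractional column offset
    top = field[r0, c0] * (1 - fc) + field[r0, c1] * fc
    bottom = field[r1, c0] * (1 - fc) + field[r1, c1] * fc
    return top * (1 - fr) + bottom * fr


def compose_maps(coarse_map, fine_map):
    """Chain two backward maps.

    `fine_map` sends a final pixel to a location in the coarsely aligned image;
    `coarse_map` sends that location to the original warped photograph.
    (Assumed convention for this sketch only.)
    """
    return sample_bilinear(coarse_map, fine_map)


def warp(image, backward_map):
    """Resample `image` (H, W, C) so that output pixel p takes image[backward_map[p]]."""
    return sample_bilinear(image, backward_map)


# Toy usage: a coarse shift of 2 columns followed by a fine shift of 1 row.
h, w = 6, 8
grid = identity_grid(h, w)
coarse = grid + np.array([0.0, 2.0])
fine = grid + np.array([1.0, 0.0])
total = compose_maps(coarse, fine)
image = np.arange(h * w, dtype=np.float64).reshape(h, w, 1)
registered = warp(image, total)
```

Under this convention, the composed map serves as the per-pixel label linking the flat document to the warped photograph; storing one composed map avoids resampling the image twice.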

Published In

MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

DOI

10.1145/3664647.3681548

Publication Date

October 28, 2024

Start / End Page

9933 / 9942
 

Citation

APA: Zhang, W., Wang, Q., Huang, K., Huang, X., Guo, F., & Gu, X. (2024). Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents. In MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia (pp. 9933–9942). https://doi.org/10.1145/3664647.3681548

Chicago: Zhang, W., Q. Wang, K. Huang, X. Huang, F. Guo, and X. Gu. “Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents.” In MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, 9933–42, 2024. https://doi.org/10.1145/3664647.3681548.

ICMJE: Zhang W, Wang Q, Huang K, Huang X, Guo F, Gu X. Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents. In: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia. 2024. p. 9933–42.

MLA: Zhang, W., et al. “Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents.” MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 9933–42. Scopus, doi:10.1145/3664647.3681548.

NLM: Zhang W, Wang Q, Huang K, Huang X, Guo F, Gu X. Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents. MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia. 2024. p. 9933–9942.
