Coarse-to-Fine Document Image Registration for Dewarping
Document dewarping has made great progress in recent years, but it usually requires a very large number of document pairs with pixel-level annotations to learn a mapping function. Although photographed document images are easy to obtain, annotating the pixel-level correspondence between warped and flat images is time-consuming and nearly infeasible for large-scale datasets. To overcome this issue, we propose to register photographed documents with their corresponding flat counterparts, thereby obtaining pixel-level mapping labels automatically. Because real photographed documents are severely deformed, we introduce a coarse-to-fine registration pipeline that learns the global transformation first and then aligns local details. In addition, the lack of registration labels motivates us to tailor a teacher-student dual-branch model trained in a semi-supervised manner, where the model is initialized on labeled synthetic documents. Furthermore, we contribute a large-scale dataset containing 12,500 triplets of synthetic-real-flat documents. Extensive experiments demonstrate the effectiveness of the proposed registration method. Specifically, a dewarping model trained on documents with our registered pixel-level labels achieves performance comparable to state-of-the-art methods trained on almost 100× as many samples, showing the high quality of our registration results. Our dataset and code are available at https://github.com/hanquansanren/DIRD.
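As a rough illustration of the two ideas named in the abstract, the sketch below composes a coarse global transform with a fine per-pixel residual to form a pixel-level mapping, and shows one common teacher-student update rule. It is not the authors' implementation: the affine form of the coarse stage, the additive residual, the exponential-moving-average (EMA) teacher update, and all function names (`identity_grid`, `coarse_map`, `coarse_to_fine_map`, `ema_update`) are assumptions made only for this example.

```python
import numpy as np


def identity_grid(h, w):
    """Return an (h, w, 2) array holding the (x, y) coordinate of each pixel."""
    ys, xs = np.meshgrid(
        np.arange(h, dtype=np.float64),
        np.arange(w, dtype=np.float64),
        indexing="ij",
    )
    return np.stack([xs, ys], axis=-1)


def coarse_map(grid, affine):
    """Coarse stage (assumed affine): apply a 2x3 matrix to every coordinate."""
    h, w, _ = grid.shape
    homog = np.concatenate([grid, np.ones((h, w, 1))], axis=-1)  # (h, w, 3)
    return homog @ affine.T  # (h, w, 2)


def coarse_to_fine_map(h, w, affine, residual):
    """Fine stage (assumed additive): add a per-pixel residual displacement
    of shape (h, w, 2) on top of the coarse global mapping."""
    return coarse_map(identity_grid(h, w), affine) + residual


def ema_update(teacher, student, momentum=0.99):
    """Assumed teacher update: each teacher weight tracks an exponential
    moving average of the matching student weight."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }
```

In this sketch the coarse stage carries the global-scale transformation, while the residual field is free to correct local misalignment that a single global transform cannot express.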
Duke Scholars
DOI
Publication Date
Volume
Start / End Page
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences