Cross-domain document layout analysis using document style guide

Publication, Journal Article

Wu, X; Xiao, L; Du, X; Zheng, Y; Li, X; Ma, T; Jin, C; He, L

Published in: Expert Systems with Applications

July 1, 2024

Document layout analysis (DLA) is a crucial computer vision task that involves partitioning document images into high-level semantic regions such as figures, tables, backgrounds, and texts. Deep learning models for DLA typically require a large amount of labeled data, which can be expensive. Though some researchers use generated data for training, a substantial style gap exists between the generated and target data. Moreover, it is necessary to improve the quality of the generated samples to achieve better control. To address these challenges, we propose a cross-domain DLA framework called DL-DSG, which leverages document-style guidance. DL-DSG comprises three components: the document layout generator (DLG) responsible for generating document element locations, the document element decorator (DED) for filling the elements, and the document style discriminator (DSD) for style guidance. In addition to generating controlled documents, we also focus on bridging the gap between the generated and target samples. To this end, we introduce a novel strategy that transforms document style judgment into the document cross-domain style guidance component. We evaluate the effectiveness of DL-DSG on popular DLA datasets, including PubLayNet, DSSE-200, CS-150, and CDSSE, and demonstrate its superior performance.

Duke Scholars

Author: Xin Li, Electrical and Computer Engineering

Published In

Expert Systems with Applications

DOI

10.1016/j.eswa.2023.123039

ISSN

0957-4174

Publication Date

July 1, 2024

Volume

245

Related Subject Headings

Artificial Intelligence & Image Processing
09 Engineering
08 Information and Computing Sciences
01 Mathematical Sciences

Citation

APA

Wu, X., Xiao, L., Du, X., Zheng, Y., Li, X., Ma, T., … He, L. (2024). Cross-domain document layout analysis using document style guide. Expert Systems with Applications, 245. https://doi.org/10.1016/j.eswa.2023.123039

Chicago

Wu, X., L. Xiao, X. Du, Y. Zheng, X. Li, T. Ma, C. Jin, and L. He. “Cross-domain document layout analysis using document style guide.” Expert Systems with Applications 245 (July 1, 2024). https://doi.org/10.1016/j.eswa.2023.123039.

ICMJE

Wu X, Xiao L, Du X, Zheng Y, Li X, Ma T, et al. Cross-domain document layout analysis using document style guide. Expert Systems with Applications. 2024 Jul 1;245.

MLA

Wu, X., et al. “Cross-domain document layout analysis using document style guide.” Expert Systems with Applications, vol. 245, July 2024. Scopus, doi:10.1016/j.eswa.2023.123039.

NLM

Wu X, Xiao L, Du X, Zheng Y, Li X, Ma T, Jin C, He L. Cross-domain document layout analysis using document style guide. Expert Systems with Applications. 2024 Jul 1;245.
