Siamese BERT for authorship verification

Conference Paper

The PAN 2021 authorship verification (AV) challenge focuses on determining whether two texts were written by the same author, specifically when the authors are new and unseen during training. In our approach, we construct a Siamese network initialized with pretrained BERT encoders and train it with a learning objective that incentivizes the model to map texts written by the same author to nearby embeddings while mapping texts written by different authors to comparatively distant embeddings. Additionally, inspired by related work in computer vision, we attempt to incorporate triplet losses but are unable to realize any benefit. Our method yields a slight gain of 0.9% in overall score over the baseline and an increase of 8% in F1 score.
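The two kinds of objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss, distance metric, or margin used, so the formulation below is an assumption: a standard pairwise contrastive loss and a standard triplet loss over Euclidean distance, with hypothetical function names and a hypothetical margin of 1.0. Embeddings are plain lists of floats standing in for the outputs of the BERT encoders.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, same_author, margin=1.0):
    """Pairwise contrastive loss (assumed form, not confirmed by the source).

    Same-author pairs are penalized by their squared distance, which pulls
    them together. Different-author pairs are penalized only while they are
    closer than `margin`, which pushes them apart.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_author:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss (assumed form): the anchor should be closer to the
    same-author text (positive) than to the different-author text
    (negative) by at least `margin`."""
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-dimensional embeddings, for illustration only.
text_1 = [0.0, 0.0]
text_2 = [3.0, 4.0]

print(contrastive_loss(text_1, text_2, same_author=True))
print(contrastive_loss(text_1, text_2, same_author=False))
print(triplet_loss(text_1, text_1, text_2))
```

At inference time, a verification decision under this kind of setup would typically compare the distance between the two text embeddings against a threshold tuned on held-out data; the abstract does not specify how the decision is made.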

Cited Authors

  • Tyo, J; Dhingra, B; Lipton, Z

Published Date

  • January 1, 2021

Published In

Volume / Issue

  • 2936 /

Start / End Page

  • 2169 - 2177

International Standard Serial Number (ISSN)

  • 1613-0073

Citation Source

  • Scopus