Finding diverse, high-value representatives on a surface of answers
In many applications, the system needs to selectively present a small subset of answers to users. The set of all possible answers can be seen as an elevation surface over a domain, where the elevation measures the quality of each answer, and the dimensions of the domain correspond to attributes of the answers with which similarity between answers can be measured. This paper considers the problem of finding a diverse set of k high-quality representatives for such a surface. We show that existing methods for diversified top-k and weighted clustering problems are inadequate for this problem. We propose k-DHR as a better formulation for the problem. We show that k-DHR has a submodular and monotone objective function, and we develop efficient algorithms for solving k-DHR with provable guarantees. We conduct extensive experiments to demonstrate the usefulness of the results produced by k-DHR for applications in computational lead-finding and fact-checking, as well as the efficiency and effectiveness of our algorithms.
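The abstract states that the k-DHR objective is submodular and monotone but does not give its definition. For objectives with those two properties, the standard greedy procedure (repeatedly add the element with the largest marginal gain) is known to achieve a (1 - 1/e) approximation under a cardinality constraint of k. The sketch below illustrates that general technique only. The objective `coverage_score`, the one-dimensional domain, the exponential similarity, and the `radius` parameter are all hypothetical stand-ins chosen for illustration; they are not the paper's k-DHR formulation or its algorithms.

```python
import math

def coverage_score(points, reps, radius=1.0):
    """Hypothetical facility-location-style objective (NOT the paper's k-DHR).

    points: list of (position, quality) pairs with quality >= 0.
    reps:   list of indices into points chosen as representatives.
    Each point is credited with the best quality-weighted similarity
    to any chosen representative; the score is the sum of those credits.
    This form is monotone and submodular for non-negative weights.
    """
    total = 0.0
    for x, _ in points:
        best = 0.0
        for r in reps:
            rx, rq = points[r]
            similarity = math.exp(-abs(x - rx) / radius)  # assumed similarity
            best = max(best, rq * similarity)
        total += best
    return total

def greedy_representatives(points, k, radius=1.0):
    """Standard greedy selection: add the index with the largest marginal gain."""
    selected = []
    current = 0.0
    for _ in range(min(k, len(points))):
        best_gain, best_idx = 0.0, None
        for i in range(len(points)):
            if i in selected:
                continue
            gain = coverage_score(points, selected + [i], radius) - current
            if gain > best_gain:
                best_gain, best_idx = gain, i
        if best_idx is None:  # no candidate improves the score
            break
        selected.append(best_idx)
        current += best_gain
    return selected

# Two clusters of answers: near position 0 and near position 5.
points = [(0.0, 1.0), (0.1, 0.9), (5.0, 0.8), (5.1, 0.7)]
print(greedy_representatives(points, k=2))
```

On this toy input the greedy pass picks the highest-quality answer from each cluster rather than the two highest-quality answers overall, which is the kind of diversity among high-value representatives the abstract describes. This naive version recomputes the objective for every candidate; it is meant to show the selection rule, not to be efficient.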
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics