Multidimensional mining of large-scale search logs: A topic-concept cube approach

Publication, Conference
Kang, D; Jiang, D; Pei, J; Liao, Z; Sun, X; Choi, HJ
Published in: Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011
March 14, 2011

In addition to search queries and the corresponding clickthrough information, search engine logs record multidimensional information about user search activities, such as search time, location, vertical, and search device. Multidimensional mining of search logs can provide novel insights and useful knowledge for both search engine users and developers. In this paper, we describe our topic-concept cube project, which addresses the business need of supporting multidimensional mining of search logs effectively and efficiently. We address two challenges. First, search queries and clickthrough data are well recognized to be sparse, and thus have to be aggregated properly for effective analysis. Second, there is often a gap between the topic hierarchies in multidimensional aggregate analysis and queries in search logs. To meet those challenges, we develop a novel topic-concept model that learns a hierarchy of concepts and topics automatically from search logs. Enabled by the topic-concept model, we construct a topic-concept cube that supports online multidimensional mining of search log data. A distinct feature of our approach is that, in addition to the standard dimensions such as time and location, our topic-concept cube has a dimension of topics and concepts, which substantially facilitates the analysis of log data. To handle a huge amount of log data, we develop distributed algorithms for learning model parameters efficiently. We also devise approaches to computing a topic-concept cube. We report an empirical study verifying the effectiveness and efficiency of our approach on a real data set of 1.96 billion queries and 2.73 billion clicks. Copyright 2011 ACM.

Published In

Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011

DOI

10.1145/1935826.1935888

Publication Date

March 14, 2011

Start / End Page

385 / 394

Citation

APA

Kang, D., Jiang, D., Pei, J., Liao, Z., Sun, X., & Choi, H. J. (2011). Multidimensional mining of large-scale search logs: A topic-concept cube approach. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 (pp. 385–394). https://doi.org/10.1145/1935826.1935888

Chicago

Kang, D., D. Jiang, J. Pei, Z. Liao, X. Sun, and H. J. Choi. “Multidimensional mining of large-scale search logs: A topic-concept cube approach.” In Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011, 385–94, 2011. https://doi.org/10.1145/1935826.1935888.

ICMJE

Kang D, Jiang D, Pei J, Liao Z, Sun X, Choi HJ. Multidimensional mining of large-scale search logs: A topic-concept cube approach. In: Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011. 2011. p. 385–94.

MLA

Kang, D., et al. “Multidimensional mining of large-scale search logs: A topic-concept cube approach.” Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011, 2011, pp. 385–94. Scopus, doi:10.1145/1935826.1935888.

NLM

Kang D, Jiang D, Pei J, Liao Z, Sun X, Choi HJ. Multidimensional mining of large-scale search logs: A topic-concept cube approach. Proceedings of the 4th ACM International Conference on Web Search and Data Mining, WSDM 2011. 2011. p. 385–394.
