Bayesian topic models for describing computer network behaviors
We consider the use of Bayesian topic models in the analysis of computer network traffic. Our approach uses latent Dirichlet allocation and its time-varying extension, dynamic latent Dirichlet allocation, to identify significant co-occurrences of network traffic types, which form topics of user behavior. In our experiments, these topics of user behavior included: (i) web traffic, (ii) email client and instant messaging, (iii) Microsoft file access, (iv) email server, and (v) other miscellaneous traffic. Each identified topic grouped a variety of different but related protocols, and the topics were found without any a priori knowledge of the purpose of each protocol. We believe that the techniques presented in this paper can be used to form more complex topics through the use of deep packet inspection, and that such topic models could prove useful in the identification of zero-day exploits or other network threats. © 2011 IEEE.
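As an illustration of how co-occurring traffic types can be grouped into topics, the following is a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling. It is not the implementation used in the paper. Several things here are assumptions made only for the example: each "document" is taken to be the list of protocol labels observed for one host in one time window, the toy `windows` data is invented, and the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the choice of Gibbs sampling as the inference method are all illustrative.

```python
import random

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs: list of token lists; here each token is a protocol label.
    Returns (topic_word, doc_topic, vocab), where topic_word[k][v] is the
    estimated probability of vocab[v] under topic k and doc_topic[d][k]
    is the estimated share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    n_dk = [[0] * K for _ in docs]      # tokens in doc d assigned to topic k
    n_kw = [[0] * V for _ in range(K)]  # times word v is assigned to topic k
    n_k = [0] * K                       # total tokens assigned to topic k

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][index[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Conditional weight of each topic given all other assignments.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(K)
    ]
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
    return topic_word, doc_topic, vocab

# Invented toy data: protocol labels seen per (host, time window).
windows = [
    ["http", "https", "dns", "http", "https"],
    ["imap", "smtp", "xmpp", "imap"],
    ["smb", "netbios", "smb", "netbios"],
    ["http", "dns", "https", "http"],
    ["smtp", "imap", "xmpp", "smtp"],
    ["netbios", "smb", "smb"],
]

topic_word, doc_topic, vocab = fit_lda(windows, num_topics=3)

# Show the three most probable protocol labels for each topic.
for k, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print(k, [vocab[v] for v in top])
```

The sampler sees only which labels occur together; it is given no information about what any protocol is for. Which labels end up grouped in a given topic depends on the data, the hyperparameters, and the random seed, so the printed groupings should be read as one sample rather than a fixed result.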