Finding needles in compressed haystacks
In this chapter, we show that compressed learning, that is, learning directly in the compressed domain, is possible. In particular, we provide tight bounds demonstrating that the linear kernel SVM classifier in the measurement domain has, with high probability, true accuracy close to that of the best linear threshold classifier in the data domain. We show that this is beneficial from both the compressed sensing and the machine learning points of view. Furthermore, we indicate that for a family of well-known compressed sensing matrices, compressed learning is provided on the fly. Finally, we support our claims with experimental results from a texture analysis application.

Introduction

In many applications, the data has a sparse representation in some basis in a much higher dimensional space. Examples are the sparse representation of images in the wavelet domain, the bag of words model of text, and the routing tables in data monitoring systems. Compressed sensing combines measurement, which reduces the dimensionality of the underlying data, with reconstruction, which recovers the sparse data from its projection in the measurement domain. However, there are many sensing applications where the objective is not full reconstruction but classification with respect to some signature. Examples include radar, detection of trace chemicals, face detection [7, 8], and video streaming [9], where we might be interested in anomalies corresponding to changes in wavelet coefficients in the data domain. In all these cases our objective is pattern recognition in the measurement domain.
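
To make the setting concrete, the measurement and classification steps can be sketched as follows. The notation is an assumption introduced here for illustration and is not fixed by the text above: $x \in \mathbb{R}^n$ is a $k$-sparse vector in the data domain, $A \in \mathbb{R}^{m \times n}$ with $m \ll n$ is the sensing matrix, $y$ is the measurement, and $w$ and $z$ denote linear threshold classifiers in the data and measurement domains, respectively.

\[
y = A x, \qquad
\text{data domain: } x \mapsto \operatorname{sign}\!\left(w^{\top} x\right), \qquad
\text{measurement domain: } y \mapsto \operatorname{sign}\!\left(z^{\top} y\right) = \operatorname{sign}\!\left(z^{\top} A x\right).
\]

In this notation, compressed learning means choosing $z$ from labeled measurements $(A x_i, \ell_i)$ alone, without first reconstructing any $x_i$; the question is how much accuracy is lost relative to the best $w$ in the data domain.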