Using relative relevance of data pieces for efficient communication, with an application to neural data acquisition
In this paper, we consider the problem of communicating data from distributed sensors for the purpose of inference. Two inference problems are investigated: linear regression and binary linear classification. Assuming perfect training of the classifier, an approximation of the problem of minimizing classification error probability under Gaussianity assumptions leads us to recover the Fisher score: a metric commonly used for feature selection in machine learning. Further, this allows us to soften the notion of feature selection by assigning a degree of relevance to each feature through the number of bits allocated to it. This relative relevance is used to obtain numerical results on the savings in the number of bits acquired and communicated for classification of neural data obtained from electrocorticography (ECoG) experiments. The results demonstrate that significant communication-cost savings can be achieved by compressing Big Data at the source.
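The abstract's idea of softened feature selection can be sketched as follows: compute a per-feature Fisher score for binary labels, then split a total bit budget across features in proportion to their relative relevance. This is an illustrative sketch only; the scoring formula used here, F_j = (mu1_j - mu0_j)^2 / (var0_j + var1_j), is the standard two-class Fisher score, and the proportional bit-allocation rule is an assumption, not the paper's exact scheme.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score for binary labels y in {0, 1}:
    F_j = (mu1_j - mu0_j)^2 / (var0_j + var1_j)."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0)
    return num / den

def allocate_bits(scores, total_bits):
    """Soft feature selection: divide a bit budget across features
    in proportion to their relative relevance (Fisher score)."""
    weights = scores / scores.sum()
    bits = np.floor(weights * total_bits).astype(int)
    # Hand any leftover bits (from flooring) to the most relevant features.
    leftover = total_bits - bits.sum()
    order = np.argsort(-scores)
    bits[order[:leftover]] += 1
    return bits
```

A hard feature selector keeps the top-k features and discards the rest; the allocation above instead degrades quantization resolution gradually as relevance drops, so no feature is cut off outright unless its score rounds to zero bits.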