Metering graphical data leakage with Snowman
A long-standing technique for impeding theft of sensitive data by its intended users is to permit these insiders only remote access to the data via a thin client. Remote-only access is inadequate, however, against an insider willing to reconstruct the data from the graphical output, in the limit by photographing the data on-screen and applying automatic character recognition to the photographs offline. In this paper we propose and evaluate a system, called Snowman, that accurately monitors the amount of sensitive data output to a client. To conduct this monitoring without slowing the interactive user session, Snowman tracks leakage concurrently in a replica of the application execution. This, in turn, introduces a key technical challenge that Snowman solves: identically replicating the execution of an unmodified Linux binary while also performing efficient multi-label taint tracking on it. We show through empirical measurements with a word processor, a spreadsheet program, and a code editor that Snowman induces little overhead on interactive user sessions and, with reasonable responsiveness, easily differentiates the data-access patterns induced by normal usage from those induced by sufficiently aggressive data theft.
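The idea of metering leakage with multi-label taint tracking can be illustrated with a toy sketch. It is not Snowman's implementation, which operates on unmodified Linux binaries. The abstract does not specify label granularity or alerting policy, so the per-record labels, the `Tainted` and `LeakageMeter` names, and the `budget` threshold below are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """A value paired with the set of sensitive-source labels it depends on."""
    value: str
    labels: frozenset = frozenset()

    def __add__(self, other: "Tainted") -> "Tainted":
        # Combining two values propagates the union of their labels.
        return Tainted(self.value + other.value, self.labels | other.labels)


@dataclass
class LeakageMeter:
    """Counts distinct labels that have reached the output sink."""
    budget: int  # hypothetical alert threshold, not taken from the paper
    seen: set = field(default_factory=set)

    def output(self, item: Tainted) -> bool:
        """Record an output; return True once the budget is exceeded."""
        self.seen |= item.labels
        return len(self.seen) > self.budget


def load_records(records):
    """Give each sensitive record its own label (assumed granularity)."""
    return [Tainted(text, frozenset({i})) for i, text in enumerate(records)]


rows = load_records(["alice", "bob", "carol"])
meter = LeakageMeter(budget=2)
within = meter.output(rows[0] + rows[1])  # two distinct labels displayed
repeat = meter.output(rows[0])            # re-displaying adds no new labels
over = meter.output(rows[2])              # third distinct label exceeds budget
```

Because the meter counts distinct labels rather than output volume, displaying the same data repeatedly does not raise the measured leakage in this sketch; only newly exposed sensitive data does.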