A propagation model for provenance views of public/private workflows


Conference Paper

We study the problem of concealing the functionality of a proprietary or private module when provenance information is shown over repeated executions of a workflow that contains both public and private modules. Our approach is to use provenance views to hide carefully chosen subsets of data over all executions of the workflow to ensure Γ-privacy: for each private module and each input x, the module's output f(x) is indistinguishable from Γ-1 other possible values given the visible data in the workflow executions. We show that Γ-privacy cannot be achieved simply by combining solutions for individual private modules; data hiding must also be propagated through public modules. We then examine how much additional data must be hidden and when it is safe to stop propagating data hiding. The answer depends strongly on the workflow topology as well as the behavior of public modules on the visible data. In particular, for a class of workflows (which includes the common tree and chain workflows), taking private solutions for each private module, augmented with a public closure that is upstream-downstream safe, ensures Γ-privacy. We define these notions formally and show that the restrictions are necessary. We also study the related optimization problems of minimizing the amount of hidden data. Copyright 2013 ACM.
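As an illustration of the Γ-privacy condition stated above, the following is a minimal brute-force sketch for a single private module considered in isolation. It is not the paper's algorithm and it ignores the propagation through public modules that the abstract identifies as essential. It assumes one reading of the definition: an observer sees only the visible attribute positions of each (input, output) row, considers every function whose visible projection matches the real one to be a possible world, and the module is Γ-private if every input has at least Γ candidate outputs across those worlds. All names (`candidate_outputs`, `is_gamma_private`, `visible_in`, `visible_out`) and the toy AND module are hypothetical.

```python
from itertools import product


def candidate_outputs(inputs, outputs, f, visible_in, visible_out):
    """For each input x, collect the outputs that some consistent function assigns to x.

    Assumption: a function g is 'consistent' when the set of visible projections
    of its rows (x, g(x)) equals the set of visible projections of the real rows.
    """
    def view(x, y):
        # Project an (input, output) row onto the visible attribute positions.
        return (tuple(x[i] for i in visible_in), tuple(y[j] for j in visible_out))

    true_view = {view(x, f[x]) for x in inputs}
    possible = {x: set() for x in inputs}
    # Enumerate every function g: inputs -> outputs (exponential; toy sizes only).
    for ys in product(outputs, repeat=len(inputs)):
        g = dict(zip(inputs, ys))
        if {view(x, g[x]) for x in inputs} == true_view:
            for x in inputs:
                possible[x].add(g[x])
    return possible


def is_gamma_private(inputs, outputs, f, visible_in, visible_out, gamma):
    """True if every input has at least `gamma` candidate outputs."""
    possible = candidate_outputs(inputs, outputs, f, visible_in, visible_out)
    return all(len(ys) >= gamma for ys in possible.values())


# Hypothetical toy module: two Boolean inputs, one Boolean output (logical AND).
INPUTS = list(product((0, 1), repeat=2))
OUTPUTS = [(0,), (1,)]
AND = {x: (x[0] & x[1],) for x in INPUTS}

# Everything visible: the observer can read off f(x) exactly.
all_visible = is_gamma_private(INPUTS, OUTPUTS, AND, (0, 1), (0,), gamma=2)
# Output attribute hidden: either output value is possible for every input.
output_hidden = is_gamma_private(INPUTS, OUTPUTS, AND, (0, 1), (), gamma=2)
# First input attribute hidden: some inputs still have a single candidate output.
input_hidden = is_gamma_private(INPUTS, OUTPUTS, AND, (1,), (0,), gamma=2)
```

In this toy setting, hiding the output attribute is enough for Γ = 2, while hiding only the first input attribute is not. The enumeration is exponential in the number of inputs, so the sketch is meant only to make the definition concrete on very small domains.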

Cited Authors

  • Davidson, SB; Milo, T; Roy, S

Published Date

  • January 1, 2013

Published In

  • ACM International Conference Proceeding Series

Start / End Page

  • 165 - 176

International Standard Book Number 13 (ISBN-13)

  • 9781450315982

Digital Object Identifier (DOI)

  • 10.1145/2448496.2448517

Citation Source

  • Scopus