Sensitive Label Privacy Protection on Social Network Data

Date
2012-03-30
Abstract
The publication of social network data presents opportunities for data mining and analytics in strategic public, commercial and academic applications. Yet publishing these data also poses a privacy threat to the networks' users, whose sensitive information should be protected. The challenge is to devise methods to publish the data in a form that affords utility without compromising privacy. Previous research has proposed various privacy models with corresponding protection mechanisms. These early privacy models are mostly concerned with identity and link disclosure: the social networks are modeled as graphs in which users are nodes, and the threat definitions and protection mechanisms leverage structural properties of the graph. This paper is motivated by the need for finer-grained and more personalized privacy. We propose a privacy protection scheme that prevents not only the disclosure of users' identities but also the disclosure of selected features in users' profiles. An individual user can select which features of her profile she wishes to conceal. The social networks are modeled as graphs in which users are nodes and features are labels. Labels are designated as either sensitive or non-sensitive. We treat node labels both as background knowledge that an adversary may possess and as sensitive information that has to be protected. We present privacy protection algorithms that allow graph data to be published in a form such that an adversary who possesses information about a node's neighborhood cannot safely infer its identity or its sensitive labels. To this end, the algorithms transform the original graph into a graph in which nodes are sufficiently indistinguishable. The algorithms are designed to do so while losing as little information and preserving as much utility as possible. We empirically evaluate the extent to which the algorithms preserve the original graph's structure and properties. We show that our solution is effective, efficient and scalable while offering stronger privacy guarantees than those in previous research.
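
The graph model in the abstract (users as nodes, profile features as labels, an adversary who knows a node's neighborhood) can be illustrated with a minimal sketch. This is not the paper's algorithm: the neighborhood signature (degree plus neighbor labels), the single label per node, and the thresholds `k` and `l` are all assumptions made for illustration, and the code only checks whether a graph already has indistinguishable nodes rather than transforming it.

```python
from collections import defaultdict

def neighborhood_signature(graph, labels, node):
    """Assumed adversary view of a node: its degree and its neighbors' sorted labels."""
    neighbor_labels = sorted(labels[n] for n in graph[node])
    return (len(graph[node]), tuple(neighbor_labels))

def group_by_signature(graph, labels):
    """Group nodes that look identical to an adversary holding only the signature."""
    groups = defaultdict(list)
    for node in graph:
        groups[neighborhood_signature(graph, labels, node)].append(node)
    return list(groups.values())

def is_sufficiently_indistinguishable(graph, labels, k, l):
    """Hypothetical check: every group has at least k nodes and l distinct labels."""
    for group in group_by_signature(graph, labels):
        if len(group) < k:
            return False  # too few candidates: identity could be inferred
        if len({labels[n] for n in group}) < l:
            return False  # too few distinct labels: the label could be inferred
    return True

# Toy 4-cycle with one (sensitive) label per node; all values are made up.
graph = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
labels = {"a": "flu", "b": "flu", "c": "cold", "d": "cold"}
print(is_sufficiently_indistinguishable(graph, labels, k=4, l=2))
```

In this toy graph every node has degree 2 and one neighbor of each label, so all four nodes fall into a single group, and an adversary with this signature cannot single out a node or its label.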