
Data Privacy


Big data:
  • Data protection and big data:
    • G. D'Acquisto, J. Domingo-Ferrer, P. Kikiras, V. Torra, Y.-A. de Montjoye, A. Bourka (2015) Privacy by design in big data: An overview of privacy enhancing technologies in the era of big data analytics, European Union Agency for Network and Information Security (ENISA), 2015. ISBN: 978-92-9204-160-1, DOI: 10.2824/641480. (Open access)
    • V. Torra, G. Navarro-Arribas (2016) Big Data Privacy and Anonymization, in Privacy and Identity Management, 15-26. (Open access)

Protection Methods:
  • Review articles on data protection procedures for numerical and categorical data, including the first extensive comparison of masking methods with respect to risk and utility/information loss:
    • Domingo-Ferrer, J., Torra, V., (2001) Disclosure control methods and information loss for microdata, Confidentiality, disclosure, and data access : Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 91-110. PDF@URV
    • Domingo-Ferrer, J., Torra, V., (2001) A quantitative comparison of disclosure control methods for microdata, Confidentiality, disclosure, and data access : Theory and practical applications for statistical agencies. Doyle, P.; Lane, J.I.; Theeuwes, J.J.M.; Zayatz, L.V. eds., Elsevier, pp. 111-133. PDF@URV
  • Microaggregation: More information here
  • Data protection for (numerical) temporal data (longitudinal data):
    • Nin, J., Torra, V. (2006) Extending microaggregation procedures for time series protection, Lecture Notes in Artificial Intelligence, 4259 899-908. (5th Int. Conf. on Rough Sets and Current Trends in Computing, RSCTC 2006). http://dx.doi.org/10.1007/11908029_93
    • Nin, J., Torra, V. (2009) Towards The Evaluation of Time Series Protection Methods. Information Sciences, Elsevier, 179:11 1663-1677. http://dx.doi.org/10.1016/j.ins.2009.01.024

IL Measures:
  • Information Loss and Data Utility (generic measures):
    • Domingo-Ferrer, J., Torra, V., (2001) Disclosure control methods and information loss for microdata, Confidentiality, disclosure, and data access : Theory and practical applications for statistical agencies, Doyle, P., Lane, J.I., Theeuwes, J.J.M., Zayatz, L.V. eds., Elsevier, pp. 91-110. PDF@URV
    • Domingo-Ferrer, J., Torra, V., (2001) A quantitative comparison of disclosure control methods for microdata, Confidentiality, disclosure, and data access : Theory and practical applications for statistical agencies. Doyle, P.; Lane, J.I.; Theeuwes, J.J.M.; Zayatz, L.V. eds., Elsevier, pp. 111-133. PDF@URV

DR (generic) Measures:
  • Disclosure risk measures (generic measures using re-identification algorithms suitable for any data protection method):
    • Nin, J., Herranz, J., Torra, V. (2008) Towards a More Realistic Disclosure Risk Assessment. In Privacy in Statistical Databases (PSD), volume 5262 of Lecture Notes in Computer Science, pages 152-165. Springer. PDF@Springer
    • Torra, V., Abowd, J.M., Domingo-Ferrer, J. (2006) Using Mahalanobis Distance-Based Record Linkage for Disclosure Risk Assessment, Lecture Notes in Computer Science, 4302, 233-242 (PSD 2006). PDF@Cornell (In this paper we prove that record linkage and re-identification algorithms can also be used for evaluating the risk of synthetic data. We use both probabilistic and distance-based record linkage. Several different distances were used, including Mahalanobis and kernel-based distances.)
    • Nin, J., Torra, V. (2006) Distance based re-identification for time series, Analysis of distances, Lecture Notes in Computer Science, 4302 205-216. (PSD 2006). PDF@Springer (On the evaluation of disclosure risk of data protection methods on numerical time series.)
    • Torra, V., Domingo-Ferrer, J. (2003) Record linkage methods for multidatabase data mining, in V. Torra (Ed), Information fusion in data mining, Springer, ISBN 3-540-00676-1, 101-132. (This paper reviews in detail record linkage methods: distance-based and probabilistic.)
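
As a rough illustration of the generic, distance-based record linkage idea referred to in this section, the following minimal Python sketch links every protected record to its nearest original record and reports the fraction of correct links as a risk estimate. It is only a sketch under stated assumptions and is not taken from any of the papers above: the function names (standardize, linkage_risk, add_noise) are hypothetical, plain Euclidean distance on standardized values is used instead of the Mahalanobis or kernel-based distances studied in the papers, and the protected file is assumed to keep the same record order as the original so that correct links can be counted.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def linkage_risk(original, masked):
    """Fraction of masked records whose nearest original record is the true one.

    Assumption: masked[i] is the protected version of original[i].
    """
    orig_std = standardize(original)
    mask_std = standardize(masked)
    correct = 0
    for i, m in enumerate(mask_std):
        # Link the masked record to the closest original record (Euclidean distance).
        nearest = min(range(len(orig_std)),
                      key=lambda j: math.dist(m, orig_std[j]))
        if nearest == i:
            correct += 1
    return correct / len(masked)


def add_noise(rows, scale, seed=0):
    """Toy protection method for the demo: additive Gaussian noise per value."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, scale * abs(x)) for x in r] for r in rows]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical numerical microdata: (age, income) pairs.
    original = [[rng.gauss(50, 10), rng.gauss(30000, 8000)] for _ in range(200)]
    for scale in (0.01, 0.1, 0.5):
        risk = linkage_risk(original, add_noise(original, scale))
        print(f"noise scale {scale}: re-identified fraction = {risk:.2f}")
```

Because the measure only needs the original and the protected file, it can be applied to the output of any masking method, which is what makes it generic.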

DR (specific - ad hoc) Measures:
  • Disclosure risk measures (specific measures -- ad hoc measures -- developed to attack particular data protection methods). These measures and studies are needed when data releases are expected to follow the transparency principle. Information about transparency here.
    • Nin, J., Herranz, J., Torra, V. (2008) Rethinking Rank Swapping to Decrease Disclosure Risk, Data and Knowledge Engineering, 64:1 346-364. http://dx.doi.org/10.1016/j.datak.2007.07.006 (This paper describes an effective attack on the data protection method Rank Swapping.)
    • Nin, J., Torra, V. (2009) Analysis of the Univariate Microaggregation Disclosure Risk, New Generation Computing, 27 177-194. PDF@Springer (This paper describes an effective attack on univariate microaggregation, a data protection method.)
    • Nin, J., Herranz, J., Torra, V. (2008) On the Disclosure Risk of Multivariate Microaggregation, Data and Knowledge Engineering, 67 399-412. http://dx.doi.org/10.1016/j.datak.2008.06.014 (This paper describes an effective attack on multivariate microaggregation, another data protection method.)

General:
  • Transactions on Data Privacy
  • Navarro-Arribas, G., Torra, V. (eds.) (2015) Advanced Research in Data Privacy, Springer. Book @ Springer
    • This book gives an overview of the main topics in data privacy. The book is a result of the ARES CONSOLIDER (CSD2007-00004) project.
  • Edited volumes:
    • Privacy in Data Mining, Special issue of Data Mining and Knowledge Discovery 11:2 (2005), Springer, J. Domingo-Ferrer, V. Torra (Eds.). Includes:
      • A Framework for Evaluating Privacy Preserving Data Mining Algorithms, by Elisa Bertino, Igor Nai Fovino and Loredana Parasiliti Provenza
      • Preserving the Confidentiality of Categorical Statistical Data Bases When Releasing Information for Association Rules, by S. E. Fienberg and A. B. Slavkovic
      • Probabilistic Information Loss Measures in Confidentiality Protection of Continuous Microdata, by J. M. Mateo-Sanz, J. Domingo-Ferrer and F. Sebé
      • Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation, by J. Domingo-Ferrer and V. Torra
    • Privacy in Statistical Databases 2004, Lecture Notes in Computer Science, 3050 (2004), Springer. J. Domingo-Ferrer, V. Torra (Eds.) Includes:
      • V. Torra, Microaggregation for Categorical Variables: A Median Based Approach (LNCS 3050 (2004) 162-174). PDF @ Springer
      • V. Torra, S. Miyamoto, Evaluating Fuzzy Clustering Algorithms for Microdata Protection (LNCS 3050 (2004) 175-186). PDF @ Springer
    • Special issue on Aggregation and re-identification in Statistical Disclosure Control, Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 10:5 (2002), World Scientific. V. Torra, J. Domingo-Ferrer (Eds.) Includes:
      • V. Torra, J. Domingo-Ferrer, Editorial: Trends in aggregation and security assessment for inference control in statistical databases (pp. 453-457).
      • J. Domingo-Ferrer, V. Torra, A critique of the sensitivity rules usually employed for statistical table protection (pp. 545-556).
      • L. Sweeney, k-Anonymity: a model for protecting privacy (pp. 557-570).
      • L. Sweeney, Achieving k-anonymity privacy protection using generalization and suppression (pp. 571-588).