Publication date:
2022
Abstract:
The field of statistical disclosure control aims to reduce the risk of re-identifying an individual from disseminated data, a major concern among national statistical agencies. Operations Research (OR) techniques have been widely used in the past for protecting tabular data, but not microdata (i.e., files of individuals and attributes). Few papers apply OR techniques to the microaggregation problem, which is considered one of the best methods for microdata protection and is known to be NP-hard. The new heuristic approach is based on a column generation scheme and, unlike previous (primal) heuristics for microaggregation, it also provides a lower bound on the optimal microaggregation. Using real data that is typically used in the literature, our computational results show, first, that solutions with small gaps are often achieved and, second, that dramatic improvements are obtained relative to the literature's most popular heuristics.
CRIS type:
01.01 Journal article
Keywords:
Integer Programming; Column Generation; Data Privacy; Clustering; Microaggregation
List of authors:
Gentile, Claudio
Link to full record:
Published in: