Reconstructing Training Data from Multiclass Neural Networks

Gon Buzaglo, Niv Haim, Gilad Yehudai, Gal Vardi, Michal Irani

Research output: Contribution to journal (Article)

Abstract

Reconstructing samples from the training set of trained neural networks is a major privacy concern. Haim et al. (2022) recently showed that it is possible to reconstruct training samples from neural network binary classifiers, based on theoretical results about the implicit bias of gradient methods. In this work, we present several improvements on, and new insights into, this previous work. As our main improvement, we show that training-data reconstruction is possible in the multiclass setting and that the reconstruction quality is even higher than in the case of binary classification. Moreover, we show that using weight decay during training increases the vulnerability to sample reconstruction. Finally, while in the previous work the training set contained at most 1000 samples from 10 classes, we show preliminary evidence of the ability to reconstruct from a model trained on 5000 samples from 100 classes.
Original language: Undefined/Unknown
Number of pages: 10
Journal: arXiv.org
Publication status: In preparation - May 2023

Bibliographical note

This project received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 788535) and ERC grant 754705, and from the D. Dan and Betty Kahn Foundation. GV acknowledges the support of the NSF and the Simons Foundation for the Collaboration on the Theoretical Foundations of Deep Learning.
