Data Taggants: Dataset Ownership Verification via Harmless Targeted Data Poisoning
Wassim (Wes) Bouaziz, Nicolas Usunier, El Mahdi El Mhamdi
In this work, we introduce Data Taggants, a novel approach to dataset ownership verification based on targeted data poisoning. Our method lets dataset owners embed unique identifiers in their datasets so that they can later prove ownership without compromising the integrity of the data. It does so by injecting harmless perturbations that do not affect the performance of models trained on the dataset. The approach is more effective than existing methods and also offers theoretical guarantees on the false detection rate.
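To make the notion of a guarantee on the false detection rate concrete, the following is a minimal sketch under stated assumptions, not a description of the verification procedure itself. It assumes, hypothetically, that the owner holds a set of secret key inputs, each paired with a label drawn uniformly at random from the classes, and that a model which never saw the tagged dataset places a given key's label in its top-k predictions with probability k divided by the number of classes, independently across keys. Under that assumption the number of matching keys is binomial, and its upper tail gives a p-value that bounds the chance of wrongly flagging an independent model. The function name and all parameters are illustrative.

```python
from math import comb


def binomial_tail_p_value(num_keys: int, num_matches: int,
                          num_classes: int, top_k: int = 1) -> float:
    """Return P(X >= num_matches) for X ~ Binomial(num_keys, top_k / num_classes).

    Assumption (not from the source): under the null hypothesis, each secret
    key's random label lands in the model's top-k predictions independently
    with probability top_k / num_classes.
    """
    if not 0 <= num_matches <= num_keys:
        raise ValueError("num_matches must lie between 0 and num_keys")
    if not 1 <= top_k <= num_classes:
        raise ValueError("top_k must lie between 1 and num_classes")

    p_chance = top_k / num_classes
    # Sum the binomial probability mass over all outcomes at least as
    # extreme as the observed number of matches.
    return sum(
        comb(num_keys, i) * p_chance**i * (1.0 - p_chance) ** (num_keys - i)
        for i in range(num_matches, num_keys + 1)
    )


# Hypothetical usage: 20 keys, 5 of them matched, 1000 classes, top-1.
p_value = binomial_tail_p_value(num_keys=20, num_matches=5, num_classes=1000)
# Flag the model only if the p-value falls below a chosen significance level.
is_flagged = p_value < 1e-6
```

Choosing the significance level in advance fixes an upper bound on the false detection rate under the stated independence assumption; the bound says nothing about how often a model that was trained on the tagged data is actually detected.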