Membership inference attacks in generative AI

June 30, 2023
Vatsal Shah


Membership inference attacks threaten the privacy of the data behind machine learning models by attempting to deduce whether a data instance was included in the model's training set. By analyzing outputs and confidence scores for a target example, especially outliers, an adversary can estimate whether a data point was likely used during training. This is a particular concern for sensitive applications of generative models, where revealing that an individual's data took part in training could violate their privacy.

Federated learning helps mitigate such risks by training models on decentralized data that remains local to user devices. This approach allows useful machine learning models to be built without requiring personal data to be centralized or shared.

Federated learning methods and infrastructure are making generative AI accessible and responsible.

This technical blog post will provide an overview of how membership inference attacks against generative models work and discuss how federated learning defends privacy through its distributed approach. We explore remaining challenges, opportunities around optimization, and the need for standards and responsible practices. Overall, federated learning shows potential for developing generative AI if implemented ethically and with comprehensive protections.

The Privacy Threat: Membership Inference Attacks

Membership inference attacks aim to deduce whether a target data instance x was included in the training set D of a machine learning model f. By analyzing f's outputs and confidence scores for new examples, especially outliers, an adversary can make probabilistic inferences about x's membership in D. This violates the privacy of sensitive applications where individuals' participation in model training could expose them to harm.
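The simplest form of this idea can be sketched as a confidence-threshold attack. The snippet below is a minimal illustration, not a method taken from the cited papers: the confidence values, the 0.9 threshold, and the function names are all hypothetical. It guesses "member" whenever the model's confidence on an example is high, then scores those guesses against known ground truth.

```python
def confidence_attack(confidences, threshold=0.9):
    """Guess 'member' (True) when the model's confidence meets the threshold."""
    return [c >= threshold for c in confidences]


def attack_accuracy(guesses, is_member):
    """Fraction of membership guesses that match the ground truth."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Hypothetical confidence scores returned by a target model f.
# Models are often more confident on examples they were trained on.
member_conf = [0.99, 0.97, 0.95, 0.85]      # examples that were in D
nonmember_conf = [0.92, 0.70, 0.60, 0.55]   # examples that were not in D

confidences = member_conf + nonmember_conf
truth = [True] * len(member_conf) + [False] * len(nonmember_conf)

guesses = confidence_attack(confidences)
accuracy = attack_accuracy(guesses, truth)
print(f"attack accuracy: {accuracy:.2f}")
```

On this toy data the attack gets two of eight guesses wrong, which illustrates why such inferences are probabilistic: a confident prediction on a non-member, or an uncertain one on a member, produces an error.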

How They Work Against Generative Models

Generative models learn representations of training data to synthesize new samples. Researchers have shown attackers can generate many samples from these models and analyze them to deduce details about the private training data.

The adversary's reasoning can be expressed with Bayes' rule. Let E denote the event that g(x;θg) is implausible in G. The probability that the target was not in the training set is:

P(target ∉ D | E) = P(E | target ∉ D) × P(target ∉ D) / P(E) [1]

The probability of membership is the complement, P(target ∈ D | E) = 1 − P(target ∉ D | E).

Here g is the generative model with parameters θg, trained on D, and G is the set of samples it generates. If g(x;θg) seems anomalous in G, that suggests the target x was not in D; if it looks plausible, x more likely was. Attackers probe g(z;θg) by varying the noise z or the hyperparameters θg when generating for x. If outputs remain implausible, that signifies x was unlike the data g has learned to represent, so it was probably not in D.
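As a worked illustration of the Bayes-rule expression above, the following sketch plugs in numbers. The likelihoods and the prior are assumptions chosen for illustration only, and the function name is hypothetical. It computes the posterior probability that the target was not in D given the implausibility evidence E, and takes the complement as the membership probability.

```python
def posterior_non_member(p_e_given_out, p_e_given_in, prior_in):
    """Bayes' rule: P(x not in D | E), where E = 'output looks implausible'."""
    prior_out = 1.0 - prior_in
    # Total probability of observing E, summed over both hypotheses.
    p_e = p_e_given_out * prior_out + p_e_given_in * prior_in
    return p_e_given_out * prior_out / p_e


# Assumed values: implausible outputs are common for non-members (0.8),
# rare for members (0.1), and the adversary starts from a 50/50 prior.
p_out = posterior_non_member(p_e_given_out=0.8, p_e_given_in=0.1, prior_in=0.5)
p_in = 1.0 - p_out
print(f"P(not in D | E) = {p_out:.3f}, P(in D | E) = {p_in:.3f}")
```

With these assumed inputs, observing an implausible output moves the adversary's belief in non-membership from 0.5 to roughly 0.89. The strength of the inference depends entirely on how different the two likelihoods are.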

For example, a facial model trained only on humans may represent animals implausibly. But if a particular human face elicits similarly implausible outputs, that face likely reflects data the model never saw during training. Aggregated evidence across many implausible outputs is stronger than any isolated instance, because a single poor output may simply reflect limitations in the data or the learned representation. Adversaries query diverse models and targets to reduce false inferences caused by insufficient data or sampling. Without direct access to the model or the ability to invert it, certainty remains unlikely. Implausibility metrics help adversaries systematically gauge whether a target reflects unseen data, but their definitions vary with the application and the adversary's goals.
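One possible implausibility metric is the distance from a target to its nearest generated sample. The sketch below is only an illustration under strong assumptions: a 2-D Gaussian sampler stands in for the generative model g, the two targets are made up, and the metric is one choice among many, not the definition used in the cited work.

```python
import math
import random


def nearest_distance(target, samples):
    """Implausibility score: Euclidean distance to the closest generated sample."""
    return min(math.dist(target, s) for s in samples)


# Stand-in for samples G drawn from a generative model g (assumption:
# a 2-D standard Gaussian replaces a real image generator).
random.seed(0)
generated = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]

typical_target = (0.1, -0.2)   # resembles what the model has learned
unusual_target = (8.0, 8.0)    # unlike anything the model generates

score_typical = nearest_distance(typical_target, generated)
score_unusual = nearest_distance(unusual_target, generated)
print(f"typical: {score_typical:.3f}  unusual: {score_unusual:.3f}")
```

A small score means the model produces samples close to the target, while a large score marks the target as implausible. An adversary would still need to calibrate what counts as "small", for instance by comparing against targets known to be outside the training set.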

Generative AI enables customized data and services but also poses risks to privacy and trust if misused.

Examples of Vulnerable Models: Attacking MLaaS Platforms

Membership inference also threatens machine learning as a service (MLaaS) platforms, where private training data from many clients is pooled to build a single model. As Choquette-Choo et al. show, a facial GAN on an MLaaS platform leaked private details about D through generated samples. [2] Their attack accuracy reached over 90% in identifying members of D, demonstrating serious privacy risks with pooled training.

Sensitive Use Cases at Risk

Sensitive domains like healthcare, finance, and education face serious privacy risks if membership inference compromises their machine learning models.

Healthcare organizations using generative ML for applications like medical imaging analysis or diagnosis risk patient re-identification if models leak membership details. A neural network trained on chest X-ray data could be vulnerable to membership inference, exposing patients' conditions.

Financial firms applying generative AI to applications such as fraud detection also face risks, as malicious actors could determine that high-value account details were likely part of the training data. A model trained to detect illegitimate transactions could leak private customer data through membership inference.

The Federated Learning Approach

Federated learning enables the development of machine learning models without requiring the centralized aggregation of sensitive data. The approach trains models on decentralized data that remains local to each user or device, with only model updates shared—not raw data. This provides privacy benefits over pooled data while allowing useful global models to be built. However, responsible development demands systematically addressing risks around access and use at each node. Success depends on governance and safeguards—not technique alone.

Decentralized Training on Local Data

In federated learning, training data remains in decentralized silos. Each user or device trains a local replica of the model and submits compressed updates to a central server, which aggregates them into a shared global model reflecting patterns across the whole network. Sensitive details stay on the local device and never enter a central pool. The data is still used for local training and updating, however, which leaves some residual risk around use and access.

For example, hospitals could train local X-ray anomaly detection models on private records and send updates to build a global model for diagnosing patients at any facility. Because updates rather than raw data are shared, privacy risks to patients are reduced, although the protection still depends on the policies and access controls around local training data. Success requires systematically addressing the risks of use and management at each node.
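The local-train-then-aggregate loop described above can be sketched with federated averaging on a toy problem. This is a minimal illustration under stated assumptions, not a production protocol: the model is a single weight w fitting y = w * x, the two clients and their data are made up, and the sketch omits update compression, secure aggregation, and differential privacy.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few epochs of gradient descent on one client's private data."""
    for _ in range(epochs):
        # Gradient of mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def train(clients, rounds=50):
    """Alternate local training and server aggregation for a number of rounds."""
    w = 0.0
    for _ in range(rounds):
        # Only the updated weight leaves each client, never the raw (x, y) pairs.
        local_weights = [local_update(w, data) for data in clients]
        w = federated_average(local_weights, [len(d) for d in clients])
    return w


# Hypothetical private datasets held by two clients; both follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]
global_w = train(clients)
print(f"global weight: {global_w:.4f}")
```

The server only ever sees one number per client per round, yet the global weight converges toward the slope shared by both datasets. Weighting by dataset size is a common design choice so that clients with more data contribute proportionally more to the global model.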

Privacy Benefits of Local Data

Federated learning aims to build useful global models from decentralized data by sharing model updates rather than raw data. Avoiding a central pool of sensitive details mitigates threats like membership inference or re-identification from aggregated data.


In summary, membership inference poses serious risks to privacy with machine learning if not addressed, as adversaries can exploit models to deduce whether a target's data was likely used in training. Sensitive domains face disproportionate threats that demand solutions that balance accuracy and privacy. Federated learning shows promise by training models on decentralized data through sharing updates rather than raw details. While federated learning offers mechanisms to share insights from dispersed data, its success depends on rigor and cooperation, recognizing privacy as a matter of equity, not an obstacle.


[1] Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov: “Membership Inference Attacks against Machine Learning Models”, 2016

[2] Christopher A. Choquette-Choo, Florian Tramer, Nicholas Carlini, Nicolas Papernot: “Label-Only Membership Inference Attacks”, 2020

[3] Boris van Breugel, Hao Sun, Zhaozhi Qian, Mihaela van der Schaar: “Membership Inference Attacks against Synthetic Data through Overfitting Detection”, 2023

[4] K. S. Liu, C. Xiao, B. Li and J. Gao, "Performing Co-membership Attacks Against Deep Generative Models," 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 2019, pp. 459-467, doi: 10.1109/ICDM.2019.00056.

[5] C. Park, Y. Kim, J. -G. Park, D. Hong and C. Seo, "Evaluating Differentially Private Generative Adversarial Networks Over Membership Inference Attack," in IEEE Access, vol. 9, pp. 167412-167425, 2021, doi: 10.1109/ACCESS.2021.3137278.
