Deep Generative Models for Synthetic Data

Eigenschink, Peter and Vamosi, Stefan and Vamosi, Ralf and Sun, Chang and Reutterer, Thomas and Kalcher, Klaudius (2021) Deep Generative Models for Synthetic Data. ACM Computing Surveys. ISSN 0360-0300




Growing interest in synthetic data has stimulated the development and advancement of a large variety of deep generative models for a wide range of applications. However, as this research has progressed, its streams have become more specialized and disconnected from each other. For example, models for synthesizing text data for natural language processing cannot readily be compared to models for synthesizing health records. To mitigate this isolation, we propose a data-driven evaluation framework for generative models for synthetic data based on five high-level criteria: representativeness, novelty, realism, diversity and coherence of a synthetic data sample relative to the original dataset, regardless of the models' internal structures. The criteria reflect the requirements that different domains impose on synthetic data and allow model users to assess the quality of synthetic data across models. In a critical review of generative models for sequential data, we examine and compare the importance of each performance criterion in numerous domains. For example, we find that realism and coherence are more important for synthetic data for natural language, speech and audio processing, while novelty and representativeness are more important for healthcare and mobility data. We also find that representativeness is often measured using statistical metrics, realism using human judgement, and novelty using privacy tests.
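
As a rough illustration of what a model-agnostic, data-driven check can look like, the following minimal Python sketch compares a synthetic sample of sequences with the original sample without looking at the generating model. It is not the evaluation procedure from the paper: the function names `ks_statistic` and `exact_copy_rate`, the choice of sequence length as the compared statistic, and the toy data are all assumptions made for this example. The first function is one possible statistical metric for representativeness (the two-sample Kolmogorov-Smirnov statistic); the second is only a crude proxy for novelty (the share of synthetic sequences copied verbatim from the original data) and is much weaker than a real privacy test.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical empirical distributions, 1.0 means
    the samples do not overlap at all.
    """
    real_sorted = sorted(real)
    syn_sorted = sorted(synthetic)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in sorted(set(real_sorted) | set(syn_sorted)):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_syn = bisect_right(syn_sorted, x) / len(syn_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_syn))
    return max_gap


def exact_copy_rate(real_sequences, synthetic_sequences):
    """Fraction of synthetic sequences that duplicate a real sequence verbatim.

    Hypothetical novelty proxy: a high value suggests the generator
    memorised training records instead of producing new ones.
    """
    real_set = {tuple(seq) for seq in real_sequences}
    copies = sum(1 for seq in synthetic_sequences if tuple(seq) in real_set)
    return copies / len(synthetic_sequences)


# Toy sequential data (made up for this example).
real = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
synthetic = [[1, 2, 3], [2, 2], [9, 9, 9, 9]]

# Representativeness: compare the distributions of sequence lengths.
length_gap = ks_statistic([len(s) for s in real], [len(s) for s in synthetic])

# Novelty proxy: how many synthetic sequences are verbatim copies.
copy_rate = exact_copy_rate(real, synthetic)

print(f"KS statistic on sequence lengths: {length_gap:.3f}")
print(f"Exact-copy rate: {copy_rate:.3f}")
```

Both functions only need the two data samples, which is the sense in which such checks apply regardless of a model's internal structure. Realism, diversity and coherence are not covered by this sketch.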

Item Type: Article
Keywords: artificial intelligence, neural networks, deep learning, generative models, synthetic data, sequential data, big data, privacy protection
Divisions: Departments > Marketing
Departments > Marketing > Service Marketing und Tourismus
Version of the Document: Draft
Variance from Published Version: Not applicable
Depositing User: Peter Eigenschink
Date Deposited: 23 Nov 2021 08:52
Last Modified: 23 Nov 2021 08:52

