
A Systematic Evaluation of Generative Models on Tabular Transportation Data

  • Chengen Wang
  • Alvaro A. Cardenas
  • Gurcan Comert
  • Murat Kantarcioglu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The sharing of large-scale transportation data is beneficial for transportation planning and policymaking; however, there are privacy concerns with data sharing, as it can include identifiable personal information, such as individuals’ home locations. To address these concerns, synthetic data generation based on real transportation data offers a promising solution that allows privacy protection while potentially preserving data utility. Although there are various synthetic data generation techniques, they are often not tailored to the unique characteristics of transportation networks. In this paper, we use New York City taxi data as a case study to conduct a systematic evaluation of the performance of widely used tabular data generative models. In addition to traditional metrics such as distribution similarity, coverage, and privacy preservation, we propose a novel graph-based metric tailored specifically for transportation data. This metric evaluates the similarity between real and synthetic transportation networks, providing potentially deeper insights into their structural and functional alignment. We also introduce an improved privacy metric to address the limitations of current metrics. Our experimental results reveal that existing tabular data generative models often fail to perform as consistently as claimed in the literature, particularly when applied to transportation data. Furthermore, our novel graph metric reveals a significant gap between synthetic and real data. This work underscores the need to develop generative models that take advantage of the unique characteristics of transportation networks. Full version at https://www.arxiv.org/abs/2502.08856.
Original language: English
Title of host publication: Unknown book
Publisher: Springer Science and Business Media Deutschland GmbH
State: Published - 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 11 - Sustainable Cities and Communities
