Process Discovery Contest 2022

DOI:10.4121/21261402.v2
The DOI displayed above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
DOI: 10.4121/21261402
Datacite citation style:
Eric Verbeek (2025): Process Discovery Contest 2022. Version 2. 4TU.ResearchData. dataset. https://doi.org/10.4121/21261402.v2
Other citation styles (APA, Harvard, MLA, Vancouver, Chicago, IEEE) available at Datacite

Dataset

Versions:
version 2 - 2025-02-10 (latest)
version 1 - 2022-10-03

This data set contains the data used for the Process Discovery Contest of 2022 (PDC 2022): 480 training logs, 96 corresponding test logs, 96 corresponding ground truth logs, and 96 models. All logs are stored in the IEEE XES file format (see either https://www.xes-standard.org/ or https://ieeexplore.ieee.org/document/7740858), while the models are workflow nets (a subclass of Petri nets) stored in the PNML file format (see https://www.iso.org/obp/ui/#iso:std:iso-iec:15909:-2:ed-1:v1:en).
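
Because XES is an XML-based format, a log can be inspected with a standard XML parser. The following is a minimal sketch, not part of the data set: it assumes each event carries its activity label in the standard `concept:name` attribute, and it uses a small inline sample (`SAMPLE_XES`, invented for illustration) instead of an actual contest file. The helper names `local_name` and `read_traces` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written XES fragment for illustration only; it is NOT taken
# from the PDC 2022 files.
SAMPLE_XES = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="case_1"/>
    <event><string key="concept:name" value="a"/></event>
    <event><string key="concept:name" value="b"/></event>
  </trace>
  <trace>
    <string key="concept:name" value="case_2"/>
    <event><string key="concept:name" value="a"/></event>
  </trace>
</log>"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}trace'."""
    return tag.rsplit("}", 1)[-1]


def read_traces(root):
    """Return one list of activity names per trace in the log."""
    traces = []
    for trace in root:
        if local_name(trace.tag) != "trace":
            continue
        activities = []
        for event in trace:
            if local_name(event.tag) != "event":
                continue
            # Assumption: the activity label is stored under concept:name.
            for attr in event:
                if attr.get("key") == "concept:name":
                    activities.append(attr.get("value"))
        traces.append(activities)
    return traces


traces = read_traces(ET.fromstring(SAMPLE_XES))
print(len(traces), "traces,", sum(len(t) for t in traces), "events")
```

For a log file on disk, `ET.parse(path).getroot()` can be passed to `read_traces` in place of the parsed sample string. Dedicated process mining libraries offer more complete XES and PNML support; the standard-library approach here is only meant to show the basic structure of a log.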

History

  • 2022-10-03 first online
  • 2025-02-10 published, posted

Publisher

4TU.ResearchData

Format

IEEE XES, ISO PNML

Organizations

Task Force on Process Mining (https://tf-pm.org)

DATA

Files (6)