This dataset contains the public event logs used in the experimental setup of the study
A. Augusto et al., "Automated Discovery of Process Models from Event Logs: Review and
Benchmark," in IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 4,
pp. 686-705, 1 April 2019. doi: 10.1109/TKDE.2018.2841877

The dataset contains 12 public event logs, some of which were filtered for the
study. The size of each log is summarized below.

BPIC12 traces: 13,087 / events: 262,200 / trace length (min/avg/max): 3/20/175
BPIC13cp traces: 1,487 / events: 6,660 / trace length (min/avg/max): 1/4/35
BPIC13inc traces: 7,554 / events: 65,533 / trace length (min/avg/max): 1/9/123
BPIC14f traces: 41,353 / events: 369,485 / trace length (min/avg/max): 3/9/167
BPIC151f traces: 902 / events: 21,656 / trace length (min/avg/max): 5/24/50
BPIC152f traces: 681 / events: 24,678 / trace length (min/avg/max): 4/36/63
BPIC153f traces: 1,369 / events: 43,786 / trace length (min/avg/max): 4/32/54
BPIC154f traces: 860 / events: 29,403 / trace length (min/avg/max): 5/34/54
BPIC155f traces: 975 / events: 30,030 / trace length (min/avg/max): 4/31/61
BPIC17f traces: 21,861 / events: 714,198 / trace length (min/avg/max): 11/33/113
RTFMP traces: 150,370 / events: 561,470 / trace length (min/avg/max): 2/4/20
SEPSIS traces: 1,050 / events: 15,214 / trace length (min/avg/max): 3/14/185
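
The figures above can be re-derived from the files themselves. The following is
a minimal sketch, assuming each file is a gzipped XES document whose traces are
<trace> elements with <event> children, and assuming the average trace length is
rounded to the nearest integer. The function names trace_lengths and summarize
are illustrative and not part of the dataset.

```python
import gzip
import xml.etree.ElementTree as ET


def _local(tag):
    """Return an XML tag name without its namespace prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def trace_lengths(path):
    """Yield the number of <event> children of each <trace> in a gzipped XES file."""
    with gzip.open(path, "rb") as fh:
        # Stream the file so that large logs do not have to fit in memory.
        for _, elem in ET.iterparse(fh, events=("end",)):
            if _local(elem.tag) == "trace":
                yield sum(1 for child in elem if _local(child.tag) == "event")
                elem.clear()  # drop the events of a finished trace


def summarize(lengths):
    """Compute the statistics reported per log in this README."""
    lengths = list(lengths)
    n_events = sum(lengths)
    return {
        "traces": len(lengths),
        "events": n_events,
        "min": min(lengths),
        "avg": round(n_events / len(lengths)),  # assumed rounding
        "max": max(lengths),
    }
```

For example, summarize(trace_lengths("BPIC12.xes.gz")) returns a dictionary that
can be compared against the BPIC12 line above.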

Complete metadata for each event log is available from its original dataset,
linked below. Where a log was post-processed, the technique applied is cited.

BPIC12.xes.gz 

original dataset:
https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f
processing: none


BPIC13_cp.xes.gz

original dataset:
https://doi.org/10.4121/uuid:c2c3b154-ab26-4b31-a0e8-8f2350ddac11
processing: none


BPIC13_inc.xes.gz

original dataset:
https://doi.org/10.4121/uuid:500573e6-accc-4b0c-9576-aa5468b10cee
processing: none


BPIC14_f.xes.gz

original dataset:
https://doi.org/10.4121/uuid:86977bac-f874-49cf-8337-80f26bf5d2ef

processing: 
R. Conforti, et al. "Filtering out infrequent behavior from business process event 
logs", IEEE Trans. Knowl. Data Eng., vol. 29, no. 2, pp. 300-314, Feb. 2017.
https://doi.org/10.1109/TKDE.2016.2614680

"This technique uses a parameter called percentile, which refers to the percentile
of the distribution of the frequency of the arcs in the directly-follows graph
extracted from the log, to automatically determine the frequency threshold for the
filtering. We set this parameter to its default value of 12.5 percent."
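
To make the role of the percentile parameter concrete, the following sketch
counts the arcs of a directly-follows graph and reads a frequency threshold off
the distribution of arc frequencies. It is not the authors' implementation and
does not reproduce the filtering itself. The toy log, the nearest-rank
percentile rule, and the "at or below the threshold" convention are all
assumptions made for illustration.

```python
import math
from collections import Counter


def directly_follows_counts(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def percentile_threshold(counts, percentile):
    """Nearest-rank percentile of the arc frequencies.

    `percentile` is a fraction, e.g. 0.125 for 12.5 percent.
    """
    freqs = sorted(counts.values())
    rank = max(1, math.ceil(percentile * len(freqs)))
    return freqs[rank - 1]


# Hypothetical toy log: each trace is a list of activity labels.
toy_log = (
    [["a", "b", "c", "d", "e"]] * 10
    + [["a", "c", "d", "e"]] * 3
    + [["a", "b", "d", "e"]] * 2
    + [["a", "b", "c", "e"]] * 1
    + [["a", "b", "c", "d", "d", "e"]] * 4
)

arcs = directly_follows_counts(toy_log)
threshold = percentile_threshold(arcs, 0.125)
# Assumption: arcs at or below the threshold count as infrequent.
infrequent = sorted(arc for arc, freq in arcs.items() if freq <= threshold)
print(threshold, infrequent)
```

On this toy log the eight arcs have frequencies 1, 2, 3, 4, 15, 17, 17 and 19,
so the 12.5th percentile under the nearest-rank rule is 1 and only the arc
from c to e is flagged.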


BPIC15_1f.xes.gz
BPIC15_2f.xes.gz
BPIC15_3f.xes.gz
BPIC15_4f.xes.gz
BPIC15_5f.xes.gz

original dataset:
https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1

processing: 
R. Conforti, et al. "Filtering out infrequent behavior from business process event 
logs", IEEE Trans. Knowl. Data Eng., vol. 29, no. 2, pp. 300-314, Feb. 2017.
https://doi.org/10.1109/TKDE.2016.2614680

"This technique uses a parameter called percentile, which refers to the percentile
of the distribution of the frequency of the arcs in the directly-follows graph
extracted from the log, to automatically determine the frequency threshold for the
filtering. We set this parameter to its default value of 12.5 percent."


BPIC17_f.xes.gz

original dataset:
https://doi.org/10.4121/uuid:5f3067df-f10b-45da-b98b-86ae4c7a310b

processing: 
R. Conforti, et al. "Filtering out infrequent behavior from business process event 
logs", IEEE Trans. Knowl. Data Eng., vol. 29, no. 2, pp. 300-314, Feb. 2017.
https://doi.org/10.1109/TKDE.2016.2614680

"This technique uses a parameter called percentile, which refers to the percentile
of the distribution of the frequency of the arcs in the directly-follows graph
extracted from the log, to automatically determine the frequency threshold for the
filtering. We set this parameter to its default value of 12.5 percent."


RTFMP.xes.gz

original dataset:
https://doi.org/10.4121/uuid:270fd440-1057-4fb9-89a9-b699b47990f5
processing: none


SEPSIS.xes.gz

original dataset:
https://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
processing: none