Ot & Sien, a dataset to help the development of object detection in children's book illustrations

DOI:10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1
The DOI displayed above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
DOI: 10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c
DataCite citation style:
Wang, Haoran; Khademi, Seyran (2023): Ot & Sien, a dataset to help the development of object detection in children's book illustrations. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1

Dataset

This dataset is derived from the original Ot & Sien Dataset (https://lab.kb.nl/dataset/ot-sien-dataset). We corrected mistakes in the original and prepared it for machine-learning use.

This dataset is intended to support the development of automatic visual object detection in children's book illustrations. Its main properties are:

  • The dataset consists of illustrations rather than standard photos. 
  • 1452 images are annotated with 8241 objects (about 5.7 per image); each annotation gives the object's category and bounding box.
  • All images are resized to 416 x 416 pixels, with black edges added to fill the square, to suit the training procedure.
  • The categories are imbalanced: the dataset follows a natural long-tail distribution, in which some object categories are rare.
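
The resizing described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes the aspect ratio is kept and the black edges are split evenly on both sides (centered padding), which the description does not state. The function names `letterbox_params` and `map_box` are hypothetical.

```python
def letterbox_params(width, height, target=416):
    """Compute how an image of size width x height fits into a target x target square.

    Assumption: the aspect ratio is preserved and the padding is centered.
    Returns (scale, new_w, new_h, pad_left, pad_top).
    """
    scale = target / max(width, height)   # the longer side fills the square
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_left = (target - new_w) // 2      # black edge on the left
    pad_top = (target - new_h) // 2       # black edge on the top
    return scale, new_w, new_h, pad_left, pad_top


def map_box(box, scale, pad_left, pad_top):
    """Map an (x_min, y_min, x_max, y_max) pixel box from the original image
    to the coordinates of the resized, padded image."""
    x_min, y_min, x_max, y_max = box
    return (
        x_min * scale + pad_left,
        y_min * scale + pad_top,
        x_max * scale + pad_left,
        y_max * scale + pad_top,
    )


# Example: an 832 x 416 illustration is halved and padded above and below.
scale, new_w, new_h, pad_left, pad_top = letterbox_params(832, 416)
full_image_box = map_box((0, 0, 832, 416), scale, pad_left, pad_top)
```

Any bounding box drawn on an original scan has to go through the same scale and offset, which is why the two steps are kept together here.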

History

  • 2023-06-23 first online, published, posted

Publisher

4TU.ResearchData

Format

*.jpg, *.txt, *.yaml
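
The record does not document the layout of the *.txt files. The sketch below assumes, purely for illustration, the common YOLO convention: one object per line, written as `class_id x_center y_center width height`, with coordinates normalized to the range 0 to 1. If the files use a different layout, the parsing must be adapted. The function name `parse_label_line` is hypothetical.

```python
def parse_label_line(line, image_size=416):
    """Parse one annotation line into (class_id, pixel_box).

    Assumption: YOLO-style layout "class_id x_center y_center width height",
    normalized to [0, 1]; this is not confirmed by the dataset record.
    pixel_box is (x_min, y_min, x_max, y_max) in pixels of a square image.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    x_min = (x_c - w / 2) * image_size
    y_min = (y_c - h / 2) * image_size
    x_max = (x_c + w / 2) * image_size
    y_max = (y_c + h / 2) * image_size
    return class_id, (x_min, y_min, x_max, y_max)


# Example with a made-up line: class 3, centered, a quarter wide, half tall.
class_id, box = parse_label_line("3 0.5 0.5 0.25 0.5")
```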

Organizations

TU Delft, Faculty of Electrical Engineering, Mathematics & Computer Science (EEMCS)
