Ot & Sien, a dataset to help the development of object detection in children's book illustrations

doi:10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1
The DOI above identifies this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that always points to the latest version, please use
doi:10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c
Datacite citation style:
Wang, Haoran; Khademi, Seyran (2023): Ot & Sien, a dataset to help the development of object detection in children's book illustrations. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1
Dataset

This dataset is derived from the original Ot & Sien Dataset (https://lab.kb.nl/dataset/ot-sien-dataset). We corrected mistakes in the original and prepared it for machine learning (ML) use.

The purpose of this dataset is to support the development of automatic visual object detection in children's book illustrations. Its main properties are:

  • The dataset consists of illustrations rather than standard photos. 
  • 1452 images containing 8241 annotated objects (about 5.7 per image); each annotation gives the object's category and bounding box.
  • All images are resized to 416 x 416 pixels, with black padding along the edges to fit the square frame, to suit the training procedure.
  • The object categories are imbalanced: their frequencies follow a natural long-tail distribution, and some categories are rare.
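
The resizing described in the list above can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' preprocessing code: it assumes the aspect ratio is preserved and the black padding is split evenly on both sides, neither of which is stated on this page. The function names and the 832 x 624 example image are hypothetical.

```python
def letterbox_params(width, height, target=416):
    """Return (scale, new_w, new_h, pad_left, pad_top) for fitting an
    image of size width x height into a target x target canvas.

    Assumption: aspect ratio is kept and padding is centered.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def map_box(box, scale, pad_left, pad_top):
    """Map an (x_min, y_min, x_max, y_max) pixel box from the original
    image onto the padded canvas."""
    x_min, y_min, x_max, y_max = box
    return (
        x_min * scale + pad_left,
        y_min * scale + pad_top,
        x_max * scale + pad_left,
        y_max * scale + pad_top,
    )


# Hypothetical 832 x 624 illustration with one bounding box.
scale, new_w, new_h, pad_left, pad_top = letterbox_params(832, 624)
print(scale, new_w, new_h, pad_left, pad_top)  # 0.5 416 312 0 52
print(map_box((100, 200, 300, 400), scale, pad_left, pad_top))
# (50.0, 152.0, 150.0, 252.0)
```

Under these assumptions, a landscape image fills the full width and receives black bars above and below, and bounding boxes are scaled and shifted by the same amounts so they stay aligned with the objects.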
history
  • 2023-06-23 first online, published, posted
publisher
4TU.ResearchData
format
*.jpg, *.txt, *.yaml
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics & Computer Science (EEMCS)
