Ot & Sien, a dataset to help the development of object detection in children's book illustrations
doi:10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1
The DOI above refers to this specific version of the dataset, which is currently the latest. Newer versions may be published in the future.
For a link that will always point to the latest version, please use
doi: 10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c
Datacite citation style:
Wang, Haoran; Khademi, Seyran (2023): Ot & Sien, a dataset to help the development of object detection in children's book illustrations. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/d1f3ca5c-f1e4-48f5-9a04-0564572d2b9c.v1
Other citation styles (APA, Harvard, MLA, Vancouver, Chicago, IEEE) available at Datacite
Dataset
This dataset is derived from the original Ot & Sien Dataset (https://lab.kb.nl/dataset/ot-sien-dataset). We corrected mistakes in the original and made it ML-ready.
The purpose of this dataset is to help the development of automatic visual object detection in children's book illustrations. Its main properties are:
- The dataset consists of illustrations rather than standard photos.
- It contains 1452 images annotated with 8241 objects (about 5.7 per image); each annotation gives the object's category and bounding box.
- All images are resized to 416 x 416, with black edges added to fit, to suit the training procedure.
- The dataset follows a natural long-tailed distribution, with some object categories being rare.
- As a result, the categories are imbalanced.
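The resizing to 416 x 416 with black edges can be illustrated with a minimal sketch. It assumes the aspect ratio is preserved and the image is centred on the black canvas; the dataset description does not state either detail. The function names `letterbox_geometry` and `map_box` are hypothetical and are not part of the dataset's tooling.

```python
def letterbox_geometry(width, height, target=416):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    Assumption: the image is scaled so its longer side equals `target`
    and is then centred on a square black canvas.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(target / width, target / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Split the remaining space evenly; any odd pixel goes to the far side.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def map_box(box, width, height, target=416):
    """Map (x_min, y_min, x_max, y_max) from original pixels to the canvas."""
    scale = min(target / width, target / height)
    _, _, pad_left, pad_top = letterbox_geometry(width, height, target)
    x_min, y_min, x_max, y_max = box
    return (
        x_min * scale + pad_left,
        y_min * scale + pad_top,
        x_max * scale + pad_left,
        y_max * scale + pad_top,
    )


if __name__ == "__main__":
    # A hypothetical 832 x 416 scan shrinks to 416 x 208 with black bars
    # above and below.
    print(letterbox_geometry(832, 416))
    print(map_box((0, 0, 832, 416), 832, 416))
```

Under these assumptions, bounding boxes drawn on an original scan must be scaled and shifted by the same padding to stay aligned with the resized image, which is what `map_box` does.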
history
- 2023-06-23 first online, published, posted
publisher
4TU.ResearchData
format
*.jpg, *.txt, *.yaml
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics & Computer Science (EEMCS)
DATA
files (2)
- README.txt, 636 bytes, MD5: 949309785626c6ab3410410e0a5b181e
- 1.0_Children_Books.zip, 50,851,803 bytes, MD5: 353044c8233dee9a31639af660c9e0ac
Total size: 50,852,439 bytes unzipped
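The MD5 checksums listed for the two files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. It assumes the files were saved in the current directory under their listed names; the helper names `md5_of_file` and `matches_checksum` are hypothetical.

```python
import hashlib
import os

# Checksums as listed for this version of the dataset.
EXPECTED_MD5 = {
    "README.txt": "949309785626c6ab3410410e0a5b181e",
    "1.0_Children_Books.zip": "353044c8233dee9a31639af660c9e0ac",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_MD5.items():
        if not os.path.exists(name):
            print(f"{name}: not found in the current directory")
        elif matches_checksum(name, expected):
            print(f"{name}: checksum OK")
        else:
            print(f"{name}: checksum MISMATCH")
```

MD5 is suitable here only as an integrity check against accidental corruption during download, not as protection against deliberate tampering.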