Published on 30 July 2021

Version 4.1

HTRCatalogs: Dataset for historical catalogs HTR and Segmentation

Janes, Juliette; Joyeux-Prunel, Beatrice; Gabay, Simon

Description

This release contains 465 XML files and their corresponding images, drawn from a large corpus of 19th-, 20th- and 21st-century exhibition catalogs, manuscript fair catalogs and directories. The catalogs newly added in this release were created using the HTR and segmentation models available in the repository. The release also includes a CSV file describing the XML files and various tools for building a training dataset: several bash scripts, a Python program that divides the XML files into testing, training and evaluation datasets, and several fixed tests. An XSL transformation sheet is also provided to delete the Entry and EntryEnd zones from the XML files in order to obtain a SegmOnto-like dataset. The XML files have been corrected since the 4.0 release thanks to the addition of a GitHub Action (SegmOntoKraken).
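The division of the XML files into training, evaluation and testing sets mentioned above could look roughly like the following. This is a minimal sketch and not the program shipped with the release: the function name `split_files`, the 80/10/10 ratios, the fixed seed and the `data` directory are all assumptions made for illustration.

```python
import random
from pathlib import Path


def split_files(paths, train=0.8, val=0.1, seed=42):
    """Shuffle paths reproducibly and split them into train/eval/test lists.

    The ratios and the seed are illustrative assumptions, not values
    taken from the dataset's own tooling.
    """
    if not 0 < train < 1 or not 0 <= val < 1 or train + val >= 1:
        raise ValueError("train and val must be fractions that leave room for a test set")

    # Sort first so the result depends only on the seed, not on listing order.
    files = sorted(paths)
    random.Random(seed).shuffle(files)

    n_train = int(len(files) * train)
    n_val = int(len(files) * val)
    return {
        "train": files[:n_train],
        "eval": files[n_train:n_train + n_val],
        # Whatever remains after the two cuts becomes the test set.
        "test": files[n_train + n_val:],
    }


if __name__ == "__main__":
    # "data" is a hypothetical directory holding the release's XML files.
    xml_files = Path("data").glob("*.xml")
    for name, items in split_files(xml_files).items():
        print(name, len(items))
```

Fixing the seed keeps the three sets stable between runs, which matters when models trained at different times are compared on the same test files.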


Metrics

Dataset Index: 0.3

FAIR Score: 13%

Citations: 0

Mentions: 0


Publication Details

DOI

Publisher: Zenodo

Assigned Domain

Subfield: Computer Vision and Pattern Recognition

Field: Computer Science

Domain: Physical Sciences

Confidence Score: 95%

Source: OpenAlex

Keywords

ocr, htr, catalogs

Normalization Factors

FT: 15.38

CTw: 1.00

MTw: 1.00