Monk Cuper Set (MCS) for benchmarking historical document image binarization

Sheng He; Lambert Schomaker; Zhenwei Shi

doi:10.5281/zenodo.4767809

Published July 1, 2019 | Version 2019

Dataset Open

1. University of Groningen

****************************
Monk Cuper Set (MCS): used for document binarization and enhancement

The images come from the Cuper book in the Monk system (Monkweb.nl).

The ground truth was labelled by Zhenwei Shi.

If you use this data set, please cite the paper:

DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning.
Pattern Recognition. https://www.sciencedirect.com/science/article/abs/pii/S0031320319300330

File names:

Cuper-*.png : the original input image
GT-Cuper-*.png : the ground-truth labeled by Zhenwei Shi
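
The naming convention above makes it straightforward to match each input image to its ground truth. The following is a minimal sketch, assuming the archive has been extracted to a local directory and that the `*` part of the name is identical for an image and its ground truth. The function name `pair_mcs_files` is hypothetical, and the search is recursive because the folder layout inside the archive is not described here.

```python
from pathlib import Path


def pair_mcs_files(root):
    """Return sorted (input_path, ground_truth_path) pairs found under root.

    Assumes the ground truth for Cuper-X.png is stored next to it as
    GT-Cuper-X.png; inputs without a matching ground truth are skipped.
    """
    root = Path(root)
    pairs = []
    # The pattern matches the whole file name, so GT-Cuper-*.png files
    # are not picked up as inputs.
    for image_path in sorted(root.rglob("Cuper-*.png")):
        gt_path = image_path.with_name("GT-" + image_path.name)
        if gt_path.is_file():
            pairs.append((image_path, gt_path))
    return pairs
```

Calling `pair_mcs_files("MCSset")` on the extracted folder (the folder name is an assumption) should yield one pair per image that has a ground truth.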
________________________________________________________________________

The document images were collected by Jetze Touber of the University of Gent in his study:

Jetze Touber (2016)
De actualiteit van de klassieken bij Gisbert Cuper.
Weegblad: Nieuwsblad van de Vereniging De Waag. 8(1), p. 6-7, http://hdl.handle.net/1854/LU-8511610

The 17th-century Cuper-Braun collection in the Monk system consists of a series of European scholarly letters by different writers.
They write in a multitude of languages, switching from Latin to French and interspersing the text with phrases in Greek and Hebrew.

Jetze Touber photographed the scholarly letters of Johannes Braunius and Gisbert Cuper in the archives using his 2014/2015 Apple iPhone. The images exhibit chromatic aberration and focus variation, on top of the traditional problems with historical manuscripts.

Contains: 31 .png images and their corresponding ground truth ('GT') binarized versions.

Files (78.1 MB)

Name	Size
MCSset.zip (md5:60c0b29b596d95ead3ba502ba44c3a20)	78.1 MB
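
The md5 checksum listed for MCSset.zip can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5_of_file` and `verify_download` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# Checksum listed for MCSset.zip on the dataset page.
EXPECTED_MD5 = "60c0b29b596d95ead3ba502ba44c3a20"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("MCSset.zip")` returns `True` only when the local copy matches the listed checksum.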

