This corpus consists of the OCR results for material published from 1875 to 1920 in the collection of publications digitized by the National Library of Finland. Note that parts of the resource are copyright-protected.

The full corpus, as held by FIN-CLARIN, is organized in eleven branches named arc01, ..., arc11. In the full corpus, each document is stored as a zip archive containing scanned image files in different resolutions and the OCR results as XML documents. This distribution has the same structure but contains only the OCR results. Each of the distribution files arc01.zip, ..., arc11.zip contains the material extracted from one branch of the full corpus. The distribution file "klk_fi_1875_1920.zip" contains all 11 branches in one archive.

NB: The Finnish acronym for this corpus used to be "Digilib", but the acronym "klk" and the short name klk-fi-1920-dl are now recommended instead. The folder and file names have been changed accordingly.

FIN-CLARIN Kielipankki http://www.kielipankki.fi
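
The archive naming described above (arc01.zip through arc11.zip, each holding OCR results as XML) can be explored with a short script. The following is only a minimal sketch in Python: the function names are hypothetical, the directory holding the downloaded archives is assumed to be the current one, and the internal folder layout of each archive is not specified here, so the script simply lists every member whose name ends in ".xml". If documents turn out to be stored as nested zip archives inside a branch, those would need to be opened in a further step.

```python
import zipfile
from pathlib import Path


def branch_archive_names(count=11):
    """Return the distribution file names arc01.zip ... arc11.zip."""
    return [f"arc{i:02d}.zip" for i in range(1, count + 1)]


def list_xml_members(archive_path):
    """Return the names of all members in the archive that end in .xml.

    Assumption: OCR results are recognizable by their .xml extension.
    """
    with zipfile.ZipFile(archive_path) as zf:
        return [name for name in zf.namelist() if name.lower().endswith(".xml")]


if __name__ == "__main__":
    # Assumption: the archives were downloaded into the current directory.
    base_dir = Path(".")
    for archive_name in branch_archive_names():
        archive_path = base_dir / archive_name
        if not archive_path.is_file():
            # Skip branches that have not been downloaded.
            continue
        xml_members = list_xml_members(archive_path)
        print(f"{archive_name}: {len(xml_members)} XML files")
```

Listing member names this way reads only the archive index and does not extract anything, which may be preferable for a first look at large branches.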