Helsinki Corpus of Swahili 2.0 (HCS 2.0)



Currently available versions of this resource

Shortname | Name and metadata | License | Location | Cite | Resource group and help | Apply | Publication year | Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname | Name and metadata | License | Formats | Support level | Contact Person | Resource group and help | Location | Other information

Resource information

Helsinki Corpus of Swahili 2.0 is available for research purposes in Kielipankki – the Language Bank of Finland. The corpus contains about 25 million words of written text and is available in two formats. The annotated version contains morphological and syntactic annotation as well as English glosses. The unannotated version contains plain text. The corpus text has been randomly shuffled within each document, and the sentence order is the same in both versions.

For more information on the corpus please see: https://www.kielipankki.fi/corpora/hcs2/

License and access

  • Some versions of this resource are available publicly (PUB), whereas others might require you to log in as an academic user (ACA).
  • Click on the license image to see the resource-specific license text.
  • Some versions of this resource are available in the computing environment (see column 'Location').

 


This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2014032624

Last modified on 2025-11-07