The Corpus of Sociolinguistic Variation in the Province of Satakunta (sapu)



Currently available versions of this resource

Shortname | Name and metadata | License | Location | Cite | Resource group and help | Apply | Publication year | Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname | Name and metadata | License | Formats | Support level | Contact Person | Resource group and help | Location | Other information

Resource information

The Sapu corpus is a sociolinguistic corpus representing the spoken language of Satakunta in the 21st century (and, more broadly, contemporary spoken Finnish). It has been lemmatized and annotated morphologically and syntactically. The corpus contains samples of the spoken language of six localities (Rauma, Honkilahti, Luvia, Pori, Ulvila-Nakkila, Kokemäki). The records consist of audio recordings and the corresponding transcripts.

Further information about the recordings

Further details of each version of the resource are maintained in its metadata record, which can be found via the persistent identifier (see the link at the resource title).

A general description of the corpus can be found, e.g., in the following publication:
Kurki, Tommi, Huhtala, Atte, Koivunen, Tomi & Mäkitalo, Nelli (2022). Satakuntalaisuus puheessa -korpus ja siitä tehtyjä synkretismihavaintoja: Syncretism in Colloquial Finnish – Observations of the Satakunta corpus. AFinLA-teema, 14, pp. 103-134. doi:10.30660/afinla.111247

License and access

  • All current versions of this resource require you to apply for individual access rights (RES).
  • Click on the license image to see the resource-specific license text.
  • All or most versions of this resource contain personal data (license condition +PRIV). The license includes data protection terms and conditions that you must follow. If you process personal data, maintain a public Privacy Notice for your project and provide the link to the Language Bank of Finland (see instructions).

 


This page has a persistent identifier: http://urn.fi/urn:nbn:fi:lb-2025091121

Last modified on 2025-11-10